<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-3208">
  <Title>Morphology Induction from Limited Noisy Data Using Approximate String Matching</Title>
  <Section position="7" start_page="67" end_page="67" type="concl">
    <SectionTitle>
7 Conclusion and Future Work
</SectionTitle>
    <Paragraph position="0"> We presented a framework for morphology induction from noisy data that is especially useful for languages with limited electronic data. We use the information in dictionaries, specifically headwords and their corresponding example-of-usage sentences, to acquire affix lists for the language. We presented results on two data sets and demonstrated that our framework successfully finds prefixes, suffixes, circumfixes, and infixes. We also used the suffix list acquired from one data set in a simple word segmentation process and outperformed a state-of-the-art morphology learner using the same amount of training data.</Paragraph>
    <Paragraph position="1"> At this point we use only pairs of headwords and their corresponding examples of usage. Dictionaries provide much more information. We plan to make use of other information, such as POS, to categorize the acquired affixes. We will also investigate how using all the words in the examples of usage and splitting the compound affixes in agglutinative languages can help us increase confidence in correct affixes and decrease the number of invalid affixes.</Paragraph>
    <Paragraph position="2"> Finally, we will work on identifying morphophonemic rules (especially stem-internal vowel shifts and point-of-affixation stem changes).</Paragraph>
  </Section>
</Paper>