<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-1605">
  <Title>Systematic Verb Stem Generation for Arabic</Title>
  <Section position="7" start_page="0" end_page="0" type="concl">
    <SectionTitle>
7 Conclusions
</SectionTitle>
    <Paragraph position="0"> In this paper, we have discussed our attempt at imitating the process used by Arabic speakers in generating stems from roots. We formulated a definition of the process, facilitating an encoding of Arabic stems. The encoding represents stems in terms of their components while still allowing a simple mapping to their final surface forms. A stem's components are a root, morphosemantic and morphosyntactic templates, and any morphophonemic alterations that the stem may have undergone. In doing so, the problem has been reduced to the much smaller task of obtaining stems for the words subject to analysis, and then matching these against the surface forms of the pre-analysed stems. The encoding retains most of the information essential to stem generation and analysis, allowing us to trace the various transformations that root radicals undergo when inflected. Root extractors and morphological analysers can match an input word with a defined verb stem, then use the information in the definition to determine with certainty the stem's root and the meaning of its morphological pattern. The authors intend to use a similar strategy to define stems for Arabic nouns.</Paragraph>
    <Paragraph position="1"> Mapping from words to defined stems is now much easier. The stem generation algorithm here attempts to produce a comprehensive list of all inflected stems. Any verb may be found in this list if some simple conjoin removal rules are first applied. Conjoins are defined here as single letter conjunctions, future or question particles, emphasis affixes, or object pronominal suffixes that agglutinate to a verb stem. Because conjoins may attach to a verb stem in sequence and without causing any morphological alteration, extracting stems from Arabic words becomes similar to extracting stems from English words. In fact, many of the Arabic word analysis approaches reviewed in the introduction to this paper would yield more accurate results if applied to stem extraction instead of root extraction. It would become possible to use for this purpose conventional linguistic, pattern matching, or algebraic algorithms.</Paragraph>
    <Paragraph position="2"> The dictionary database described here can be used to form the core of a morphological analyser that derives the root of an input word, identifies its stem, and classifies its morphosemantic and morphosyntactic templates. An analyser based on these principles may be used in many useful applications, some of which are detailed in Yaghi (2004). Example applications include root, lemma based, and exact word analysis, searching, incremental searching, and concordancing.</Paragraph>
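    <Paragraph position="3"> The conjoin removal and stem matching summarised above can be illustrated with a minimal sketch. It is not the authors' implementation: the names StemEntry, STEMS, candidate_stems, and analyse are hypothetical, the words are romanised for readability, the template labels are illustrative, and the conjoin lists are a small assumed subset. The sketch strips prefix conjoins in sequence and at most one object suffix, then looks each candidate up in a table of pre-analysed stem surface forms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StemEntry:
    """One pre-analysed stem: surface form plus the components named in the text."""
    surface: str
    root: str
    morphosemantic: str
    morphosyntactic: str
    alterations: tuple = ()  # morphophonemic alterations, none in this toy data


# Toy romanised dictionary (assumption): two inflected stems of the root k-t-b.
STEMS = {
    entry.surface: entry
    for entry in (
        StemEntry("kataba", "ktb", "Form I", "perfect active, 3rd masc. sing."),
        StemEntry("yaktubu", "ktb", "Form I", "imperfect active, 3rd masc. sing."),
    )
}

# Assumed, incomplete conjoin lists: conjunctions, future particle,
# emphasis prefix, question particle, and a few object pronominal suffixes.
PREFIX_CONJOINS = ("wa", "fa", "sa", "la", "a")
SUFFIX_CONJOINS = ("hu", "haa", "hum")


def candidate_stems(word):
    """Yield (prefixes, stem, suffix) for every way of stripping conjoins."""
    pending = [((), word)]
    while pending:
        prefixes, rest = pending.pop()
        # Candidate with no object suffix removed.
        yield prefixes, rest, ""
        # Candidates with one object suffix removed.
        for suffix in SUFFIX_CONJOINS:
            if rest.endswith(suffix) and len(rest) != len(suffix):
                yield prefixes, rest[: -len(suffix)], suffix
        # Prefix conjoins attach in sequence, so keep stripping them.
        for prefix in PREFIX_CONJOINS:
            if rest.startswith(prefix) and len(rest) != len(prefix):
                pending.append((prefixes + (prefix,), rest[len(prefix):]))


def analyse(word):
    """Return every (prefixes, StemEntry, suffix) whose stem is in the dictionary."""
    results = []
    for prefixes, stem, suffix in candidate_stems(word):
        entry = STEMS.get(stem)
        if entry is not None:
            results.append((prefixes, entry, suffix))
    return results
```

    On this toy data, analyse("wasayaktubuhu") returns a single analysis: the prefixes ("wa", "sa"), the stem yaktubu with root ktb, and the suffix "hu". Because every candidate is checked against the dictionary, over-stripping is harmless here; a real analyser would need the full stem list and complete conjoin rules.</Paragraph>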
  </Section>
</Paper>