<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-0503">
  <Title>Acquisition System for Arabic Noun Morphology</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> A morphology system is the backbone of a natural language processing system. No application in this field can survive without a good morphology system to support it. The Arabic language has features of its own that are not found in other languages, which is why many researchers have worked in this area. Al-Fedaghi and Al-Anzi (1989) present an algorithm that generates the root and the pattern of a given Arabic word. The main idea of the algorithm is to locate the positions of the root's letters in the pattern and examine the letters in the same positions in a given word to see whether the resulting trigraph forms a valid Arabic root.</Paragraph>
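The position-matching idea attributed above to Al-Fedaghi and Al-Anzi can be illustrated with a minimal Python sketch. It is not their implementation: the tiny pattern list, the root list, the function name extract_root, and the convention of marking root positions with the placeholder letters ف, ع, ل are all assumptions made for illustration.

```python
# Placeholder letters that mark the root positions inside a pattern (assumed convention).
ROOT_SLOTS = "فعل"

# Toy data, assumed for illustration only.
PATTERNS = ["فاعل", "مفعول", "مفعل"]
VALID_ROOTS = {"درس", "كتب"}


def extract_root(word, patterns=PATTERNS, roots=VALID_ROOTS):
    """Return (root, pattern) for the first pattern that fits the word, else None."""
    for pattern in patterns:
        if len(pattern) != len(word):
            continue
        candidate = []
        fits = True
        for p_ch, w_ch in zip(pattern, word):
            if p_ch in ROOT_SLOTS:
                # Root position: collect the word's letter at this position.
                candidate.append(w_ch)
            elif p_ch != w_ch:
                # Fixed pattern letter that the word does not share.
                fits = False
                break
        # Accept only if the collected trigraph is a known root.
        if fits and "".join(candidate) in roots:
            return "".join(candidate), pattern
    return None


if __name__ == "__main__":
    for w in ["دارس", "مدروس", "مكتب"]:
        print(w, extract_root(w))
```

The sketch treats every occurrence of a placeholder letter in a pattern as a root position, which is adequate for these toy patterns but would need refinement for patterns whose fixed letters coincide with the placeholders.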
    <Paragraph position="1"> Al-Shalabi (1998) developed a system that removes the longest possible prefix from the word, where the three letters of the root must lie somewhere in the first four or five characters of the remainder. The system then generates combinations of these letters and checks each one against all the roots in a file. Al-Shalabi reduced the amount of processing required, but he discussed the problem from the point of view of verbs, not nouns. De Roeck and Al-Fares (2000) developed a clustering algorithm for Arabic words sharing the same verbal root. They used root-based clusters as a substitute for dictionaries in indexing for information retrieval. Beesley and Karttunen (2000) described a new technique for constructing finite-state transducers that involves reapplying a regular-expression compiler to its own output. They implemented the technique in an algorithm called compile-replace. It has proved useful for handling non-concatenative phenomena, and they demonstrate it on Malay full-stem reduplication and Arabic stem interdigitation.</Paragraph>
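The prefix-stripping search described above for Al-Shalabi's system can be sketched roughly as follows. This is only an illustration under stated assumptions, not the published system: the prefix list, the root set, the window size of five, and the helper names strip_longest_prefix and find_root are hypothetical.

```python
from itertools import combinations

# Toy data, assumed for illustration only.
PREFIXES = ["وال", "ال", "و"]
ROOTS = {"درس", "كتب"}


def strip_longest_prefix(word):
    """Remove the longest listed prefix, keeping at least three letters for the root."""
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(prefix) and len(word) - len(prefix) >= 3:
            return word[len(prefix):]
    return word


def find_root(word, window=5):
    """Try order-preserving letter triples from the start of the remainder."""
    rest = strip_longest_prefix(word)[:window]
    for combo in combinations(rest, 3):
        candidate = "".join(combo)
        if candidate in ROOTS:
            return candidate
    return None


if __name__ == "__main__":
    for w in ["والدارس", "المكتوب"]:
        print(w, find_root(w))
```

Restricting the search to the first few characters of the remainder keeps the number of candidate triples small (at most ten for a window of five), which is one way to read the reduction in processing mentioned above.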
    <Paragraph position="2"> Most verbs in the Arabic language follow clear rules that define their morphology and generate their paradigms. Those nouns that are not derived from roots do not seem to follow a similar set of well-defined rules. Instead there are groups showing family resemblances.</Paragraph>
    <Paragraph position="3"> We believe that nouns in Arabic that are not derived from roots are governed not only by phonological rules but also by lexical patterns that must be identified and stored for each noun. Like irregular verbs in English, their forms are determined by history and etymology, not just phonology. Among many other examples, Pinker (1999) points to the survival of the past forms "became" for "become" and "overcame" for "overcome", modeled on "came" for "come", while "succumb", with the same sound pattern, has the regular past form "succumbed". The same kinds of phenomena are especially apparent for proper nouns in Arabic derived from Indian and Persian names. Pinker uses examples like these, as well as emerging research in neurophysiology, to argue for the coexistence of phonological rules and lexical storage of English verb patterns.</Paragraph>
    <Paragraph position="4"> We believe that further work in Arabic computational linguistics requires the development of a pattern bank for nouns. This paper describes the tool that we have built for this purpose. While the set of patterns for common nouns in Arabic may soon be established, newspapers and other dynamic sources of language will always contain new proper names, so we expect our tool to be a permanent part of our system, even though we may need it less often as time goes on.</Paragraph>
    <Paragraph position="5"> 2 Nouns in the Arabic Language
A noun in Arabic is a word that indicates a meaning by itself, without being connected with the notion of time. There are two main kinds of noun: variable and invariable. Variable nouns have different forms for the singular, the dual, the plural, the diminutive, and the relative.</Paragraph>
    <Paragraph position="6"> Variable nouns are in turn divided into two kinds: inert and derived. An inert noun is not derived from another word, i.e., it does not refer to a verbal root. Inert nouns are divided into two kinds: concrete nouns (e.g., lion) and abstract nouns (e.g., love). Derived nouns are taken from another word, usually a verb (e.g., office); they have a root to refer to. A derived noun is usually close to its root in meaning. Besides that meaning, it indicates the concrete thing that performs the action (the case of the agent noun) or undergoes it (the case of the patient noun), or some other notion of time, place, or instrument. The following are the noun types. A genus noun indicates what is common to every element of the genus without being specific to any one of them. It is the word naming a person, an animal, a thing, or an idea.</Paragraph>
    <Paragraph position="7"> Examples: man, book. An agent noun is a derived noun indicating the actor of the verb or its behavior. It has several patterns according to its root.</Paragraph>
    <Paragraph position="8"> Example: the person who studies. A patient noun is a derived noun indicating the person or thing that undergoes the action of the verb. Patient nouns have several patterns depending on the verbal root. Example: the thing that has been studied. An instrument noun is a noun indicating the tool of an action. Some instrument nouns are derived; some are inert.</Paragraph>
    <Paragraph position="9"> Example: key. An adjective is considered to be a type of noun in traditional Arabic grammar. It describes the state of the modified noun.</Paragraph>
    <Paragraph position="10"> Examples: beautiful, Mr., Professor, big.</Paragraph>
    <Paragraph position="11"> An adverb is a noun that is not derived and that indicates the place or the time of the action. Examples: month, city, north. A proper noun is the name of a specific person, place, organization, thing, idea, event, date, time, or other entity. Some proper nouns are solid (inert) nouns; others are derived [Abuleil and Evens 1998].</Paragraph>
  </Section>
</Paper>