<?xml version="1.0" standalone="yes"?>
<Paper uid="C92-3161">
  <Title>Shalt2 a Symmetric Machine Translation System with</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
PPADJUNCT
</SectionTitle>
    <Paragraph position="0"> drive (CAT n, PREP into, DET def, NUM sg) The reader may notice that the above sentence should really have ambiguities in prepositional phrase attachment, which result in two conflicting dependencies &quot;insert -PPADJUNCT- drive&quot; and &quot;diskette -PPADJUNCT- drive&quot; in a single DS. We will discuss the handling of such ambiguities in Section 3.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2 Concept Definitions
</SectionTitle>
      <Paragraph position="0"> A set of conceptual primitives is called a Natural Language (NL) class system\[hi, and is maintained as an object-oriented knowledge base under FrameKit [7]. The NL class system consists of a set of constant classes, three meta-classes, and virtual classes with an exclusive inheritance mechanism discussed below.</Paragraph>
      <Paragraph position="1"> Each class represents a concept in the real world. A class has zero or more slots, which describe the fillers it may use to represent a compound concept. NL objects are particular instances of NL classes. There are is-a and part-of relationships defined over the NL classes.</Paragraph>
      <Paragraph position="2"> For example,</Paragraph>
      <Paragraph position="4"> defines a class *insert with an is-a link to a class *action, and three slots - :agent, :theme, and :goal. The (value ...) facet shows an actual filler of the slot, while the (sem ...) facet shows the selectional restrictions on the slot.</Paragraph>
      <Paragraph position="5"> A class inherits each slot definition from its superclass(es), that is, the fillers of the is-a slot, unless the slot is redefined. A class can have more than one immediate superclass, but we restrict its inheritance to be exclusive rather than general multiple inheritance.</Paragraph>
      <Paragraph position="6"> That is, an instance of a class can inherit slot definitions from only one of its immediate superclasses. The idea behind exclusive inheritance is to realize a certain identity of verbal and nominal word senses without mixing the slot definitions of both. For example, most verbs have nominal counterparts in a natural language, such as &quot;insert&quot; and &quot;insertion.&quot; Such a pair usually shares slot definitions (:agent, :theme, and :goal) and selectional restrictions, except that &quot;insert&quot; inherits tense, aspect, and modality from its &quot;verbal&quot; superclass but not cardinality, ordinal, and definiteness (that is, the quality of being indefinite or definite) from its &quot;nominal&quot; superclass, although these features are inherited by &quot;insertion.&quot; The following class definitions  allow every instance of subclasses of *action to inherit slot definitions from either *physical-action or *mentalobject. Exclusive inheritance also contributes to performance improvement of the parser since it allows us to reduce the number of possible superclasses from an exponential number to a linear number.</Paragraph>
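      <Paragraph> The exclusive inheritance described above can be illustrated with a minimal Python sketch. The names NLClass and inherited_slots, the class names *verbal and *nominal, and the restriction values are assumptions made for illustration only; they are not the notation of the NL class system. The sketch shows one instance collecting slots along a single chain of superclasses, with a choice made wherever a class has more than one immediate superclass.

```python
from dataclasses import dataclass, field


@dataclass
class NLClass:
    """A concept with its own slot definitions and immediate superclasses."""
    name: str
    supers: list = field(default_factory=list)  # immediate superclasses (is-a)
    slots: dict = field(default_factory=dict)   # slot name to restriction


def inherited_slots(cls, choices):
    """Collect slots along ONE chain of superclasses (exclusive inheritance).

    choices maps a class name to the single immediate superclass chosen for
    this instance; a class with exactly one superclass needs no entry.
    """
    chain = []
    current = cls
    while current is not None:
        chain.append(current)
        if not current.supers:
            current = None
        elif len(current.supers) == 1:
            current = current.supers[0]
        else:
            chosen = choices[current.name]
            current = next(s for s in current.supers if s.name == chosen)
    slots = {}
    # Walk from the most general class down, so redefinitions override.
    for c in reversed(chain):
        slots.update(c.slots)
    return slots


# Hypothetical hierarchy: *insert has two immediate superclasses.
verbal = NLClass("*verbal", slots={":tense": "*tense", ":aspect": "*aspect"})
nominal = NLClass("*nominal", slots={":cardinality": "*number",
                                     ":definiteness": "*def"})
insert = NLClass("*insert", supers=[verbal, nominal],
                 slots={":agent": "*human", ":theme": "*physical-object",
                        ":goal": "*physical-object"})

as_verb = inherited_slots(insert, {"*insert": "*verbal"})   # "insert"
as_noun = inherited_slots(insert, {"*insert": "*nominal"})  # "insertion"
```

      Because each instance follows one chain only, the verbal reading never picks up :cardinality and the nominal reading never picks up :tense, while both keep :agent, :theme, and :goal.</Paragraph>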
      <Paragraph position="7"> There are three meta-classes in NL classes - *var, *set, and *fun - to represent concepts that are not included in the normal class hierarchy. The first, *var, is a variable that ranges over a set of NL classes, which are constants. Pronouns and question words in natural languages usually carry this kind of incomplete concept.</Paragraph>
      <Paragraph position="8"> The second, *set, is a set constructor that can represent a coordinated structure in natural languages. The third, *fun, is a function from NL objects to NL objects. It captures the meaning of a so-called semi-function word.</Paragraph>
      <Paragraph position="9"> For example, in some usages, the verb &quot;take&quot; does not really indicate any specific action until it gets an argument such as &quot;a walk,&quot; &quot;a rest,&quot; &quot;a look.&quot; It is therefore well characterized as a function.</Paragraph>
      <Paragraph position="10"> Since we allow only exclusive inheritance, the NL class system certainly lacks the ability to organize classes from various viewpoints, unlike ordinary multiple inheritance. Virtual classes are therefore introduced to compensate for this inability. For example,</Paragraph>
      <Paragraph position="12"> shows two types of virtual classes, *option and *malething. The *option consists of the classes *mathcoprocessor, *hard-disk, and *software. The *malething is a class that includes instances of any class with the :sex slot filled with *male. Note that the maintainability of a class hierarchy drastically declines if we allow classes such as *option to be &quot;actual&quot; rather than virtual, as we will have many is-a links from anything that could be an option. The second type of virtual class helps incorporate so-called semantic features into the NL class system. Existing machine-readable dictionaries (for example, LDOCEt el) often have entries with semantic features such as HUMAN, LIQUID, and VEHICLE that may not fit into a given ontological class hierarchy. A virtual class definition (defvclass *human (def (equal :human *true))) with semantic restrictions (:agent (sem *human)) makes it possible to integrate such entries into the NL class system. [ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992, 1035; PROC. OF COLING-92, NANTES, AUG. 23-28, 1992]</Paragraph>
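      <Paragraph> The two kinds of virtual class can be sketched as membership tests that sit outside the is-a hierarchy. This is only an illustration in Python: the helper names make_enumerated and make_predicate, and the representation of an instance as a dictionary, are assumptions and not part of the NL class system.

```python
def make_enumerated(members):
    """Virtual class given by listing member classes (like *option)."""
    return lambda inst: inst["class"] in members


def make_predicate(slot, value):
    """Virtual class given by a condition on a slot filler (like *malething)."""
    return lambda inst: inst.get(slot) == value


option = make_enumerated({"*mathcoprocessor", "*hard-disk", "*software"})
male_thing = make_predicate(":sex", "*male")
human = make_predicate(":human", "*true")

# Hypothetical instances: a class name plus whatever slots are filled.
disk = {"class": "*hard-disk"}
tom = {"class": "*person", ":sex": "*male", ":human": "*true"}

is_option = option(disk)      # membership by enumeration
is_male = male_thing(tom)     # membership by slot condition
```

      Neither test adds an is-a link, so the actual class hierarchy stays unchanged no matter how many such viewpoints are defined.</Paragraph>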
      <Paragraph position="13"> The NL class system currently includes a few thousand concepts extracted from the personal-computer domain. The word senses in the SHALT dictionary (about 100,000 entries) and the LDOCE (about 55,000 entries) have been gradually incorporated into the NL class system.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.3 Mapping Rules
</SectionTitle>
      <Paragraph position="0"> Mapping rules define lexical and structural correspondences between syntactic and conceptual representations. A lexical mapping rule has the form  where a transitive verb &quot;insert&quot; maps to or from an instance of *insert with its exclusive superclass *physical-action. The three slots for structural mapping between concepts (:agent, :theme, and :goal) and grammatical roles (SUBJECT, DOBJECT, and PPADJUNCT) are also defined in this rule. The :agent filler, for example, should be an instance that is mapped from a syntactic SUBJECT of the verb &quot;insert.&quot; The :goal filler must be a prepositional phrase consisting of a noun with the preposition &quot;into.&quot; The fragments of syntactic feature structures following a lexical word or a grammatical function in a mapping rule specify the minimum structures that subsume feature structures of candidate syntactic constituents. These structural mappings are specific to this instance.</Paragraph>
      <Paragraph position="1"> The structural mapping rule  specifies that the conceptual slots :mood and :time map to or from the grammatical roles MOOD and TENSE, respectively. Unlike the structural mapping in a lexical mapping rule, these slot mappings can be inherited by any instance of a subclass of *physical-action. The *insert instance defined above, for example, can inherit these :mood and :time mappings. Given a dependency structure (DS), mapping rules work as distributed constraints on the nodes and arcs in the DS in such a way that a conceptual representation R is an image of the DS iff R is the minimal representation satisfying all the lexical and structural mappings associated with each node and arc. On the other hand, given a conceptual representation R, mapping rules work inversely as constraints on R to define a minimal DS that can be mapped to R.</Paragraph>
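      <Paragraph> The symmetric use of mapping rules described above can be sketched as one table read in two directions. The Python below is a simplified illustration under stated assumptions: the table layout and the function names slot_role_pairs, role_to_slot, and slot_to_role are hypothetical, and the feature-structure conditions (such as the preposition &quot;into&quot; on the :goal filler) are left out.

```python
# Lexical rule for the verb "insert": concept, exclusive superclass,
# and the slot/role pairs that are specific to this word sense.
LEXICAL_RULES = {
    "insert": {
        "concept": "*insert",
        "super": "*physical-action",
        "slots": {":agent": "SUBJECT", ":theme": "DOBJECT",
                  ":goal": "PPADJUNCT"},
    },
}

# Structural rules attach to a class and are inherited by its subclasses.
STRUCTURAL_RULES = {
    "*physical-action": {":mood": "MOOD", ":time": "TENSE"},
}


def slot_role_pairs(word):
    """All slot/role pairs for a word: inherited pairs first, then the
    word-specific pairs, which take precedence."""
    rule = LEXICAL_RULES[word]
    pairs = dict(STRUCTURAL_RULES.get(rule["super"], {}))
    pairs.update(rule["slots"])
    return pairs


def role_to_slot(word, role):
    """Analysis direction: grammatical role to conceptual slot."""
    inverse = {r: s for s, r in slot_role_pairs(word).items()}
    return inverse[role]


def slot_to_role(word, slot):
    """Generation direction: conceptual slot to grammatical role."""
    return slot_role_pairs(word)[slot]


subject_slot = role_to_slot("insert", "SUBJECT")
tense_slot = role_to_slot("insert", "TENSE")
goal_role = slot_to_role("insert", ":goal")
```

      The same pairs serve both analysis and generation, which is the sense in which the rules act as constraints rather than as one-way procedures.</Paragraph>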
      <Paragraph position="2"> Thus, assuming that lexical mapping rules are similarly provided for nouns (diskette and drive) and feature values (imp, pres, and so on), we will have the conceptual representation (Footnote: A conceptual representation of a sentence consists of instances of classes. We use a hyphen and a number following a class name (*insert-1, *imp-1, ...) when it is necessary to show instances explicitly. Otherwise, we identify class names and instance names.)</Paragraph>
      <Paragraph position="4"> for the sample sentence and its DS shown earlier in this section.</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.4 Conceptual Paraphrasing Rules
</SectionTitle>
      <Paragraph position="0"> We assume that a source language sentence and its translation into a target language frequently have slightly different conceptual representations. An adjective in English might be a conjugable verb in translation. These differences result in added/missing information in the corresponding representations. The conceptual paraphrasing rules describe such equivalence and semi-equivalence among conceptual representations. These rules are sensitive to the target language, but not to the source language, since the definition of equivalence among conceptual representations depends on the cultural and pragmatic background of the language in which a translation has to be expressed. An example of a paraphrasing rule is (equiv (*equal (:agent (*X (:num (*V))))  where *Y/*person specifies *Y to be an instance of any subclass of *person, *equal is roughly the verb &quot;be,&quot; humanization is a relation that holds for pairs such as (*singer, *sing) and (*swimmer, *swim), and sibling holds for two instances of the same class. Intuitively, this rule specifies an equivalence relationship between sentences such as &quot;Tom is a good singer&quot; and &quot;Tom sings well,&quot; as the following bindings hold: (*equal (:mood (*dec)) (:time (*pres))</Paragraph>
      <Paragraph position="2"> All the instances that have no binding in a rule must remain unchanged as the same slot fillers (e.g., *dec and *pres), while some fillers that have bindings in a rule may be missing from a counterpart instance (e.g., *indef and *W above). Note that *good has lexical mappings to the adjective &quot;good&quot; and the adverb &quot;well.&quot;</Paragraph>
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.5 Case Base
</SectionTitle>
      <Paragraph position="0"> A case base is a set of DSs with no syntactic/semantic ambiguities. A conceptual representation for a DS can be computed by a top-down algorithm that recursively tries to combine an instance mapped from the root node of a DS with an instance of each of its subtrees. The arc from the node to a subtree determines the conceptual slot name.</Paragraph>
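      <Paragraph> The recursive computation just described can be sketched in a few lines of Python. This is a toy version under stated assumptions: the tables CONCEPT_OF and SLOT_OF_ARC, the function name conceptualize, the dictionary encoding of a DS, and the DOBJECT arc to &quot;diskette&quot; are all hypothetical, and feature values such as mood and tense are ignored.

```python
# Hypothetical lexical mappings from words to concepts.
CONCEPT_OF = {"insert": "*insert", "diskette": "*diskette", "drive": "*drive"}

# Hypothetical arc-to-slot mappings, keyed by the head word of the arc.
SLOT_OF_ARC = {
    "insert": {"DOBJECT": ":theme", "PPADJUNCT": ":goal"},
}


def conceptualize(node):
    """Map the root word to an instance, then fill one slot per subtree.

    The arc label from the root to each subtree selects the slot name.
    """
    instance = {"class": CONCEPT_OF[node["word"]]}
    for arc, subtree in node.get("children", []):
        slot = SLOT_OF_ARC[node["word"]][arc]
        instance[slot] = conceptualize(subtree)
    return instance


# An unambiguous DS in which the prepositional phrase attaches to "insert".
ds = {
    "word": "insert",
    "children": [
        ("DOBJECT", {"word": "diskette"}),
        ("PPADJUNCT", {"word": "drive"}),
    ],
}
representation = conceptualize(ds)
```

      A DS in the case base has no attachment ambiguity, so each arc contributes exactly one slot and the recursion never has to choose between alternatives.</Paragraph>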
      <Paragraph position="1"> We have already built a case base that includes about 30,000 sentences from the IBM Dictionary of  Computing [x]. Selected sentences in the LDOCE have also been added to the case base. The sentences in the LDOCE define word senses or show sample usages of each entry. Though composed of a limited vocabulary, they are often syntactically/semantically ambiguous and it is time-consuming for users to build the case base completely manually. Therefore, the Shalt2 parser is used to develop the case base. Starting with a small, manually crafted &quot;core&quot; case base, each new sentence is analyzed and disambiguated by the parser to generate a DS, which is corrected or modified by the user and then added to the case base. As the size of the case base grows, the proportion of human corrections/modifications decreases, since the output of the parser becomes more and more accurate. This process is called knowledge bootstrapping and is discussed by Nagao [6] in more detail. Mapping constraints, however, are associated with only a part of the case base, because the NL class system and the mapping rules are not yet complete.</Paragraph>
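      <Paragraph> The bootstrapping loop can be outlined as follows. This Python sketch only shows the control flow; the function bootstrap and the toy stand-ins toy_parse and toy_review are assumptions for illustration and say nothing about how the Shalt2 parser actually uses the case base.

```python
def bootstrap(core_cases, sentences, parse, review):
    """Grow a case base; return it with the number of human corrections."""
    case_base = dict(core_cases)  # sentence to dependency structure
    corrections = 0
    for sentence in sentences:
        proposed = parse(sentence, case_base)   # parser output
        approved = review(sentence, proposed)   # user corrects if needed
        if approved != proposed:
            corrections += 1
        case_base[sentence] = approved
    return case_base, corrections


# Toy stand-ins: the "parser" only reuses structures already in the case
# base, and the "user" always supplies the intended structure.
GOLD = {
    "insert a diskette": "insert(diskette)",
    "remove a diskette": "remove(diskette)",
}


def toy_parse(sentence, case_base):
    return case_base.get(sentence)


def toy_review(sentence, proposed):
    return GOLD[sentence]


case_base, corrections = bootstrap(
    {"insert a diskette": "insert(diskette)"},
    ["remove a diskette", "insert a diskette", "remove a diskette"],
    toy_parse,
    toy_review,
)
```

      In this toy run only the first unseen sentence needs a correction; later sentences are handled from the grown case base, which mirrors the decreasing share of human corrections described above.</Paragraph>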
    </Section>
  </Section>
</Paper>