<?xml version="1.0" standalone="yes"?>
<Paper uid="J92-3001">
  <Title>Making DATR Work for Speech: Lexicon Compilation in SUNDIAL</Title>
  <Section position="3" start_page="247" end_page="254" type="metho">
    <SectionTitle>
3. Encoding Linguistic Knowledge
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="247" end_page="249" type="sub_section">
      <SectionTitle>
3.1 DATR
</SectionTitle>
      <Paragraph position="0"> DATR is a declarative language for representing inheritance networks that support multiple default inheritance. Knowledge is expressed in DATR in terms of path equations. (Figure 1: DIALEX lexicon compilation architecture.)</Paragraph>
      <Paragraph position="1"> The syntax of paths is a superset of that found in the PATR-II language (Shieber 1986). For example, (1) identifies two different paths in the DAG rooted at Node1 in an inheritance network.</Paragraph>
      <Paragraph position="2"> (1) Node1: &lt;syn head case&gt;
    Node1: &lt;syn head number&gt;

The path equations we present in this paper take the following forms:

(2) a. Node1: &lt;&gt; == Node2
    b. Node1: Path1 == Value1
    c. Node1: Path1 == &amp;quot;Path2&amp;quot;
    d. Node1: Path1 == Node2:Path2
    e. Node1: Path1 == Node2:&lt;&gt;

The form shown in (2a) is the special case in which the path at Node1 is empty. This allows Node1 to inherit all equations available at Node2, except those incompatible with equations at Node1. Two equations are incompatible if they both make assignments to the same path. The form shown in (2b) is used to assign values to paths, e.g. &lt;syn head number&gt; == sg. Alternatively, a value may be copied from elsewhere in the DAG. (2c) is used to assign to Path1 whatever value is found for Path2 at the original query node. The double quotes are significant here because they indicate that Path2 must be evaluated globally. If the quotes were not present, Path2 would be evaluated locally, and Path1 would be assigned the value of Path2 at Node1 if such a value existed. The form shown in (2d) assigns to Node1:Path1 whatever value is found at Node2:Path2. A special case of this is (2e), which allows extensions of Path1 to be specified at Node2. For example, evaluating the DATR theory in (3) yields the theorems for node Ex2 shown in (4).

(3) Ex1: &lt;head major&gt; == n
    &lt;head case&gt; == nom.</Paragraph>
      <Paragraph position="3"> (4)</Paragraph>
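To make the default-inheritance semantics of (2a-e) concrete, the following is a minimal sketch in Python: a theory maps each node to path equations, and evaluation uses the longest defined prefix of the query path, so local equations override inherited ones. The nodes EX1 and EX2 and the case override are illustrative assumptions, not the paper's example, and DATR's quoted (global) paths of (2c) are omitted.

    # A minimal sketch of DATR-style default inheritance (local
    # inheritance only; the quoted/global mechanism of (2c) is omitted).
    # Node definitions map paths to atomic values or to (node, path)
    # inheritance targets; all names here are illustrative.
    THEORY = {
        "EX1": {
            ("head", "major"): "n",
            ("head", "case"): "nom",
        },
        "EX2": {
            (): ("EX1", ()),              # <> == EX1, as in (2a)
            ("head", "case"): "acc",      # local equation beats inheritance
        },
    }

    def evaluate(node, path):
        """Value of `path` at `node`: the longest defined prefix of the
        query path wins, which is what makes inheritance defeasible."""
        defs = THEORY[node]
        for cut in range(len(path), -1, -1):
            prefix, suffix = path[:cut], path[cut:]
            if prefix in defs:
                value = defs[prefix]
                if isinstance(value, tuple):          # (2d)/(2e): inherit
                    target_node, target_path = value
                    return evaluate(target_node, target_path + suffix)
                if suffix:                            # atom cannot extend
                    raise KeyError(f"{node}:{path} extends an atom")
                return value                          # (2b): direct value
        raise KeyError(f"no sentence at {node} covers {path}")

    print(evaluate("EX2", ("head", "major")))   # n   (inherited from EX1)
    print(evaluate("EX2", ("head", "case")))    # acc (local override)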
    </Section>
    <Section position="2" start_page="249" end_page="250" type="sub_section">
      <SectionTitle>
3.2 The Linguistic Framework
</SectionTitle>
      <Paragraph position="0"> Linguistic knowledge is structured in terms of a simple unification categorial grammar (Calder et al. 1988) in which featural constraints at the levels of morphology, syntax, and semantics may all occur in a lexical sign. The basic sign structure of lexical entries is shown in (5).</Paragraph>
      <Paragraph position="2"> (5) sign: [ morphology: [...]
          syntax: [...]
          semantics: [...] ]

The basic sign structure of the syntax feature value is shown in (6).

(6) syntax: [ head: [...]
        args: [...] ]

The head feature includes attribute-value structures for such things as tense, person, number, and definiteness. The args feature is stack-valued, with stack position determining the order in which arguments may be combined by functional application. The basic sign structure of the semantics feature value is shown in (7).</Paragraph>
      <Paragraph position="4"> (7) semantics: [ id: [...]
            type: [...]
            modus: [...]
            role_1: [...]
            ...
            role_n: [...] ]

Each semantic object has a unique index (id). The type feature locates the object in a sortal hierarchy. The modus feature specifies a number of constraints imposed on the interpretation of semantic objects, such as polarity, aspect, and tense. Semantic roles (such as theagent, thetime, theinstrument) are specified within the inheritance-based definitions for semantic types.</Paragraph>
      <Paragraph position="5"> The signs are defined in terms of a dual-component DATR lexicon. The base definitions represent an application-independent account of morphosyntax, transitivity, and lexico-semantic constraints. They define what can be thought of as most of the higher nodes of an inheritance hierarchy. The base definitions as a whole are, of course, language-specific, although significant exchange of definitions has been possible during the parallel development of our English and French DATR theories. The application-specific lexicon can be thought of as a collection of lower nodes that hook onto the bottom of the hierarchy defined by the structured base definitions. Whereas the structured base definitions provide a general morphological, syntactic, and lexico-semantic account of a language fragment, the application-specific lexicon provides a vocabulary and a task-related lexical semantics. Ideally, a change of application should only necessitate a change of the application-specific lexicon. Naturally, application-specific lexicons take much less time to construct than the base lexicon. Much of our discussion in the rest of this section will focus on the structured base definitions.</Paragraph>
    </Section>
    <Section position="3" start_page="250" end_page="251" type="sub_section">
      <SectionTitle>
3.3 Morphosyntax
</SectionTitle>
      <Paragraph position="0"> Since the requirements of speech processing in real time rule out online morphological parsing, a full-form lexicon must be produced. However, the task of entering all possible forms of a word into the lexicon by hand would be both time-consuming and repetitive. We therefore provide a subtheory of morphology in the DATR base definitions so that the grammar writer need only specify exceptional morphology for each lexeme, leaving the lexicon generator to expand out all of the regular forms.</Paragraph>
      <Paragraph position="1"> The surface form of an English verb encodes information relating to finiteness, tense, number, and person. What is required in the DATR theory is a number of condition-action statements that say things like:

IF a verb is finite
THEN IF it is present tense AND singular AND third person
     THEN its form is &lt;root&gt;+s
     ELSE its form is &lt;root&gt;.</Paragraph>
      <Paragraph position="2"> The desired effect is achieved by means of DATR's evaluable paths. The following path equation is included at the VERB node.</Paragraph>
      <Paragraph position="3"> (8) VERB: &lt;mor form&gt; == VERB_MOR:&lt;&gt;

The VERB_MOR node looks like this:

(9) VERB_MOR: &lt;bse&gt; == &amp;quot;&lt;mor root&gt;&amp;quot;
    &lt;prp&gt; == (&amp;quot;&lt;mor root&gt;&amp;quot; ing)
    &lt;psp&gt; == (&amp;quot;&lt;mor root&gt;&amp;quot; ed)
    &lt;fin&gt; == &lt;&amp;quot;&lt;syn head tense&gt;&amp;quot; &amp;quot;&lt;syn head number&gt;&amp;quot; &amp;quot;&lt;syn head person&gt;&amp;quot;&gt;
    &lt;pres&gt; == &amp;quot;&lt;mor root&gt;&amp;quot;
    &lt;pres sg third&gt; == (&amp;quot;&lt;mor root&gt;&amp;quot; s)
    &lt;past&gt; == &amp;quot;&lt;mor form psp&gt;&amp;quot;.</Paragraph>
      <Paragraph position="4"> The base, present participle, and past participle forms are immediately available. If the verb is finite it is necessary to construct an evaluable path consisting of tense, number, and person values. If the tense is past (last line), the form is copied from the form of the past participle. If the form is present singular third person (second last line), the form is &lt;root&gt;+s. Otherwise, the present tense form is copied from the root form. Exceptional forms are stated explicitly, thus overriding default forms. For example, the following entry for hear specifies that it has exceptional past forms.

(10) HEAR: &lt;&gt; == VERB
     &lt;mor root&gt; == hear
     &lt;mor form psp&gt; == heard.</Paragraph>
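The effect of (8)-(10) can be pictured procedurally: regular slots are filled from the root, explicit exceptions override them, and the past form is copied from the past participle. A minimal sketch, assuming an illustrative slot table (not DIALEX code):

    # A sketch of full-form expansion for English verbs: regular forms
    # derived from the root, overridable per lexeme. Slot names and the
    # naive concatenation mirror (9); spelling rules are not modelled.
    REGULAR = {
        "bse": lambda r: r,                  # base form
        "prp": lambda r: r + "ing",          # present participle
        "psp": lambda r: r + "ed",           # past participle
        "pres": lambda r: r,                 # present tense default
        "pres_sg_third": lambda r: r + "s",  # third person singular present
    }

    def expand(root, exceptions=None):
        """All surface forms for a verb: regular defaults from the root,
        overridden by any exceptional forms; the past form is copied from
        the past participle, as in <past> == "<mor form psp>"."""
        forms = {slot: rule(root) for slot, rule in REGULAR.items()}
        forms.update(exceptions or {})
        forms.setdefault("past", forms["psp"])
        return forms

    print(expand("walk")["pres_sg_third"])           # walks
    print(expand("hear", {"psp": "heard"})["past"])  # heard

Note that, as in (10), overriding only the past participle of hear suffices: the past form follows from it by default.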
      <Paragraph position="5"> The evaluable path mechanism is also used to set the value of an agreement feature agr to tps (third person singular) or not_tps. The path equation shown in (11), augmented by the information at the V_AGREE node (12), then requires subject and verb to share the same agr feature value. The subject's agr feature is set by the definitions in (13).

(11) VERB: &lt;syn args gr_subject syn head agr&gt; ==
     V_AGREE:&lt;&amp;quot;&lt;syn head tense&gt;&amp;quot; &amp;quot;&lt;syn head number&gt;&amp;quot; &amp;quot;&lt;syn head person&gt;&amp;quot;&gt;.</Paragraph>
      <Paragraph position="6"> (12) V_AGREE: &lt;pres&gt; == not_tps
     &lt;pres sg third&gt; == tps.</Paragraph>
      <Paragraph position="7"> (13) NOUN: &lt;syn head agr&gt; ==
     N_AGREE:&lt;agr &amp;quot;&lt;syn head number&gt;&amp;quot; &amp;quot;&lt;syn head person&gt;&amp;quot;&gt;.

     N_AGREE: &lt;agr&gt; == not_tps
     &lt;agr sg third&gt; == tps.</Paragraph>
      <Paragraph position="8">  English verb morphology presents no real problems; noun morphology is even simpler. French morphology is rather more complicated. However, it can be handled by means of the same general technique of allowing evaluable paths to act as case statements that select the appropriate morphological form. Instead of a unified account of French verb morphology there are a number of distinct inflectional paradigms from which different verbs inherit. A more sophisticated account of subject-verb agreement is also required.</Paragraph>
    </Section>
    <Section position="4" start_page="251" end_page="254" type="sub_section">
      <SectionTitle>
3.4 Transitivity
</SectionTitle>
      <Paragraph position="0"> Consider the relationship between verbs of different transitivity. An intransitive verb takes a subject only. A transitive verb takes a subject and an object. A ditransitive verb takes a subject and two objects, one direct and the other indirect. This information is easily expressible in terms of an inheritance hierarchy. Facts about subjects are associated with a top node, for example a node called VERB. Facts about direct objects are associated with another node, for example, a node called TRANS_V. By saying that TRANS_V is a VERB, the general information about subjects is inherited at the TRANS_V node. This relationship can be expressed simply in DATR. A similar treatment can be adopted for ditransitive verbs (DTRANS_V):

(14) VERB: &lt;syn head major&gt; == v
     &lt;syn args gr_subject&gt; == GR_SUBJECT:&lt;&gt;.</Paragraph>
      <Paragraph position="1"> TRANS_V: &lt;&gt; == VERB
     &lt;syn args gr_direct&gt; == GR_DIRECT:&lt;&gt;.</Paragraph>
      <Paragraph position="2"> DTRANS_V: &lt;&gt; == TRANS_V
     &lt;syn args gr_indirect&gt; == GR_INDIRECT:&lt;&gt;.</Paragraph>
      <Paragraph position="3"> Entries of the form &lt;syn args gr_subject&gt; == GR_SUBJECT:&lt;&gt; represent a convenient way of packaging up all information relating to an argument type at a single node (part of the information stored at this node can be found in (18) below; notice that different arguments are identified by unique labels such as gr_subject and gr_direct). We have already noted that in our sign representation, arguments are distinguished by their position in a stack. This ought to render unique argument labels superfluous. In fact, there are a number of reasons why it is desirable to use unique labels in the DATR theory. Firstly, they allow individual arguments of a word to be picked out (see Section 3.4.1 below). Secondly, they allow classes of argument to be identified and generalizations to be made where appropriate. For example, we show in Section 3.4.2 how order and optionality generalizations can be made over argument types, and how a system organized around named arguments can be mapped within DATR into an order-marked system. Finally, grammatical relation labels are much easier for grammar writers to remember and manipulate than positionally encoded argument structures. Consider the following partial DATR entry for English infinitival complement verbs.</Paragraph>
      <Paragraph position="4"> (15) INF_COMP_V: &lt;&gt; == VERB
     &lt;syn args gr_comp&gt; == GR_COMP:&lt;&gt;
     &lt;syn args gr_comp syn args gr_subject&gt; == &amp;quot;&lt;syn args gr_subject&gt;&amp;quot;

The first line states that an infinitival complement verb inherits from the VERB node, i.e., it is a verb that must have a subject. The second line introduces a number of constraints on the complement. These constraints, collected at the GR_COMP node, include the fact that the complement must be the infinitive form of a verb. The next line enables the complement to share the subject of the matrix verb, i.e., in a sentence like Amy wants to fly, Amy is the subject of both want and fly.</Paragraph>
      <Paragraph position="5"> A reentrancy holds between the semantics of a verb and the semantics of its subject: the semantics of the subject must be coindexed with a semantic role of the verb such as theagent, as shown in (16).</Paragraph>
      <Paragraph position="6"> (16) [ syn: [ args: [ gr_subject: [ sem: A ] ] ]
       sem: [ theagent: A ] ]

This reentrancy can be expressed in DATR as follows:

(17) &lt;sem theagent&gt; == &amp;quot;&lt;syn args gr_subject sem&gt;&amp;quot;.</Paragraph>
      <Paragraph position="7"> The argument labeled gr_subject is typically underspecified in the lexicon and awaits full specification at parse time. Because of this, the constraint is carried over to the DAG-encoding phase of lexicon compilation, where it becomes a reentrancy, as described in Section 4.</Paragraph>
      <Paragraph position="8"> Whereas arguments in the base definitions are identified by grammatical relation labels, such as gr_subject, the lexicon generation process requires arguments encoding order and optionality constraints that are identified by relative position in an args list. Two techniques are used to produce DATR theories with arguments structured in this way.</Paragraph>
      <Paragraph position="9"> The first technique is to define featural constraints of order and optionality for each grammatical relation label. Three types of constraint are defined: dir: indicating whether the argument precedes or follows the functor (pre or post); adj: indicating whether the argument is adjacent to the functor or not (next or any); and opt: indicating whether the argument is optional or obligatory (opt or oblig). Arguments identified as gr_subject and gr_oblique, for example, inherit the following ordering constraints:</Paragraph>
      <Paragraph position="10"> (18) GR_SUBJECT: &lt;dir&gt; == pre
     &lt;adj&gt; == any
     &lt;opt&gt; == oblig.

     GR_OBLIQUE: &lt;dir&gt; == post
     &lt;adj&gt; == any
     &lt;opt&gt; == opt.</Paragraph>
      <Paragraph position="11"> Whereas the subject is obligatory, precedes the functor, and allows for intervening constituents, the oblique argument is optional and may appear in any position following the functor.</Paragraph>
      <Paragraph position="12"> The second technique maps arguments identified by relation labels onto arguments identified by position in a linked list. Relative position is encoded in terms of the features first and rest: first identifies the first argument in a list, and rest identifies the linked list of remaining arguments.</Paragraph>
      <Paragraph position="13"> Consider part of the base definition for transitive verbs, as shown in (19).

(19) TRANS_V: &lt;&gt; == VERB
     &lt;syn args gr_direct&gt; == GR_DIRECT:&lt;&gt;
     &lt;syn args&gt; == TVARGS:&lt;&gt;.</Paragraph>
      <Paragraph position="14"> Part of the collection of nodes devoted to mapping named arguments onto order-marked arguments is shown in (20).</Paragraph>
      <Paragraph position="15"> (20) TVARGS: &lt;&gt; == DTVARGS
     &lt;rest&gt; == DARGS:&lt;&gt;.</Paragraph>
      <Paragraph position="16"> DTVARGS: &lt;first&gt; == &amp;quot;&lt;syn args gr_subject&gt;&amp;quot;
     &lt;rest&gt; == ARGS1:&lt;&gt;.</Paragraph>
      <Paragraph position="17"> DARGS: &lt;first&gt; == &amp;quot;&lt;syn args gr_direct&gt;&amp;quot;
     &lt;rest&gt; == ARGS3:&lt;&gt;.</Paragraph>
      <Paragraph position="18"> ARGS3: &lt;first&gt; == &amp;quot;&lt;syn args gr_oblique1&gt;&amp;quot;</Paragraph>
      <Paragraph position="20"> TVARGS inherits the value of &lt;first&gt; from DTVARGS, which finds it by evaluating the path &amp;quot;&lt;syn args gr_subject&gt;&amp;quot;. The &lt;rest&gt; is then inherited from DARGS, where the &lt;first&gt; argument of &lt;rest&gt; inherits from &amp;quot;&lt;syn args gr_direct&gt;&amp;quot;. The &lt;rest&gt; of &lt;rest&gt; then inherits from ARGS3, which specifies the position of oblique arguments within the args list of transitive verbs.</Paragraph>
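The mapping that (19)-(20) define, from arguments named by grammatical relation to a first/rest linked list, can be sketched directly; the ordering table and dictionary layout below are illustrative assumptions, not DIALEX code:

    # A sketch of the named-to-positional argument mapping of Section
    # 3.4.2: arguments indexed by grammatical relation are re-expressed
    # as a first/rest linked list in the order fixed by the verb class.
    ARG_ORDER = {
        "TRANS_V": ["gr_subject", "gr_direct", "gr_oblique1"],
    }

    def to_args_list(verb_class, named_args):
        """Build the nested first/rest structure from a dict of named
        arguments, dropping optional arguments that are absent."""
        args = None
        for label in reversed(ARG_ORDER[verb_class]):
            if label in named_args:
                args = {"first": named_args[label], "rest": args}
        return args

    print(to_args_list("TRANS_V", {
        "gr_subject": {"sem": {"type": "sentient"}},
        "gr_direct": {"sem": {"type": "entity"}},
    }))
    # {'first': {subject...}, 'rest': {'first': {direct...}, 'rest': None}}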
    </Section>
  </Section>
  <Section position="4" start_page="254" end_page="260" type="metho">
    <SectionTitle>
3.5 Lexico-Semantic Constraints
</SectionTitle>
    <Paragraph position="0"> A word lattice is likely to include numerous semantically anomalous but syntactically well-formed constructions. In a system that aims toward real-time speech understanding it is vital that semantic selectional restrictions be introduced as early as possible in order to eliminate false phrasal hypotheses at an early stage.</Paragraph>
    <Paragraph position="1"> Selectional restrictions are typically associated with lexemes. Each content word has a semantic type, and many words specify the semantic types of their arguments.</Paragraph>
    <Paragraph position="2"> For example, the semantic type of tell is inform and the type of its role theexperiencer (associated with the indirect object) is almost always human in our trial domain. This can be expressed as follows.</Paragraph>
    <Paragraph position="3"> (21) TELL: &lt;&gt; == DTRANS_V
     &lt;mor root&gt; == tell
     &lt;sem type&gt; == inform
     &lt;sem theexperiencer type&gt; == human.</Paragraph>
    <Paragraph position="4"> Certain argument types can be assigned default semantic types. For example, by default the semantic type of subjects must be sentient (a superclass of human). This works for a large majority of verbs. Of course, defaults of this kind can be overridden for individual lexemes (such as the verb rain) or word classes (such as copular verbs).</Paragraph>
    <Section position="1" start_page="255" end_page="258" type="sub_section">
      <SectionTitle>
3.6 Example: The French Noun Phrase
</SectionTitle>
      <Paragraph position="0"> By way of example, we show how two entries from the French SUNDIAL lexicon, the determiner le ('the.MASC') and the common noun passager ('passenger'), are encoded in DATR; to put the following section in context, we also show the DIALEX output.</Paragraph>
      <Paragraph position="1"> In a simple French noun phrase, we treat the common noun as the head, with the determiner as an optional argument. Therefore, most of the information is associated with the common noun.</Paragraph>
      <Paragraph position="2"> A common noun inherits from the node NOUN, which in turn inherits from the general WORD node:

(23) WORD: &lt;mor form&gt; == &amp;quot;&lt;mor root&gt;&amp;quot;.</Paragraph>
      <Paragraph position="3"> Syntactic and semantic default values such as category (n), gender (masc), case (nom), and semantic type (entity) are given at the NOUN node. Some of these values may be overridden, for example in the definition of passager:

(24) Passager: &lt;&gt; == NOUN

(Figure 2: Inheritance graph for French common nouns.)</Paragraph>
      <Paragraph position="4"> Definiteness is copied from the determiner if one is present, whereas the gender of the determiner is copied from the common noun. Where a feature value is already specified for both noun and determiner at parse time, the values must be the same if combination is to take place. The definitions for GR_DETERMINER and NOUNARGS are shown in (25):

(25) GR_DETERMINER: &lt;syn head major&gt; == det</Paragraph>
      <Paragraph position="6"> The definition of GR_DETERMINER specifies order and optionality information as well as the syntactic category (det). NOUNARGS defines the mapping of case-marked to order-marked arguments, for simple determiner-noun NPs.</Paragraph>
      <Paragraph position="7"> The inheritance graph for this set of DATR sentences is illustrated in Figure 2.</Paragraph>
      <Paragraph position="8"> In fact, common nouns may be more complex than our example suggests; they may have several obliques, for example. Fortunately, DATR allows the creation of intermediate nodes between the NOUN node and the common nouns, and these nodes specify the distinctive properties of each distinct class of nouns. For example, a RELDAY node has been created for French in order to describe common grammatical properties of relative day references such as lendemain ('the day after') and veille ('the day before'). In the same spirit, NPs with genitive postmodifiers such as le numéro du vol ('the number of the flight/the flight number'), where two nouns are combined, use the node GNOUN, which specifies general features of the arguments of the head noun.</Paragraph>
      <Paragraph position="9"> The definition of the determiner node, DET, is simple in comparison with the NOUN node, inheriting only from the WORD node. Example (26) shows the definition of DET,</Paragraph>
      <Paragraph position="11"> together with entries for le and la ('the.FEM').</Paragraph>
      <Paragraph position="12"> (26) DET: &lt;&gt; == WORD
     &lt;syn head major&gt; == det
     &lt;syn head def&gt; == the
     &lt;syn head number&gt; == sg
     &lt;syn head gender&gt; == masc
     &lt;sem modus def&gt; == the.</Paragraph>
      <Paragraph position="13"> le: &lt;&gt; == DET
     &lt;mor root&gt; == le.</Paragraph>
      <Paragraph position="14"> la: &lt;&gt; == DET
     &lt;mor root&gt; == la
     &lt;syn head gender&gt; == fem.

The lexical entries for le and passager produced by the DAG-encoding phase of compilation (see Section 4) are shown in Figure 3.

4. Lexicon Generation</Paragraph>
    </Section>
    <Section position="2" start_page="258" end_page="259" type="sub_section">
      <SectionTitle>
4.1 Obtaining the DNF Lexicon
</SectionTitle>
      <Paragraph position="0"> In order to generalize across morphological instantiations, a DATR theory makes use of nodes at the level of the lexeme. In general, the constraints in a lexeme cannot simply be represented as a union of paths, because the sentences making up the definition of a lexeme with morphosyntactic variations implicitly contain disjunctions. Because we require the lexicon to be disjoint, our strategy is to cash out all embedded disjunctions that reference each surface form. The lexicon thus obtained can be described as being in disjunctive normal form (DNF). This DNF lexicon will contain all lexical signs, where a lexical sign incorporates both the surface form and the corresponding lexeme.</Paragraph>
      <Paragraph position="1"> In order to govern the expansion of information in the DATR lexicon, it is necessary to make a closed world of the feature space defined there. The values that features may take may be implicit in the DATR lexicon; however, such implicit knowledge is not necessarily complete. Nothing prevents arbitrary extension of the lexicon by additional features and values, and this may lead to unwanted interactions with existing rules.</Paragraph>
      <Paragraph position="2"> We therefore enumerate the possible values of features in a knowledge base known as the closure definitions. This enumeration is designed to be recursive, to take into account category-valued features such as the args list. Figure 4 gives an example of closure definitions, for a sign with only syn and mor attributes. These state the features that make up a sign; the definition is recursively introduced at the level of &lt;syn args&gt;.</Paragraph>
      <Paragraph position="3"> A closure definition takes the form: cdef(Feature, Fields, FieldVals, FCRs).</Paragraph>
      <Paragraph position="4"> A complex feature is composed of fields: either these are atomic-valued, and enumerated or declared as open-class in FieldVals; or they are complex, and their definitions are to be found elsewhere.</Paragraph>
      <Paragraph position="5"> cdef(sign, [syn,mor], _, [mor:form =&gt; syn:vform]).</Paragraph>
      <Paragraph position="6"> Besides providing closure for DNF lexicon expansion, these definitions have a number of uses:

1. they are used to determine which possible paths the compiler should try to evaluate in order to build a DAG representation of a lexical sign. The search proceeds depth-first through the closure definitions, ignoring those fringes of the search space for which no evaluation is possible. Values of paths, constraints representing unevaluable reentrancies, and consistent combinations of these are returned;

2. they provide a filter on the output of the DATR lexicon. Only those features present in the closure definitions are output. Constraints incorporating functional labels such as gr_subject are no longer needed;

3. they include a complete set of knowledge definitions for our semantic representation language (SIL), which is inheritance based. The inheritance hierarchy for semantic types, for example, is used in bit coding (Section 5), so that semantic selectional restrictions can be tested during parsing;

4. they furnish a set of declarative templates against which bit coding and DAG-term conversion may be carried out.</Paragraph>
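A minimal sketch of use 1 above, treating closure definitions as data and enumerating candidate paths depth-first; the feature inventory, the COMPLEX table, and the depth bound (which terminates the recursion through args) are illustrative assumptions:

    # Closure definitions as data: each feature lists its fields, with
    # None marking a complex field whose definition is found elsewhere.
    CDEFS = {
        "sign": {"syn": None, "mor": None},
        "syn":  {"head": None, "args": None},
        "mor":  {"root": ["OPEN"], "form": ["OPEN"]},
        "head": {"major": ["n", "v", "det"], "case": ["nom", "acc"]},
        "args": {"first": None, "rest": None},   # recursive: first is a sign
    }
    COMPLEX = {"syn": "syn", "mor": "mor", "head": "head",
               "args": "args", "first": "sign", "rest": "args"}

    def paths(feature="sign", prefix=(), depth=3):
        """Depth-first enumeration of the candidate paths the compiler
        should try to evaluate, bounded so the recursion terminates."""
        if depth == 0:
            return
        for field, values in CDEFS[feature].items():
            path = prefix + (field,)
            if values is None:                   # complex field: recurse
                yield from paths(COMPLEX[field], path, depth - 1)
            else:                                # atomic: a queryable path
                yield path

    for p in paths():
        print(p)   # ('syn', 'head', 'major'), ..., ('mor', 'form')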
      <Paragraph position="8"> In addition to an enumeration of feature values, the closure definitions contain Feature Cooccurrence Restrictions (FCRs) (Gazdar et al. 1985). In principle these could be encoded in the DATR lexicon, for example, using the feature-value unspec to represent negative occurrence. They are present here not only to restrict the possible feature combinations that can appear at the head of a sign, but also to detect dependencies that govern DNF expansion.</Paragraph>
      <Paragraph position="9"> The DNF lexicon is obtained as follows. Those features on which the surface form of a full lexical sign depends, which we shall refer to as its surface form dependency features, may be derived from the FCRs contained in the closure definitions. Then for each pair consisting of a DATR node A and a possible assignment α to its unassigned surface form dependency features, generate a new DATR node Aα, which inherits from A and contains the feature assignments in α. The DATR theory for Aα is then used to produce the set of evaluated and unevaluated constraint sentences that describe it. For example, the base lexical entry for arrive is defined at the DATR node Arrive1, which is underspecified for the paths &lt;syn head tense&gt;, &lt;syn head person&gt;, and &lt;syn head number&gt;. For the assignment of values pres, third, sg (respectively) to these paths, the node Arrive1_presthirdsg is created.</Paragraph>
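A minimal sketch of this expansion step, assuming an illustrative inventory of surface form dependency features; derived node names follow the Arrive1_presthirdsg pattern:

    # One derived node per total assignment to the features the lexeme
    # leaves unassigned. The feature inventory is an assumption.
    from itertools import product

    DEPENDENCY_FEATURES = {
        "tense": ["pres", "past"],
        "person": ["first", "second", "third"],
        "number": ["sg", "pl"],
    }

    def expand_node(node_name, assigned):
        """Generate (name, features) pairs for every assignment to the
        node's unassigned surface form dependency features."""
        open_feats = [f for f in DEPENDENCY_FEATURES if f not in assigned]
        for values in product(*(DEPENDENCY_FEATURES[f] for f in open_feats)):
            extension = dict(zip(open_feats, values))
            name = node_name + "_" + "".join(values)
            yield name, {**assigned, **extension}

    for name, feats in expand_node("Arrive1", {}):
        print(name, feats)
    # Arrive1_presfirstsg ... Arrive1_presthirdsg ... Arrive1_pastthirdpl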
    </Section>
    <Section position="3" start_page="259" end_page="260" type="sub_section">
      <SectionTitle>
4.2 Producing Unevaluated Paths
</SectionTitle>
      <Paragraph position="0"> As we have shown, reentrancies can be specified in DATR using global inheritance; see, for example, (15) in Section 3.4.1. However, such sentences may not appear directly in the DAG representation, either because they include paths not derivable within the closure definitions, or because interaction with higher-ranking exceptions may lead to weaker equivalences being derived. Any DATR sentence that does not explicitly introduce a value is treated as a candidate reentrancy constraint; at the stage where constraint sentences are being derived from a DATR theory, all unevaluated constraint sentences are retained. In the case of Arrive1_presthirdsg, the following constraint sentences are derived by inheritance from VERB:

(27) &lt;syn args first sem&gt; = &lt;syn args gr_subject sem&gt;.</Paragraph>
      <Paragraph position="1"> &lt;sem theagent&gt; = &lt;syn args gr_subject sem&gt;.</Paragraph>
      <Paragraph position="2"> DATR inference takes the form of successive reduction of right-hand sides; in (27), neither sentence is further reducible, and both would be ignored by a standard DATR theorem-prover. By passing both constraints to the DAG-building procedure, however, where equality is reflexive as well as transitive (Section 4.3), the two constraints may be combined to derive the reentrancy between &lt;sem theagent&gt; and &lt;syn args first sem&gt;.</Paragraph>
    </Section>
    <Section position="4" start_page="260" end_page="260" type="sub_section">
      <SectionTitle>
4.3 DAG Building and Disjunction Optimization
</SectionTitle>
      <Paragraph position="0"> The constraint sentences derived for a DATR node A, or for an extension Aα of it, are of the form Path = Value or Path1 = Path2. If consistent, they can be used to build a DAG corresponding to Aα. Our DAG-building procedure is based on one described in Gazdar and Mellish (1989). It builds DAGs by unification of constraints, so that directionality is irrelevant. For this to succeed, the input constraints must not contain inconsistencies. This property of correctness is only partially guaranteed by the constraint-derivation stage, which will tolerate an unevaluated constraint whose left-hand side is a proper prefix of an evaluated one (but not vice versa), as in (28).</Paragraph>
      <Paragraph position="1"> (28) &lt;sem theagent type&gt; = object.</Paragraph>
      <Paragraph position="2"> &lt;sem theagent&gt; = &lt;syn args gr_subject sem&gt;.</Paragraph>
      <Paragraph position="3"> This will work so long as a contradictory type is not derivable elsewhere. The form of encoded DAGs is known as normal form (Bouma 1990); that is, if two DAGs share a common sub-DAG, this is explicitly represented in both, with the exception of unevaluated shared sub-DAGs, which are represented as Prolog variables. Once the DAG is built, any remaining unwanted paths are filtered out. In the case of Arrive1_presthirdsg, this amounts to removing those sub-DAGs introduced at paths containing gr_subject and gr_oblique1.</Paragraph>
      <Paragraph position="4"> Although the closure definitions ensure that the number of surface form dependency feature assignments for each lexeme is finite, in practice for languages like English, where a number of morphosyntactic feature combinations map onto a smaller set of surface forms, the DNF lexicon will have more entries than there are distinct surface forms. In cases where a number of entries differ only in a single feature, a phase of disjunction optimization serves to reduce these, according to the simple equivalence: (φ1 ∧ φ2 ∧ … ∧ φn) ∨ (φ1′ ∧ φ2 ∧ … ∧ φn) = (φ1 ∨ φ1′) ∧ φ2 ∧ … ∧ φn.</Paragraph>
      <Paragraph position="5"> Apart from this optimization, the lexicon produced is in DNF.</Paragraph>
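A minimal sketch of the merge this equivalence licenses, with DNF entries modelled as flat dictionaries of set-valued features (an assumption made for illustration):

    # Merge two entries that differ in exactly one feature by disjoining
    # that feature's values: (f1 & f2 & ...) or (f1' & f2 & ...)
    #                          = (f1 or f1') & f2 & ...
    def try_merge(entry1, entry2):
        """Return the merged entry, or None if the entries differ in
        zero or more than one feature."""
        if entry1.keys() != entry2.keys():
            return None
        diffs = [k for k in entry1 if entry1[k] != entry2[k]]
        if len(diffs) != 1:
            return None
        k = diffs[0]
        return {**entry1, k: entry1[k] | entry2[k]}

    a = {"major": frozenset({"v"}), "person": frozenset({"first"}),
         "number": frozenset({"sg"})}
    b = {"major": frozenset({"v"}), "person": frozenset({"second"}),
         "number": frozenset({"sg"})}
    print(try_merge(a, b))   # person becomes {'first', 'second'}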
    </Section>
  </Section>
  <Section position="5" start_page="260" end_page="262" type="metho">
    <SectionTitle>
5. Bit Coding
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="260" end_page="261" type="sub_section">
      <SectionTitle>
5.1 Motivation and Requirements
</SectionTitle>
      <Paragraph position="0"> The last step toward the production of data structures for efficient parsing and generation is the construction of two separate lexicons: a Prolog term encoding of the DAGs and a compact bit-encoded lexicon. The motivation for two separate lexicons is the decision to split the task of parsing into its two aspects: determining grammaticality and assigning an interpretation. Since in speech recognition there is also the added complication of identifying the best-scoring sequence of words from a lattice of hypotheses, and since an interpretation is only needed for the best sequence, not for every acceptable one, it is more efficient to separate these tasks. This involves separating lexical entries into those features that are constraining (i.e. which affect a sign's capacity to combine with others) and those that simply contribute to its eventual interpretation. The former set is used to produce the bit-coded 'acceptance' lexicon, the latter to form a term-encoded 'full' lexicon.</Paragraph>
      <Paragraph position="1"> As well as being used in sentence interpretation, the full lexicon is also used in sentence generation. However, we shall concentrate here on the bit-encoded acceptance lexicon.</Paragraph>
      <Paragraph position="2"> Since the search space when parsing a typical word hypothesis lattice is potentially great, the acceptance lexicon must be both compact and suitable for processing by efficient low-level operations. Bit encoding allows unification of feature structures to be performed by Boolean operations on bit strings, which enables a parser to be implemented in an efficient programming language such as C; it also provides a convenient representation of disjunctions and negations of feature values.</Paragraph>
      <Paragraph position="3"> Two distinct kinds of bit coding are used to represent semantic types and syntactic head features: both produce vectors of bits that can be stored as integers or lists of integers.</Paragraph>
    </Section>
    <Section position="2" start_page="261" end_page="261" type="sub_section">
      <SectionTitle>
5.2 Semantic Type Coding
</SectionTitle>
      <Paragraph position="0"> The principal semantic type of a lexical entry is a node in a tree-structured (single-inheritance) sortal hierarchy. Coding for types in the hierarchy is straightforward:

* a terminal node has one unique bit set;

* a nonterminal node is represented by the bitwise Boolean OR of the codings for the nodes it dominates.</Paragraph>
      <Paragraph position="1"> This scheme requires as many bits as there are terminal nodes in the tree and, assuming that every nonterminal node dominates at least two subnodes, assigns a unique bit vector to every node. (A simple example is given in Figure 5.) The most specific types are represented by a bit vector containing a single '1', and the most general by a vector with all its bits set. Unification of two types is performed by bitwise AND; since the hierarchy is tree structured, the result of this will be the coding of the more specific type, or 0, indicating failure, if the types are incompatible. The same coding scheme would also serve if the hierarchy were extended to a multiple-inheritance graph, the only difference being that bitwise AND could then result in a type distinct from either of its arguments.</Paragraph>
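A minimal sketch of this coding scheme over an assumed toy hierarchy; terminals get one bit each, nonterminals the OR of their daughters, and unification is bitwise AND:

    # Bit coding of a tree-structured sortal hierarchy. The hierarchy
    # below is illustrative, echoing types mentioned in Section 3.5.
    HIERARCHY = {                      # parent -> daughters
        "entity": ["sentient", "object"],
        "sentient": ["human", "animal"],
    }

    def assign_codes(root):
        codes, next_bit = {}, [0]
        def walk(node):
            daughters = HIERARCHY.get(node, [])
            if not daughters:                   # terminal: one unique bit
                codes[node] = 1 << next_bit[0]
                next_bit[0] += 1
            else:                               # nonterminal: OR of daughters
                code = 0
                for d in daughters:
                    walk(d)
                    code |= codes[d]
                codes[node] = code
            return codes[node]
        walk(root)
        return codes

    codes = assign_codes("entity")
    print(codes["human"] & codes["sentient"] == codes["human"])  # True
    print(codes["human"] & codes["object"])                      # 0: failure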
    </Section>
    <Section position="3" start_page="261" end_page="262" type="sub_section">
      <SectionTitle>
5.3 Syntactic Feature-Value Coding
</SectionTitle>
      <Paragraph position="0"> Our approach to the encoding of the feature structures used to represent syntactic categories is very similar to that proposed in Nakazawa et al. (1988) for implementing GPSG-style grammars.</Paragraph>
      <Paragraph position="1"> A set of features is represented by a bit vector in which for every n-valued feature, n + 1 bits are assigned, one associated with each value and one bit indicating that the feature is not present. A value of '0' for a bit means that the feature does not have the corresponding value; a '1' indicates that the value is a possible one. If the value of a feature can be specified precisely, the corresponding bit is set, and all the others for that feature are cleared. Hence the negation of a feature-value pair can be represented by clearing a bit, and a disjunction of values by setting more than one bit in the representation of a feature. This fact can be utilized to pack lexical entries together: if two entries differ only in one atomic-valued feature, they can be combined into a single entry by this method. Unification is again performed by bitwise AND; failure is indicated if all the bits for some feature are turned off, meaning that the structures being unified have no common value for this feature. Since this operation only turns bits off, unification of bit vectors is order-independent (commutative and associative). The bit vector representation is straightforward for encoding flat feature-value structures, but presents difficulties when features have categories as values, given the requirement that the possible values for all features must be enumerable in order to produce bit vectors of finite size. Although a general solution can be proposed that uses some pattern of bits to indicate a recursive feature and associates with this feature another bit vector of the same length (the solution adopted by Nakazawa et al. 1988), we have chosen a more ad hoc encoding, specifying in advance which features can be recursive and representing them by pointers to similarly coded structures. The features that are recursive are the list of arguments of a functor sign and the slash feature used to handle long-distance dependencies.4 (This approach enables the parser to process signs more efficiently, but unfortunately makes it partly dependent on their structure.)</Paragraph>
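A minimal sketch of the n + 1 bits-per-feature scheme and AND-based unification, over an assumed two-feature inventory (flat features only; the recursive args and slash features are not modelled):

    # One bit per value plus a '*' (not present) bit per feature; an
    # unspecified feature has all of its bits set. The inventory is an
    # illustrative assumption.
    FEATURES = {
        "major": ["n", "v", "det", "*"],
        "vform": ["fin", "bse", "prp", "psp", "*"],
    }

    def encode(spec):
        """Bit vector for a partial specification: a dict mapping a
        feature to the set of values it may take."""
        vec, bit = 0, 0
        for feat, values in FEATURES.items():
            allowed = spec.get(feat, set(values))   # unspecified: all on
            for v in values:
                if v in allowed:
                    vec |= 1 << bit
                bit += 1
        return vec

    def unify(v1, v2):
        """Bitwise AND; fails if some feature loses all of its bits."""
        result, bit = v1 & v2, 0
        for feat, values in FEATURES.items():
            width = len(values)
            if not (result >> bit) & ((1 << width) - 1):
                return None                  # no common value for feat
            bit += width
        return result

    verb = encode({"major": {"v"}})
    print(unify(verb, encode({"vform": {"fin"}})) is not None)  # True
    print(unify(verb, encode({"major": {"n"}})))                # None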
    </Section>
  </Section>
  <Section position="6" start_page="262" end_page="264" type="metho">
    <SectionTitle>
4 We follow GPSG in the use of the category-valued feature slash as a propagating device to handle
</SectionTitle>
    <Paragraph position="0"> extraction phenomena. For example, in the question 'what did you say?', the phrase 'did you say?' can be partially characterized, in our notation, as syn: [args: [], slash: [first: [syn: [head: [major: n]]]]], indicating that it is a sentence from which a noun phrase has been extracted.

(Figure 6: Sample bit vector for head features major, vform, tense, and case.)

The bit encoding procedure takes as input a DAG representation of individual lexical entries and is guided in its translation by the closure definitions. A set of declarations is used to indicate which features are to be included in the acceptance lexicon, and how they are to be encoded: using either of the bit representations discussed above, or simply as a list of encodings of their constituents. If no coding type is specified for a feature, then it is simply ignored.</Paragraph>
    <Paragraph position="1"> As a simple example, consider the following partially specified feature structure:

(29) [head: [major: v
       vform: fin]]

Assume that the closure definitions specify values for the head features major, vform, tense, and case, and the FCR: case =&gt; major:n. Then if the node head is declared for bit coding, the vector shown in Figure 6 will be produced. (The symbol '*' stands for 'not present'.) Note that bits have been set for all values of the unspecified feature tense, indicating that nothing is known about its value, but that only the '*' bit is set for the feature case, since the FCR blocks its presence for entries whose major feature is not n.</Paragraph>
    <Section position="1" start_page="263" end_page="264" type="sub_section">
      <SectionTitle>
5.4 Variable Sharing
</SectionTitle>
      <Paragraph position="0"> Although the representation of variables and their instantiation to (more or fully) specified values is straightforward, the implementation of variable sharing or reentrancy presents a serious problem for bit coding schemes, since there is no means of representing identifiable variables. We have adopted a twofold solution, depending on the type of the variable. For sign-valued variables, and other large-scale structures, sharing is achieved by means of pointers to common data objects.</Paragraph>
      <Paragraph position="1"> This approach cannot be extended down to the level of bit-coded features, since these involve data below the level of the machine word. Instead, a solution based on the use of bit masks has been adopted.</Paragraph>
      <Paragraph position="2"> The key to this is the recognition that variable sharing between structures is a limited form of unification, carried out between a restricted set of their features. If two feature structures represented by bit vectors β1 and β2 share a variable for the feature φ, a mask μ is constructed in which all the bits representing φ are cleared, and all the rest are set. The values for φ in the two bit vectors are unified in the result of the expression: β1 ∧ (β2 ∨ μ). Note that a single mask may represent more than one variable shared between two structures.</Paragraph>
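A minimal sketch of the mask construction and of the unifying expression above; the vector width and bit layout are illustrative assumptions:

    # Variable sharing via a mask: all bits set except those of the
    # shared feature, so b1 & (b2 | mask) unifies just that feature
    # across the two vectors while leaving the rest of b1 untouched.
    def share_mask(total_bits, feature_bits):
        """All bits set except those of the shared feature."""
        mask = (1 << total_bits) - 1
        for b in feature_bits:
            mask &= ~(1 << b)
        return mask

    # two 8-bit vectors sharing a variable for the feature at bits 0-3
    b1 = 0b1010_0110
    b2 = 0b0110_1100
    mu = share_mask(8, range(4))       # 0b1111_0000
    shared = b1 & (b2 | mu)
    print(bin(shared))                 # 0b1010_0100: 0110 & 1100 = 0100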
      <Paragraph position="3"> A disadvantage of this technique is that it requires the construction of masks for all possible feature structures within a sign between which variables may be shared. In practice we can assume that this means only the recursively nested signs of the args list and slash, and so need relatively few masks.</Paragraph>
      <Paragraph position="4"> A description of the two-stage parsing procedure can be found in Andry and Thornton (1991).</Paragraph>
    </Section>
  </Section>
</Paper>