<?xml version="1.0" standalone="yes"?>
<Paper uid="W98-0504">
  <Title>How to define a context-free backbone for DGs: Implementing a DG in the LFG formalism</Title>
  <Section position="3" start_page="29" end_page="30" type="metho">
    <SectionTitle>
2 Word Order in DG
</SectionTitle>
    <Paragraph position="0"> A very brief characterization of DG is that it recognizes only lexical, not phrasal nodes, which are linked by directed, typed, binary relations to form a dependency tree (Tesni~re, 1959; Hudson, 1993). If these relations are motivated semantically, such dependency trees can be non-projective. Consider the extracted NP in &amp;quot;Beans, I know John likes&amp;quot;. A projective tree would require &amp;quot;Beans&amp;quot; to be connected to either &amp;quot;1&amp;quot; or &amp;quot;know&amp;quot; - none of which is conceptually directly related to &amp;quot;Beans&amp;quot;. It is &amp;quot;likes&amp;quot; that determines syntactic features of &amp;quot;Beans&amp;quot; and which provides a semantic role for it. The only connection between &amp;quot;know&amp;quot; and &amp;quot;Beans&amp;quot; is that the finite verb allows the extraction of&amp;quot;Beans&amp;quot;, thus defining order restrictions for the NP. The following overview of DG flavors shows that various mechanisms (global rules, general graphs, procedural means) are generally employed to lift the limitation of projectivity and discusses some shortcomings of these proposals.</Paragraph>
    <Paragraph position="1"> Functional Generative Description (Sgall et al., 1986) assumes a language-independent underlying order, which is represented as a projective dependency tree. This abstract representation of the sentence is mapped via ordering rules to the concrete surface realization. Recently, Kruijff (1997) has given a categorialstyle formulation of these ordering rules. He assumes associative categorial operators, permuting the arguments to yield the surface ordering.</Paragraph>
    <Paragraph position="2"> One difference to our proposal is that we argue for a representational account of word order (based on valid structures representing word order), eschewing the non-determinism introduced by unary categorial operators; the second difference is the avoidance of an underlying structure, which stratifies the theory and makes incremental processing difficult.</Paragraph>
    <Paragraph position="3"> Meaning-Text Theory (Melc'flk, 1988) assumes seven strata of representation. The rules mapping from the unordered dependency trees of surface-syntactic representations onto the annotated lexeme sequences of deep-morphological representations include global ordering rules which allow discontinuities. These rules have not yet been formally specified (Melc'~k &amp; Pertsov, 1987p.187f) (but see the proposal by Rambow &amp; Joshi (in print)).</Paragraph>
    <Paragraph position="4">  Word Grammar (WG, Hudson (1990)) is based on general graphs instead of trees. The ordering of two linked words is specified together with their dependency relation, as in the proposition &amp;quot;object of verb follows it&amp;quot;. Extraction of, e.g., objects is analyzed by establishing an additional dependency called visitor between the verb and the extractee, which requires the reverse order, as in &amp;quot;visitor of verb precedes it&amp;quot;. Resulting inconsistencies, e.g. in case of an extracted object, are not resolved. This approach compromises the semantic motivation of dependencies by adding purely order-induced dependencies.</Paragraph>
    <Section position="1" start_page="30" end_page="30" type="sub_section">
      <SectionTitle>
Dependency Unification Grammar
</SectionTitle>
      <Paragraph position="0"> (DUG, Hellwig (1986)) defines a tree-like data structure for the representation of syntactic analyses. Using morphosyntactic features with special interpretations, a word defines abstract positions into which modifiers are mapped. Partial orderings and even discontinuities can thus be described by allowing a modifier to occupy a position defined by some transitive head. The approach requires that the parser interprets several features in a special way, and it cannot restrict the scope of discontinuities.</Paragraph>
      <Paragraph position="1"> Slot Grammar (McCord, 1990) employs a number of rule types, some of which are exclusively concerned with precedence. So-called head/slot and slot/slot ordering rules describe the precedence in projective trees, referring to arbitrary predicates over head and modifiers.</Paragraph>
      <Paragraph position="2"> Extractions (i.e., discontinuities) are merely handled by a mechanism built into the parser.</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="30" end_page="32" type="metho">
    <SectionTitle>
3 Word Order Domains
</SectionTitle>
    <Paragraph position="0"> Extending the previous discussion, we require the following of a word order description for DG: * not to compromise the semantic motivation of dependencies, * to be able to restrict discontinuities to certain constructions and delimit their scope, * to be lexicalized without requiring lexical ambiguities for the representation of ordering alternatives, * to be declarative (i.e., independent of an analysis procedure), and</Paragraph>
    <Paragraph position="2"> * to be formally precise and consistent.</Paragraph>
    <Paragraph position="3"> The subsequent definition of an order domain structure and its linking to the dependency tree satisify these requirements.</Paragraph>
    <Section position="1" start_page="30" end_page="30" type="sub_section">
      <SectionTitle>
3.1 The Order Domain Structure
</SectionTitle>
      <Paragraph position="0"> A word order domain is a set of words, generalizing the notion of positions in DUG. The cardinality of an order domain may be restricted to at most one element, at least one element, or - by conjunction - to exactly one element.</Paragraph>
      <Paragraph position="1"> Each word is associated with a sequence of order domains, one of which must contain the word itself, and each of these domains may require that its elements have certain features. Order domains can be partially ordered based on set inclusion: If an order domain d contains word w (which is not associated with d), every word w' contained in a domain d t associated with w is also contained in d; therefore, d' C d for each d' associated with w. This partial ordering induces a tree on order domains, which we call the order domain structure. The order domain structure constitutes a projective tree over the input, where order domains loosely correspond to partial phrases.</Paragraph>
      <Paragraph position="2">  (1) Den Mann hat der Junge gesehen.</Paragraph>
      <Paragraph position="3"> the manAcc has the bOyNOM seen 'The boy has seen the man.'  Take the German example (1). Its dependency tree is shown in Fig. 1, with word order domains indicated by dashed circles. The finite verb, &amp;quot;hat&amp;quot;, defines a sequence of domains, (dl, d2, d3), which roughly correspond to the topological fields in the German main clause. The nouns and the participle each define a single order domain. Set inclusion gives rise to the domain structure in Fig. 2, where the individual words are attached by dashed lines to their including domains.</Paragraph>
    </Section>
    <Section position="2" start_page="30" end_page="31" type="sub_section">
      <SectionTitle>
3.2 Surface Ordering
</SectionTitle>
      <Paragraph position="0"> How is the surface order derived from an order domain structure? First of all, the ordering of domains is inherited by their respective elements, i.e., &amp;quot;Mann&amp;quot; precedes (any element of) d2, &amp;quot;hat&amp;quot; follows (any element of) dl, etc.</Paragraph>
      <Paragraph position="1"> Ordering within a domain, e.g., of &amp;quot;hat&amp;quot; and d6, or ds and d6, is based on precedence predicates (adapting the precedence predicates of WG). There are two different types, one ordering a word with respect to any other element of the domain it is associated with (e.g., &amp;quot;hat&amp;quot; with respect to d6), and another ordering two moditiers, referring to the dependency relations they occupy (d5 and d6, referring to subj and vpart).</Paragraph>
      <Paragraph position="2"> A verb like &amp;quot;hat&amp;quot; introduces three precedence predicates, requiring other words (within the same domain) to follow itself and the participle to follow subject and object, resp.: 1 &amp;quot;hat&amp;quot; =&gt; &lt;.</Paragraph>
      <Paragraph position="3"> A subj &lt; vpart A obj &lt; vpart Informally, the first conjunct is satisfied by ally domain in which no word precedes &amp;quot;hat&amp;quot;, and the second and third conjuncts are satisfied by any domain ill which no subject or object follows a participle (vpart). The obj must be mentioned for &amp;quot;hat&amp;quot;, although &amp;quot;hat&amp;quot; does not directly govern objects, because objects may be placed by &amp;quot;hat&amp;quot;, and not their immediate governors. The domain structure in Fig.2 satisfies these restrictions since nothing follows the participle, and because &amp;quot;den Mann&amp;quot; is not an element of (\]2, which contains &amp;quot;hat&amp;quot;. This is an important interaction of order domains and precedence predicates: Order domains define scopes</Paragraph>
      <Paragraph position="5"> for precedence predicates. In this way, we take into account that dependency trees are flatter than PS-based ones 2 and avoid the formal inconsistencies noted above for WG.</Paragraph>
    </Section>
    <Section position="3" start_page="31" end_page="32" type="sub_section">
      <SectionTitle>
3.3 Linking Domain Structure and
Dependency Tree
</SectionTitle>
      <Paragraph position="0"> Order domains easily extend to discontinuous dependencies. Consider the non-projective tree in Fig.1. Assuming that the finite verb governs the participle, no projective dependency between the object &amp;quot;den Mann&amp;quot; and the participle &amp;quot;gesehen&amp;quot; can be established. We allow non-projectivity by loosening the linking between dependency tree and domain structure: A modifier (e.g., &amp;quot;Mann&amp;quot;) may not only be inserted into a domain associated with its direct head (&amp;quot;gesehen&amp;quot;), but also into a domain of a transitive head (&amp;quot;hat&amp;quot;), which we will call the positional head.</Paragraph>
      <Paragraph position="1"> The possibility of inserting a word into a domain of some transitive head raises the questions of how to require continuity (as needed in nmst cases), and how to limit the distance between the governor and the modifier. Both questions will be soh,ed with reference to the dependency relation. From a descriptive viewpoint, the syntactic construction is often cited to determine the possibility and scope of discontinuities (Bhatt, 1990; Matthews, 1981). In PS-based accounts, the construction is represented by phrasal categories, and extraction is limited 1)3-&amp;quot; bounding nodes (e.g., Haegeman (1994), Becker et al. (1991)). In dependency-based accounts, the construction is represented by the dependency relation, which is typed or labelled to indicate constructional distinctions which are configurationally defined in PSG. Given this correspondence, it is natural to employ dependencies in the description of discontinuities as follows: For each modifier, a set of dependency types is defined which may link the direct head and the positional head of the modifier (&amp;quot;gesehen&amp;quot; and &amp;quot;hat&amp;quot;, respectively). If this set is empty, both heads are identical and a continuous attachment results. The impossibility of extraction from, e.g., a finite verb phrase follows from the fact that the dependency embedding finite verbs, propo, may not appear on any path 2Note that each phrasal level in PS-based trees defines a scope for linear precedence rules, which only apply to sister nodes.</Paragraph>
      <Paragraph position="2">  between a direct and a positional head.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="32" end_page="33" type="metho">
    <SectionTitle>
4 A Brief Review of LFG
</SectionTitle>
    <Paragraph position="0"> This section introduces key concepts of LFG which are of interest in Sec. 5 and is necessarily very short. Further information can be found in Bresnan &amp; Kaplan (1982) and Dalrymple et al.</Paragraph>
    <Paragraph position="1"> (1995).</Paragraph>
    <Paragraph position="2"> LFG posits several different representation levels, called projections. Within a projection, a certain type of linguistic knowledge is represented, which explains differences in the formal setup (data types and operations) of the projections. The two standard projections, and those used here, are the constituent (c-) structure and the functional (f-) structure (Kaplan (1995) and Halvorsen &amp; Kaplan (1995) discuss the projection idea in more detail). C-structure is defined in terms of context-free phrase structure rules, and thus forms a projective tree of categories over the input. It is assumed to encode language particularities with respect to the set of categories and the possible orderings. The f-structure is constructed fi'om additional annotations attached to the phrase structure rules, and has the form of an attribute-value matrix or feature structure. It is assumed to represent more or less langnage-independent information about grammatical functions and predicate-argument structure. In addition to the usual unification operation, LFG employs existential and negative constraints on features, which allow the fornmlation of constraints about the existence of features without specifying the associated value.</Paragraph>
    <Paragraph position="3"> Consider the following rules, which are used for illustration only and do not constitute a canonical LFG analysis.</Paragraph>
    <Paragraph position="5"> Assuming reasonable lexical insertion rules, the context-free part of these rules assigns the c-structure to the left of Fig. 3 to example (1). The annotations are associated with right-hand side elements of the rules and define the</Paragraph>
    <Paragraph position="7"> f-structure of the sentence, which is displayed to the right of Fig. 3. Each c-structure node is associated with an f-structure node as shown by the arrows. The f-structure node associated with the left-hand side of a rule may be accessed with the $ metavariable, while the f-structure node of a right-hand side element may be accessed with the $ metavariable. The mapping from c-structure nodes to f-structure nodes is not oneto-one, however, since the feature structures of two distinct c-structure nodes may be identified (via the $=$ annotation), and additional embedded features may be introduced (such as CASE). Assuming that only finite verbs carry the TENSE feature, the existential constraint ($TENSE) requires a finite verb at the beginning of the VP, while the negative constraint .~($TENSE) forbids finite verbs at the end of the VP. Note that unspecified feature structures are displayed as \[ \] in the figure, and that much more information (esp. predicate-argument information) will come from the lexical entries.</Paragraph>
    <Paragraph position="8"> Another important construct of LFG is functional uncertainty (Kaplan &amp; Zaenen, 1995; Kaplan &amp; Maxwell, 1995). Very often (most notably, in extraction or control constructions) the path of f-structure attributes to write down is indeterminate. In this case, one may write down a description of this path (using a regular language over attribute names) and let the parser check every path described (possibly resulting in ambiguities warranted by f-structure differences only). Our little grammar may be extended to take advantage of functional uncertainty in two ways. First, if you want to permute subject and object (as is possible in German), you might change the S rule to the following:</Paragraph>
    <Paragraph position="10"> The f-structure node of the initial NP may now be inserted in either the OBJ or the SUBJ attribute of the sentence's f-structure, which is expressed by the disjunction {OBJiSUBJ} in the annotation. (Of course, you have to restrict the CASE feature suitably, which can be done in the verb's subcategorization.) The other regular notation which we will use is the Kleene star.</Paragraph>
    <Paragraph position="11"> Assume a different f-structure analysis, where the object of infinite verbs is embedded under VCOMP. The S rule from above would have to be changed to the following:</Paragraph>
    <Paragraph position="13"> But this rule will only analyse verb groups with zero or one auxiliary, because the VCOMP attribute is optional in the path description.</Paragraph>
    <Paragraph position="14"> Examples like Den Mann will der Junge gesehen haben with several auxiliaries are not covered, because the main verb is embedded under (VCOMP VCOMP). The natural solution is to use the Kleene star as follows, which allows zero or more occurrences of the attribute VCOMP.</Paragraph>
    <Paragraph position="16"> A property which is important for our use of functional uncertainty is already evident from these examples: Functional uncertainty is nonconstructive, i.e., the attribute paths derived from such an annotation are not constructed anew (which in case of the Kleene star would lead to infinitely many solutions), but must already exist in the f-structure.</Paragraph>
    <Paragraph position="18"/>
  </Section>
  <Section position="6" start_page="33" end_page="117" type="metho">
    <SectionTitle>
5 Encoding DG in LFG
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="33" end_page="34" type="sub_section">
      <SectionTitle>
5.1 The Implementation Platform
</SectionTitle>
      <Paragraph position="0"> The plattform used is the Xerox Lin- structure is a projective tree over the input. So guistic Environment (XLE, see also it is natural to encode the domain structure in http://www.parc.xerox, com/istl/groups/nltt/xlef~ ntext'free rules, resulting in a tree as shown which implements a large part of LFG theory in Fig. 4. Categories which have a status as orplus a number of abbreviatory devices. It der domains are named dora*, to be distinguishincludes a parser, a generator, support for two-level morphology and different types of lexica as well as a user-friendly graphical interface with the ability to browse through the set of analyses, to work in batch mode for testing purposes, etc.</Paragraph>
      <Paragraph position="1"> We will be using two abbreviatory devices below, which are shortly introduced here. Both do not show up in the final output, rather they allow the grammar writer to state various generalizations more succintly. The first is the so-called metacategory, which allows several c-structure categories to be merged into one. So if we are writing (2), we introduce a metacategory domVfin (representing the domain sequence of finite verbs) to be used in other rules, but we will never see such a category in the c-structure.</Paragraph>
      <Paragraph position="2"> Rather, the expansion of the metacategory is directly attached to the mother node of the metacategory (cf. Fig. 4).</Paragraph>
      <Paragraph position="3"> (2) domVfin = domINITIAL domMIDDLE domFINAL able from preterminal categories (such as Vfin, I, ... ; these cannot be converted to metacategories). As notational convention, domC will be the name of the (meta)category defining the order domain sequence for a word of class C. Eliminating the preterminal categories yields exactly the domain structure given in Fig. 2.</Paragraph>
      <Paragraph position="4"> A complete algorithmic description of how to derive phrase-structure rules from order domain definitions would require a lenghty introduction to more of XLE's c-structure constructs, and therefore we illustrate the conversion with hand-coded rules. For example, a noun introduces one order domain without cardinality restrictions. Assuming a metacategory DOMAIN standing for an arbitrary domain, we define the following rules for the domain sequences of nouns, full stops, and determiners:</Paragraph>
      <Paragraph position="6"> The second abbreviatory construct is the template, which groups several functional annotations under one heading, possibly with some parameters. A very important template is the VALENCY template defined in (3), which defines a dependency relation on f-structure (see below for discussion). We require three parameters (each introduced by underscore), the first of which indicates optionality (opt vs. req values), the second gives the name of the dependency relation, and the third the word class required of the modifier. (4) shows a usage of a template, which begins with an @ (at) sign and lists the template name with any parameters enclosed in parentheses.</Paragraph>
      <Paragraph position="7">  A complex example is the finite verb, which introduces three domains, each with different cardinality restrictions. This is encoded in the following rules: domVfin ffi domINITIAL domMIDDLE domFINAL.</Paragraph>
      <Paragraph position="8">  (6) domINITIAL~ DOMAIN.</Paragraph>
      <Paragraph position="10"> Note tile use of a metacategory here, which does not appear in tlle c-structure output (as seen in Fig. 4), but still allows you to refer to all elements placed by a finite verb in one word.</Paragraph>
      <Paragraph position="11"> The definition of DOMAIN is trivial: It is just a metacategory expandable to every domain: 3 aA number of efficiency optimizations can be directly compiled into these c-structure rules. Mentioning DOMAIN is much too permissive in most cases (e.g., within the NP), and can be optimized to allow only domains introduced by words which may actually be modifiers at this point.</Paragraph>
      <Paragraph position="12">  (7) DOMAIN = { domVfin I domI I domN I domD }.</Paragraph>
    </Section>
    <Section position="2" start_page="34" end_page="34" type="sub_section">
      <SectionTitle>
5.3 Valencies and Dependency
Relations
</SectionTitle>
      <Paragraph position="0"> The dependency tree is, at least in our approach, an unordered tree with labelled relations between nodes representing words. This is very similar to the formal properties of the fstructure, which we will therefore use to encode it. We have already presented the VALENCY template in (3) and will now explain it. {.-- I &amp;quot;&amp;quot;} represents a disjunction of possibilities, and the parameter _o (for optionality)controls their selection. In case we provide the opt value, there * is an option to forbid the existence of the dependency, expressed by the negative constraint --~($_d). Regardless of the value of _o, there is another option to introduce an attribute named _d (for dependency) which contains a CLASS attribute with a value specified by the third parameter, _c. The existential constraint for the LEXEME attribute requires that some other word (which specifies a LEXFA~IE) is unified into the feature _d, thereby filling this valency slot.</Paragraph>
      <Paragraph position="1"> The use of a defining constraint for the CLASS attribute constructs the feature, allowing nonconstructive functional uncertainty to fill in the modifier (as explained below).</Paragraph>
      <Paragraph position="2"> A typical lexical entry is shown in (8), where the surface form is followed by the c-structure category and some template invocations. These expand to annotations defining the CLASS and LEXEME features, and use the VALENCY template to define the valency frame.</Paragraph>
    </Section>
    <Section position="3" start_page="34" end_page="117" type="sub_section">
      <SectionTitle>
5.4 Continuous and Discontinuous
Attachment
</SectionTitle>
      <Paragraph position="0"> So far we get only a c-structure where words are associated with f-structures containing valency frames. To get the f-structure shown in Fig. 5~ (numbers refer to c-structure node numbers of Fig. 4) we need to establish dependency relations, i.e., need to put the f-structures associated with preterminal nodes together into one large f-structure. Establishing dependency relations between the words relies heavily on the mechanism of functional uncertainty. First, we must identify on f-structure the head of each order domain sequence. For this, we annotate in every c-structure rule the category of the head word with the template ~(HEAV), which identifies the head word's f-structure with the order domain's f-structure (cf. (9)). Second, all other c-structure categories (which represent modifiers) are annotated with the ~(MODIFIER) template defined in (10). This template states that the f-structure of the modifier (referenced by .~) may be placed under some dependency attribute path of the f-structure of the head (referenced by ~). These paths are of the form p d, where p is a (possibly empty) regular expression over dependency attributes, and d is a dependency attribute, d names the dependency relation the modifier finally fills, while p describes the path of dependencies which may separate the positional from the direct head of the modifier. The MODIFIER template thus completely describes the legal discontinuities: If p is empty for a dependency d, modifiers in dependency d are always continuously attached (i.e., in an order domain defined by their direct head). This is thecase for the subject (in dependency SUB J) and the determiner (in dependency SPEC), in this example. On the other hand, a non-empty path p allows the modifier to 'float up' the dependency tree to any transitive head reachable via p. In our example, objects depending on participles may thus float into domains of the finite verb (across VPART dependencies), and relative clauses (in dependency RELh) may float from the noun's domain into the finite verb's domains.</Paragraph>
      <Paragraph position="1">  (9) HEAD = I=$.</Paragraph>
      <Paragraph position="3"> The grammar defined so far overgenerates in that, e.g., relative clauses may be placed into the middle field. To require placement in specific domains, additional features are used, which distinguish topological fields (e.g., via ($FIELD) = middle annotations on c-structure). A relative clause can then be constrained to occur only in the final field by adding constraints on these features. This mechanism is very similar to describing agreement or government (e.g., of case or number), which also uses standard features not discussed here. With these additions, the final rules for finite verbs look as follows:</Paragraph>
    </Section>
    <Section position="4" start_page="117" end_page="117" type="sub_section">
      <SectionTitle>
5.5 Missing Links
</SectionTitle>
      <Paragraph position="0"> As is to be expected if you use something for purposes it was not designed to be used for, there are some missing links. The most prominent one is the lack of binary precedence predicates over dependency relations. There is, however, a close relative, which might be used for implementing precedence predicates. Zaenen &amp; Kaplan (1995) introduced f-precedence &lt;! into LFG, which allows to express on f-structure constraints on the order of the c-structure nodes mapping to the current f-structure. So we might write the following annotations to order the finite verb with respect to its modifiers, or to or- null der subject and object.</Paragraph>
      <Paragraph position="1"> (12) (T) &lt;/ (T{SUBJIOBJ\[VPART}).</Paragraph>
      <Paragraph position="2"> (J'SUBJ) &lt;/ (J'oBa).</Paragraph>
      <Paragraph position="3">  Tile problem with f-precedence, however, is that is does not respect the scope restrictions which we defined for precedence predicates* I.e., a topicalized object is not exempt from the above constraints, and thus would result in parsing failure. To restrict the scope of f-precedence to order domains (aka, certain c-structure categories) would require an explicit encoding of these domains on f-structure*</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>
Download Original XML