<?xml version="1.0" standalone="yes"?> <Paper uid="P98-1026"> <Title>Separating Surface Order and Syntactic Relations in a Dependency Grammar</Title> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Recently, the concept of valency has gained considerable attention. Not only do all linguistic theories refer to some reformulation of the traditional notion of valency (in the form of θ-grid, subcategorization list, argument list, or extended domain of locality); there is also a growing number of parsers based on binary relations between words (Eisner, 1997; Maruyama, 1990).</Paragraph> <Paragraph position="1"> Given this interest in the valency concept, and the fact that word order is one of the main differences between phrase-structure-based approaches (henceforth PSG) and dependency grammar (DG), it is valid to ask whether DG can capture word order phenomena without recourse to phrasal nodes, traces, slashed categories, etc. A very early result on the weak generative equivalence of context-free grammars and DGs suggested that DGs are incapable of describing surface word order (Gaifman, 1965).</Paragraph> <Paragraph position="2"> This result has recently been criticized as applying only to impoverished DGs which do not properly reflect the formal expressivity of contemporary DG variants (Neuhaus & Bröker, 1997).</Paragraph> <Paragraph position="3"> Our position will be that dependency relations are motivated semantically (Tesnière, 1959), and need not be projective (i.e., may cross if projected onto the surface ordering). We argue for so-called word order domains, consisting of partially ordered sets of words and associated with nodes in the dependency tree. These order domains constitute a tree defined by set inclusion, and surface word order is determined by traversing this tree. A syntactic analysis therefore consists of two linked, but dissimilar trees. Sec. 2 will briefly review approaches to word order in DG. In Sec. 3, word order domains will be defined, and Sec. 4 introduces a modal logic to describe dependency structures. Sec. 5 applies our approach to the German clause, and Sec. 6 relates it to some PSG approaches.</Paragraph> </Section> <Section position="3" start_page="0" end_page="174" type="metho"> <SectionTitle> 2 Word Order in DG </SectionTitle> <Paragraph position="0"> A very brief characterization of DG is that it recognizes only lexical, not phrasal, nodes, which are linked by directed, typed, binary relations to form a dependency tree (Tesnière, 1959; Hudson, 1993). The following overview of DG flavors shows that various mechanisms (global rules, general graphs, procedural means) are generally employed to lift the limitation of projectivity; it also discusses some shortcomings of these proposals.</Paragraph> <Paragraph position="1"> Functional Generative Description (Sgall et al., 1986) assumes a language-independent underlying order, which is represented as a projective dependency tree. This abstract representation of the sentence is mapped via ordering rules to the concrete surface realization. Recently, Kruijff (1997) has given a categorial-style formulation of these ordering rules. He assumes associative categorial operators which permute the arguments to yield the surface ordering.
One difference from our proposal is that we argue for a representational account of word order (based on valid structures representing word order), eschewing the non-determinism introduced by unary operators; the second difference is the avoidance of an underlying structure, which stratifies the theory and makes incremental processing difficult.</Paragraph> <Paragraph position="2"> Meaning-Text Theory (Mel'čuk, 1988) assumes seven strata of representation. The rules mapping from the unordered dependency trees of surface-syntactic representations onto the annotated lexeme sequences of deep-morphological representations include global ordering rules which allow discontinuities. These rules have not yet been formally specified (Mel'čuk & Pertsov, 1987, p. 187).</Paragraph> <Paragraph position="3"> Word Grammar (WG, Hudson (1990)) is based on general graphs instead of trees. The ordering of two linked words is specified together with their dependency relation, as in the proposition &quot;object of verb follows it&quot;. Extraction of, e.g., objects is analyzed by establishing an additional dependency called visitor between the verb and the extractee, which requires the reverse order, as in &quot;visitor of verb precedes it&quot;. This results in inconsistencies, since an extracted object must follow the verb (being its object) and at the same time precede it (being its visitor). The approach compromises the semantic motivation of dependencies by adding purely order-induced dependencies. WG is similar to our proposal in that it also distinguishes a propositional meta language describing the graph-based analysis structures.</Paragraph> <Section position="1" start_page="174" end_page="174" type="sub_section"> <SectionTitle> Dependency Unification Grammar </SectionTitle> <Paragraph position="0"> (DUG, Hellwig (1986)) defines a tree-like data structure for the representation of syntactic analyses. Using morphosyntactic features with special interpretations, a word defines abstract positions into which modifiers are mapped. Partial orderings and even discontinuities can thus be described by allowing a modifier to occupy a position defined by some transitive head. The approach requires that the parser interpret several features in a special way, and it cannot restrict the scope of discontinuities.</Paragraph> <Paragraph position="1"> Slot Grammar (McCord, 1990) employs a number of rule types, some of which are exclusively concerned with precedence.
So-called head/slot and slot/slot ordering rules describe the precedence in projective trees, referring to arbitrary predicates over heads and modifiers.</Paragraph> <Paragraph position="2"> Extractions (i.e., discontinuities) are handled solely by a mechanism built into the parser.</Paragraph> </Section> </Section> <Section position="4" start_page="174" end_page="176" type="metho"> <SectionTitle> 3 Word Order Domains </SectionTitle> <Paragraph position="0"> Summarizing the previous discussion, we require the following of a word order description for DG: * not to compromise the semantic motivation of dependencies, * to be able to restrict discontinuities to certain constructions and delimit their scope, * to be lexicalized without requiring lexical ambiguities for the representation of ordering alternatives, * to be declarative (i.e., independent of an analysis procedure), and * to be formally precise and consistent.</Paragraph> <Paragraph position="1"> The subsequent definition of an order domain structure and its linking to the dependency tree satisfy these requirements.</Paragraph> <Section position="1" start_page="174" end_page="175" type="sub_section"> <SectionTitle> 3.1 The Order Domain Structure </SectionTitle> <Paragraph position="0"> A word order domain is a set of words, generalizing the notion of positions in DUG. The cardinality of an order domain may be restricted to at most one element, at least one element, or (by conjunction) to exactly one element. Each word is associated with a sequence of order domains, one of which must contain the word itself, and each of these domains may require that its elements have certain features. Order domains can be partially ordered based on set inclusion: If an order domain d contains word w (which is not associated with d), every word w′ contained in a domain d′ associated with w is also contained in d; therefore d′ ⊆ d for each d′ associated with w. This partial ordering induces a tree on order domains, which we call the order domain structure.</Paragraph> <Paragraph position="1"> Take the example of German &quot;Den Mann hat der Junge gesehen&quot; (&quot;the man-ACC - has - the boy-NOM - seen&quot;). Its dependency tree is shown in Fig. 1, with word order domains indicated by dashed circles. [Fig. 1: Dependency tree with word order domains for &quot;Den Mann hat der Junge gesehen&quot;] The finite verb, &quot;hat&quot;, defines a sequence of domains, <d1, d2, d3>, which roughly correspond to the topological fields in the German main clause. The nouns &quot;Mann&quot; and &quot;Junge&quot; and the participle &quot;gesehen&quot; each define one order domain (d4, d5, d6, resp.). Set inclusion gives rise to the domain structure in Fig. 2, where the individual words are attached by dashed lines to their including domains (d1 and d4 collapse, being identical). Note that in this case we have not a single rooted tree, but rather an ordered sequence of trees (by virtue of ordering d1, d2, and d3) as domain structure. In general, we assume the sentence period to govern the finite verb and to introduce a single domain for the complete sentence.</Paragraph> </Section>
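The tree induced by set inclusion can be made concrete with a small sketch. The following Python fragment (our own illustration, not the paper's formalism; domain contents follow Fig. 2, and all variable names are assumptions) derives each domain's parent as its smallest proper superset:

```python
# Sketch: deriving the order domain structure (a tree under set
# inclusion) for "Den Mann hat der Junge gesehen". Domain contents
# follow Fig. 2; d3 (the empty final field) is omitted, and d1 = d4.
domains = {
    "d1": {"Den", "Mann"},                     # initial field (= d4)
    "d2": {"hat", "der", "Junge", "gesehen"},  # middle field
    "d5": {"der", "Junge"},                    # domain of "Junge"
    "d6": {"gesehen"},                         # domain of "gesehen"
}

def parent(name):
    """Smallest proper superset of a domain, or None for a root."""
    d = domains[name]
    supersets = [n for n, e in domains.items() if d < e]
    return min(supersets, key=lambda n: len(domains[n]), default=None)

for name in domains:
    print(name, "->", parent(name))
# d1 -> None, d2 -> None, d5 -> d2, d6 -> d2: an ordered sequence of
# trees rather than a single rooted tree, as noted for this example.
```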
<Section position="2" start_page="175" end_page="175" type="sub_section"> <SectionTitle> 3.2 Surface Ordering </SectionTitle> <Paragraph position="0"> How is the surface order derived from an order domain structure? First of all, the ordering of domains is inherited by their respective elements, i.e., &quot;Mann&quot; precedes (any element of) d2, &quot;hat&quot; follows (any element of) d1, etc.</Paragraph> <Paragraph position="1"> Ordering within a domain, e.g., of &quot;hat&quot; and d6, or of d5 and d6, is based on precedence predicates (adapting the precedence predicates of WG). There are two different types, one ordering a word w.r.t. any other element of the domain it is associated with (e.g., &quot;hat&quot; w.r.t. d6), and another ordering two modifiers, referring to the dependency relations they occupy (d5 and d6, referring to subj and vpart). A verb like &quot;hat&quot; introduces two precedence predicates, requiring other words to follow itself and the participle to follow subject and object, respectively (for details of the notation, see Sec. 4): &quot;hat&quot; → (<. ∧ ⟨vpart⟩>{subj,obj}) Informally, the first conjunct is satisfied by any domain in which no word precedes &quot;hat&quot;, and the second conjunct is satisfied by any domain in which no subject or object follows a participle. The domain structure in Fig. 2 satisfies these restrictions, since nothing follows the participle, and because &quot;den Mann&quot; is not an element of d2, which contains &quot;hat&quot;. This is an important interaction of order domains and precedence predicates: Order domains define scopes for precedence predicates. In this way, we take into account that dependency trees are flatter than PS-based ones (note that each phrasal level in PS-based trees defines a scope for linear precedence rules, which apply only to sister nodes) and avoid the formal inconsistencies noted above for WG.</Paragraph> </Section>
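To make the domain-as-scope idea concrete, here is a minimal sketch (our own illustration; pos, dep, and both function names are assumed, not the paper's notation) of the two predicate types evaluated inside a single domain:

```python
# Sketch: checking the two precedence predicates for "hat" inside its
# domain d2 only -- the domain is the scope, per Sec. 3.2.
pos = {"Den": 0, "Mann": 1, "hat": 2, "der": 3, "Junge": 4, "gesehen": 5}
dep = {"Mann": "obj", "Junge": "subj", "gesehen": "vpart"}
d2 = {"hat", "der", "Junge", "gesehen"}

def precedes_all(word, domain):
    """<.: no other element of the domain precedes the word."""
    return all(pos[word] <= pos[w] for w in domain)

def follows_rels(head_rel, rels, domain):
    """(head_rel)>rels: within the domain, every word bearing head_rel
    follows every word bearing one of the relations in rels."""
    targets = [w for w in domain if dep.get(w) == head_rel]
    others = [w for w in domain if dep.get(w) in rels]
    return all(pos[o] < pos[t] for t in targets for o in others)

print(precedes_all("hat", d2))                     # True
print(follows_rels("vpart", {"subj", "obj"}, d2))  # True: "Mann" is not in d2
```

Both predicates hold precisely because &quot;den Mann&quot; sits outside d2; evaluated over the whole sentence instead, the second predicate would fail.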
<Section position="3" start_page="175" end_page="176" type="sub_section"> <SectionTitle> 3.3 Linking Domain Structure and Dependency Tree </SectionTitle> <Paragraph position="0"> Order domains easily extend to discontinuous dependencies. Consider the non-projective tree in Fig. 1. Assuming that the finite verb governs the participle, no projective dependency between the object &quot;den Mann&quot; and the participle &quot;gesehen&quot; can be established. We allow non-projectivity by loosening the linking between dependency tree and domain structure: A modifier (e.g., &quot;Mann&quot;) may be inserted not only into a domain associated with its direct head (&quot;gesehen&quot;), but also into a domain of a transitive head (&quot;hat&quot;), which we will call the positional head.</Paragraph> <Paragraph position="1"> The possibility of inserting a word into a domain of some transitive head raises the questions of how to require contiguity (as needed in most cases), and how to limit the distance between the governor and the modifier in the case of discontinuity. From a descriptive viewpoint, the syntactic construction is often cited as determining the possibility and scope of discontinuities (Bhatt, 1990; Matthews, 1981). In PS-based accounts, the construction is represented by phrasal categories, and extraction is limited by bounding nodes (e.g., Haegeman (1994), Becker et al. (1991)). In dependency-based accounts, the construction is represented by the dependency relation, which is typed or labelled to indicate constructional distinctions that are configurationally defined in PSG. Given this correspondence, it is natural to employ dependencies in the description of discontinuities as follows: For each modifier of a certain head, a set of dependency types is defined which may link the direct head and the positional head of the modifier (&quot;gesehen&quot; and &quot;hat&quot;, resp.). If this set is empty, both heads are identical and a contiguous attachment results. The impossibility of extraction from, e.g., a finite verb phrase may follow from the fact that the dependency embedding finite verbs, propo, may not appear on any path between a direct and a positional head. (One reviewer pointed out that some verbs may allow extractions, i.e., that this restriction is lexical, not universal. This fact is easily accommodated, because the possibility of discontinuity (and the dependency types across which the modifier may be extracted) is described in the lexical entry of the verb. In fact, a universal restriction could not even be stated, because the treatment is completely lexicalized.)</Paragraph> </Section> </Section> <Section position="5" start_page="176" end_page="176" type="metho"> <SectionTitle> 4 The Description Language </SectionTitle> <Paragraph position="0"> This section sketches a logical language describing the dependency structure. It is based on modal logic and owes much to the work of Blackburn (1994). As he argues, standard Kripke models can be regarded as directed graphs with node annotations. We will use this interpretation to represent dependency structures. Dependencies and the mapping from dependency tree to order domain structure are described by modal operators, while simple properties such as word class, features, and cardinality of order domains are described by modal propositions.</Paragraph> <Section position="1" start_page="176" end_page="176" type="sub_section"> <SectionTitle> 4.1 Model Structures </SectionTitle> <Paragraph position="0"> In the following, we assume a set of words, W, ordered by a precedence relation, ≺, a set of dependency types, D, a set of atomic feature values, A, and a set of word classes, C. We define a family of dependency relations R_d ⊆ W × W, one for each d ∈ D. A set M of order domains over W is required to satisfy ∀m, m′ ∈ M : m ∩ m′ = ∅ ∨ m ⊆ m′ ∨ m′ ⊆ m, i.e., any two domains are either disjoint or related by inclusion.</Paragraph> <Paragraph position="1"> Def.: A dependency structure T is a tuple ⟨W, w_r, R_D, V_A, V_C, M, V_M⟩ where ⟨W, w_r, R_D, V_A, V_C⟩ is a dependency tree, M is an order domain structure over W, and V_M : W → M^n maps words to order domain sequences.</Paragraph> <Paragraph position="2"> Additionally, we require four more conditions of a dependency structure: (1) Each word w is contained in exactly one of the domains from V_M(w), (2) all domains in V_M(w) are pairwise disjoint, (3) each word (except w_r) is contained in at least two domains, one of which is associated with a (transitive) head, and (4) the (partial) ordering of domains (as described by V_M) is consistent with the precedence of the words contained in the domains (see Bröker (1997) for more details).</Paragraph> </Section>
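Conditions (1)-(3) translate directly into executable checks. The following sketch is our own illustration under assumed data structures (V_M maps a word to its domain sequence, head gives the direct head); condition (4) is only stubbed, since it additionally needs the global precedence relation:

```python
# Sketch: validating the extra well-formedness conditions of Sec. 4.1
# over a dependency structure given as plain Python data.
from itertools import combinations

def transitive_heads(w, head):
    """Yield all (transitive) heads of w, following the head map."""
    h = head.get(w)
    while h is not None:
        yield h
        h = head.get(h)

def check_structure(words, V_M, head, root):
    for w in words:
        doms = V_M[w]
        # (1) w lies in exactly one of its own domains
        assert sum(w in d for d in doms) == 1, f"(1) fails for {w}"
        # (2) the domains of w are pairwise disjoint
        assert all(not (a & b) for a, b in combinations(doms, 2)), \
            f"(2) fails for {w}"
        # (3) every non-root word also lies in a domain of a transitive
        #     head -- its positional head -- hence in >= 2 domains total
        if w != root:
            assert any(w in d for h in transitive_heads(w, head)
                       for d in V_M[h]), f"(3) fails for {w}"
    # (4) consistency of domain order with word precedence: omitted
    #     here; it requires the precedence relation over all of W.
    return True
```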
<Section position="2" start_page="176" end_page="176" type="sub_section"> <SectionTitle> 4.2 The Language L_D </SectionTitle> <Paragraph position="0"> Fig. 3 defines the logical language L_D used to describe dependency structures. Although they have been presented differently, dependency structures can easily be rewritten as (multimodal) Kripke models: The dependency relation R_d is represented as the modality ⟨d⟩, and the mapping from a word to its i-th order domain as a further modality. All other formulae denote properties of nodes and can be formulated as unary predicates, most evidently for word class and feature assignment. For the precedence predicates <. and <_d, there are inverses >. and >_d. For presentation, the relation places ⊆ W × W has been introduced, which holds between two words iff the first argument is the positional head of the second argument.</Paragraph> <Paragraph position="1"> A more elaborate definition of dependency structures and L_D defines two more dimensions: a feature graph mapped off the dependency tree, much like the proposal of Blackburn (1994), and a conceptual representation based on terminological logic, linking content words with reference objects and dependencies with conceptual roles.</Paragraph> </Section> </Section> <Section position="6" start_page="176" end_page="178" type="metho"> <SectionTitle> 5 The German Clause </SectionTitle> <Paragraph position="0"> Traditionally, the German main clause is described using three topological fields; the initial and middle fields are separated by the finite (auxiliary) verb, and the middle and final fields by non-finite verb parts such as separable prefixes or participles. We will generalize this field structure to verb-initial and verb-final clauses as well, without going into the linguistic motivation, due to space limits.</Paragraph> <Paragraph position="3"> The formula in Fig. 4 states that all finite verbs (word class Vfin ∈ C) define three order domains, of which the first requires exactly one element with the feature initial [1], the second allows an unspecified number of elements with features middle and norel [2], and the third allows at most one element with features final and norel [3]. The features initial, middle, and final ∈ A serve to restrict the placement of certain phrases to specific fields; e.g., no reflexive pronouns can appear in the final field. The norel ∈ A feature controls the placement of a relative NP or PP, which may appear in the initial field only in verb-final clauses. The order types are defined as follows: In a verb-second clause (feature V2), the verb is placed at the beginning (<.) of the middle field (middle), and the element of the initial field cannot be a relative phrase (the norel requirement in [4]). In a verb-final clause (VEnd), the verb is placed at the end (>.) of the middle field, with no restrictions for the initial field (relative clauses and non-relative verb-final clauses are subordinated to the noun and the conjunction, resp.) [5]. In a verb-initial clause (V1), the verb occupies the initial field [6].</Paragraph>
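As a worked illustration of the cardinality restrictions [1]-[3] (a sketch under our own assumptions; the fields are represented simply as lists of occupying phrases, abstracting away from the feature checks), consider:

```python
# Sketch: cardinality constraints on the three topological fields
# defined by a finite verb, per Fig. 4.
def fields_ok(d1, d2, d3):
    """d1: initial field, d2: middle field, d3: final field."""
    if len(d1) != 1:   # [1] initial field: exactly one element
        return False
    # [2] middle field: any number of elements, no check needed
    if len(d3) > 1:    # [3] final field: at most one element
        return False
    return True

# "Den Mann hat der Junge gesehen": order type V2 puts the verb at the
# beginning of the middle field, the object fills the initial field.
print(fields_ok(["den Mann"], ["hat", "der Junge", "gesehen"], []))  # True
print(fields_ok(["den Mann", "der Junge"], ["hat", "gesehen"], []))  # False: d1 overfull
```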
<Paragraph position="4"> The formula in Fig. 5 encodes the hierarchical structure from Fig. 1 and contains lexical restrictions on placement and extraction (the surface form is used to identify the word). Given this, the order type of &quot;hat&quot; is determined as follows: The participle may not be extraposed (¬final in [10], a restriction from the lexical entry of &quot;hat&quot;), so it must follow &quot;hat&quot; in d2. Thus, the verb cannot be of order type VEnd, which would require it to be the last element in its domain (>. in [5]). &quot;Mann&quot; is not adjacent to &quot;gesehen&quot;, but may be extracted across the dependency vpart (${vpart} in [11]), allowing its insertion into a domain defined by &quot;hat&quot;. It cannot precede &quot;hat&quot; in d2, because &quot;hat&quot; must either begin d2 (due to <. in [4]) or itself go into d1. But d1 allows only one phrase (single), leaving only the domain structure from Fig. 2, and thus the order type V2 for &quot;hat&quot;.</Paragraph> </Section> <Section position="7" start_page="178" end_page="178" type="metho"> <SectionTitle> 6 Comparison to PSG Approaches </SectionTitle> <Paragraph position="0"> One feature of word order domains is that they factor ordering alternatives out of the syntactic tree, much like feature annotations do for morphological alternatives. Other lexicalized grammars collapse syntactic and ordering information and are forced to represent ordering alternatives by lexical ambiguity, most notably L-TAG (Schabes et al., 1988) and some versions of CG (Hepple, 1994). This is not necessary in our approach, which drastically reduces the search space for parsing.</Paragraph> <Paragraph position="1"> This property is shared by the proposal of Reape (1993) to associate HPSG signs with sequences of constituents, also called word order domains. Surface ordering is determined by the sequence of constituents associated with the root node. The order domain of a mother node is the sequence union of the order domains of the daughter nodes, which means that the relative order of elements in an order domain is retained, but material from several domains may be interleaved, resulting in discontinuities. Whether an order domain allows interleaving with other domains is a parameter of the constituent. This approach is very similar to ours in that order domains separate word order from the syntactic tree, but there is one important difference: Word order domains in HPSG do not completely free the hierarchical structure from ordering considerations, because discontinuity is specified per phrase, not per modifier. For example, two projections are required for an NP, the lower one for the continuous material (determiner, adjective, noun, genitival and prepositional attributes) and the higher one for the possibly discontinuous relative clause. This dependence of hierarchical structure on ordering is absent from our proposal.</Paragraph> <Paragraph position="2"> We may also compare our approach with the projection architecture of LFG (Kaplan & Bresnan, 1982; Kaplan, 1995). There is a close similarity of the LFG projections (c-structure and f-structure) to the dimensions used here (order domain structure and dependency tree, respectively). C-structure and order domains represent surface ordering, whereas f-structure and dependency tree show the subcategorization or valence requirements. What is more, these projections or dimensions are linked in both accounts by an element-wise mapping. The difference between the two architectures lies in the linkage of the projections or dimensions: LFG maps f-structure off c-structure. In contrast, the dependency relation is taken to be primitive here, and ordering restrictions are taken to be indicators or consequences of dependency relations (see also Bröker (1998a, 1998b)).</Paragraph> </Section> </Paper>