<?xml version="1.0" standalone="yes"?> <Paper uid="P98-1109"> <Title>Know When to Hold 'Em: Shuffling Deterministically in a Parser for Nonconcatenative Grammars*</Title> <Section position="3" start_page="663" end_page="664" type="metho"> <SectionTitle> 2 A German Grammar Fragment </SectionTitle> <Paragraph position="0"> The fragment is based on the analysis of German in Kathol's (1995) dissertation. Kathol's approach is a variant of HPSG that merges insights from both Reape's work and descriptive accounts of German syntax using topological fields (linear position classes). The fragment covers (1) root declarative (verb-second) sentences, (2) polar interrogative (verb-first) clauses and (3) embedded subordinate (verb-final) clauses, as exemplified in Figure 1.</Paragraph> <Paragraph position="1"> The linear order of constituents in a clause is represented by an order domain (DOM), which is a list of domain objects whose relative order must satisfy a set of linear precedence (LP) constraints. The order domain for example (1) is shown in (4). Notice that each domain object contains a TOPO attribute, whose value specifies a topological field that partially determines the object's linear position in the list. Kathol defines five topological fields for German clauses: Vorfeld (vf), Comp/Left Sentence Bracket (cf), Mittelfeld (mf), Verb Cluster/Right Sentence Bracket (vc), and Nachfeld (nf). These fields are ordered according to the LP constraints shown in (5).</Paragraph> <Paragraph position="2"> The hierarchical structure of a sentence, on the other hand, is constrained by a set of immediate dominance (ID) schemata, three of which are included in our fragment: Head-Argument (where &quot;Argument&quot; subsumes complements, subjects, and specifiers), Adjunct-Head, and Marker-Head. The Head-Argument schema is shown below, along with the constraints on the order domain of the mother constituent. 
In all three schemata, the domain of a non-head daughter is compacted into a single domain object, which is shuffled together with the domain of the head daughter to form the domain of the mother.</Paragraph> <Paragraph position="4"> … ∧ order_constraints(…) The hierarchical structure of (1) is shown by the unordered tree of Figure 2, where head daughters appear on the left at each branch. The NP seiner Freundin in the tree is compacted into a single domain object, and must remain so, but its position is not fixed relative to the other arguments of liess (which include the raised arguments of helfen). The shuffle constraint allows this single, compacted domain object to be realized in various permutations with respect to the other arguments, subject to the LP constraints, which are implemented by the order_constraints predicate in (6). Each NP argument may be assigned either vf or mf as its TOPO value, subject to the constraint that root declarative clauses must contain exactly one element in the vf field. In this case, seiner Freundin is assigned vf, while the other NP arguments of liess are in mf. However, the following permutations of (1), in which er and ihn are assigned to the vf field instead, are also grammatical: (7) a. Er liess ihn seiner Freundin helfen.</Paragraph> <Paragraph position="5"> b. Ihn liess er seiner Freundin helfen.</Paragraph> <Paragraph position="6"> Comparing the hierarchical structure in Figure 2 with the linear order domain in (4), we see that some daughters in the hierarchical structure are realized discontinuously in the order domain for the clause (e.g., the verbal complex liess helfen). In such cases, nonconcatenative constraints, such as shuffle, can provide a more succinct analysis than concatenative rules. 
This situation is quite common in languages like German and Japanese, where word order is not totally fixed by grammatical relations.</Paragraph> </Section> <Section position="4" start_page="664" end_page="664" type="metho"> <SectionTitle> 3 Head-Corner Parsing </SectionTitle> <Paragraph position="0"> The grammar described above has a number of properties relevant to the choice of a parsing strategy. First, as in HPSG and other constraint-based grammars, the lexicon is information-rich, and the combinatory or phrase structure rules are highly schematic. We would thus expect a purely top-down algorithm to be inefficient for a grammar of this type, and it may even fail to terminate, for the simple reason that the search space would not be adequately constrained by the highly general combinatory rules.</Paragraph> <Paragraph position="1"> Second, the grammar is essentially nonconcatenative, i.e., constituents of the grammar may appear discontinuously in the string. This suggests that a strict left-to-right or right-to-left approach may be less efficient than a bidirectional or non-directional approach.</Paragraph> <Paragraph position="2"> Lastly, the grammar is head-driven, and we would thus expect the most appropriate parsing algorithm to take advantage of the information that a semantic head provides. For example, a head usually provides information about the remaining daughters that the parser must find, and (since the head daughter in a construction is in many ways similar to its mother category) effective top-down identification of candidate heads should be possible.</Paragraph> <Paragraph position="3"> One type of parser that we believe to be particularly well-suited to this type of grammar is the head-corner parser, introduced by van Noord (1991; 1994) based on one of the parsing strategies explored by Kay (1989). 
The head-corner parser can be thought of as a generalization of a left-corner parser (Rosenkrantz and Lewis II, 1970; Matsumoto et al., 1983; Pereira and Shieber, 1987). 1 The outstanding features of parsers of this type are that they are head-driven, of course, and that they process the string bidirectionally, starting from a lexical head and working outward. The key ingredients of the parsing algorithm are as follows: * Each grammar rule contains a distinguished daughter which is identified as the head of the rule. 2 * The relation head-corner is defined as the reflexive and transitive closure of the head relation.</Paragraph> <Paragraph position="4"> * In order to prove that an input string can be parsed as some (potentially complex) goal category, the parser nondeterministically selects a potential head of the string and proves that this head is the head-corner of the goal.</Paragraph> <Paragraph position="5"> * Parsing proceeds from the head, with a rule being chosen whose head daughter can be instantiated by the selected head word. The other daughters of the rule are parsed recursively in a bidirectional fashion, with the result being a slightly larger head-corner.</Paragraph> <Paragraph position="6"> 1 In fact, a head-corner parser for a grammar in which the head daughter in each rule is the leftmost daughter will function as a left-corner parser.</Paragraph> <Paragraph position="7"> constructed which dominates the entire input string.</Paragraph> </Section> <Section position="5" start_page="664" end_page="664" type="metho"> <SectionTitle> 4 Implementation </SectionTitle> <Paragraph position="0"> We have implemented the German grammar and head-corner parsing algorithm described in §2 and §3 using the ConTroll formalism (Götz and Meurers, 1997). ConTroll is a constraint logic programming system for typed feature structures, which supports a direct implementation of HPSG. 
Several properties of the formalism are crucial for the approach to linearization that we are investigating: it does not require the grammar to have a context-free backbone; it includes definite relations, enabling the definition of nonconcatenative constraints, such as shuffle; and it supports delayed evaluation of constraints.</Paragraph> <Paragraph position="1"> The ability to control when relational constraints are evaluated is especially important in the optimization of shuffle to be discussed next (§5). ConTroll also allows a parsing strategy to be specified within the same formalism as the grammar. 3 Our implementation of the head-corner parser adapts van Noord's (1997) parser to the ConTroll environment.</Paragraph> </Section> <Section position="6" start_page="664" end_page="666" type="metho"> <SectionTitle> 5 Shuffling Deterministically </SectionTitle> <Paragraph position="0"> A standard definition of the shuffle relation is given below as a Prolog predicate.</Paragraph> <Paragraph position="1"> % shuffle (unoptimized version) shuffle([], [], []).</Paragraph> <Paragraph position="2"> shuffle([X|S1], S2, [X|S3]) :- shuffle(S1, S2, S3). shuffle(S1, [X|S2], [X|S3]) :- shuffle(S1, S2, S3). The use of a shuffle constraint reflects the fact that several permutations of constituents may be grammatical. If we parse in a bottom-up fashion, and the order domains of two daughter constituents are combined as the first two arguments of shuffle, multiple solutions will be possible for the mother domain (the third argument of shuffle). For example, in the structure shown earlier in Figure 2, when the domain ([liess],[helfen]) is combined with the compacted domain element ([seiner Freundin]), shuffle will produce three solutions: (8) a. ([liess],[helfen],[seiner Freundin]) b. ([liess],[seiner Freundin],[helfen]) c. 
([seiner Freundin],[liess],[helfen]) This set of possible solutions is further constrained in two ways: it must be consistent with the linear precedence constraints defined by the grammar, and it must yield a sequence of words that is identical to the input sequence that was given to the parser. However, as it stands, the correspondence with the input sequence is only checked after an order domain is proposed for the entire sentence. The order domains of intermediate phrases in the hierarchical structure are not directly constrained by the grammar, since they may involve discontinuous sub-sequences of the input sentence. The shuffle constraint is acting as a generator of possible order domains, which are then filtered first by LP constraints and ultimately by the order of the words in the input sentence. Although each possible order domain that satisfies the LP constraints is a grammatical sequence, it is useless, in the context of parsing, to consider those permutations whose order diverges from that of the input sentence. In order to avoid this very inefficient generate-and-test behavior, we need to provide a way for the input positions covered by each proposed constituent to be considered sooner, so that the only solutions produced by the shuffle constraint will be those that correspond to the order of words in the actual input sequence.</Paragraph> <Paragraph position="3"> 3 An interface from ConTroll to the underlying Prolog environment was also developed to support some optimizations of the parser, such as memoization and the operations over bitstrings described in §5.</Paragraph> <Paragraph position="4"> Since the portion of the input string covered by an order domain may be discontinuous, we cannot just use a pair of endpoints for each constituent as in chart parsers or DCGs. Instead, we adapt a technique described by Reape (1991), and use bitstring codes to represent the portions of the input covered by each element in an order domain. 
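Such codes can be sketched in Python as follows. This is only an illustration under stated assumptions, not the ConTroll implementation: a code is stored as a string with one character per input word ('1' if the word is covered, '0' otherwise), positions are counted from 0, the helper names make_code and is_discontinuous are hypothetical, and the word order used for example (1) is inferred from the surrounding discussion.

```python
def make_code(n, positions):
    """Return a bitstring of length n with '1' at each covered word position.

    Positions are 0-based; that indexing is an assumption of this sketch.
    """
    covered = set(positions)
    return "".join("1" if i in covered else "0" for i in range(n))


def is_discontinuous(code):
    """Return True if the covered words do not form one contiguous block."""
    # Stripping the uncovered margins leaves the span from the first to the
    # last covered word; any '0' remaining inside that span marks a gap.
    return "0" in code.strip("0")


# Assumed word order for example (1): seiner Freundin liess er ihn helfen
words = "seiner Freundin liess er ihn helfen".split()
np_code = make_code(len(words), [0, 1])    # seiner Freundin
verb_code = make_code(len(words), [2, 5])  # liess ... helfen
```

Here the code for the NP covers one contiguous block, whereas the code for the verbal complex has a gap, which is exactly the case that a pair of endpoints cannot represent.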
If the input string contains n words, the code value for each constituent will be a bitstring of length n. If element i of the bitstring is 1, the constituent contains the ith word of the sentence, and if element i of the bitstring is 0, the constituent does not contain the ith word. Reape uses bitstring codes for a tabular parsing algorithm, different from the head-corner algorithm used here, and attributes the original idea to Johnson (1985).</Paragraph> <Paragraph position="5"> The optimized version of the shuffle relation is defined below, using a notation in which the arguments are descriptions of typed feature structures. The actual implementation of relations in the ConTroll formalism uses a slightly different notation, but we use a more familiar Prolog-style notation here. 4 % shuffle (optimized version) shuffle([], [], []).</Paragraph> <Paragraph position="6"> shuffle((S1&amp;ne_list), [], S1).</Paragraph> <Paragraph position="7"> shuffle([], (S2&amp;ne_list), S2).</Paragraph> <Paragraph position="9"> may_precede_all(H2, S1), shuffle(S1, S2, S3).</Paragraph> <Paragraph position="10"> This revision of the shuffle relation uses two auxiliary relations, code_prec and shuffle_d. The code_prec relation compares two bitstrings and yields a boolean value indicating whether the first string precedes the second (the details of the implementation are suppressed). The result of a comparison between the codes of the first element of each domain is used to determine which element must appear first in the resulting domain. This is implemented by using the boolean result of the code comparison to select a unique disjunct of the shuffle_d relation. The shuffle_d relation also incorporates an optimization in the checking of LP constraints. As each element is shuffled into the result, it only needs to be checked for LP acceptability with the elements of the other argument list, because the LP constraints have already been satisfied on each of the argument domains. 
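The deterministic behavior described here can be sketched in Python. The sketch is an approximation under explicit assumptions rather than the authors' ConTroll code: codes are strings of '0' and '1' that are already fully known, code_prec is read as "the first covered word of one element lies to the left of the first covered word of the other", the field order vf, cf, mf, vc, nf stands in for the LP constraints in (5), and the Elem record and the word order of example (1) are hypothetical reconstructions.

```python
from typing import NamedTuple

FIELDS = ["vf", "cf", "mf", "vc", "nf"]  # assumed LP order of the fields


class Elem(NamedTuple):
    phon: str
    topo: str
    code: str  # bitstring, one character per input word


def code_prec(a, b):
    """Assumed reading of code_prec: a's first word is left of b's."""
    return b.code.index("1") > a.code.index("1")


def may_precede_all(head, others):
    """LP check of head against the elements of the other domain only."""
    rank = FIELDS.index(head.topo)
    return all(FIELDS.index(o.topo) >= rank for o in others)


def shuffle(s1, s2):
    """Return the single merged domain, or None if an LP check fails."""
    if not s1 or not s2:
        return list(s1) + list(s2)
    # The code comparison selects exactly one branch, so no alternative
    # orderings are ever generated.
    if code_prec(s1[0], s2[0]):
        head, rest1, rest2, other = s1[0], s1[1:], s2, s2
    else:
        head, rest1, rest2, other = s2[0], s1, s2[1:], s1
    if not may_precede_all(head, other):
        return None
    tail = shuffle(rest1, rest2)
    return None if tail is None else [head] + tail


# Assumed input order: seiner Freundin liess er ihn helfen
head_dom = [Elem("liess", "cf", "001000"), Elem("helfen", "vc", "000001")]
np_dom = [Elem("seiner Freundin", "vf", "110000")]
result = shuffle(head_dom, np_dom)
print([e.phon for e in result])  # ['seiner Freundin', 'liess', 'helfen']
```

With these inputs only the ordering corresponding to (8c) is produced; the orderings in (8a) and (8b) are never constructed.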
Therefore, LP acceptability no longer needs to be checked for the entire order domain of each phrase, and the call to order_constraints can be eliminated from each of the phrasal schemata.</Paragraph> <Paragraph position="11"> In order to achieve the desired effect of making shuffle constraints deterministic, we must delay their evaluation until the code attributes of the first element of each argument domain have been instantiated to a specific string. Using the analogy of a card game, we must hold the cards (delay shuffling) until we know what their values are (the codes must be instantiated). The delayed evaluation is enforced by the following declarations in the ConTroll system, where argn:@type specifies that evaluation should be delayed until the value of the nth argument of the relation has a value more specific than type: delay(code_prec, (arg1:@string &amp; arg2:@string)).</Paragraph> <Paragraph position="12"> delay(shuffle_d, arg1:@bool).</Paragraph> <Paragraph position="13"> With the addition of CODE values to each domain element, the input to the shuffle constraint in our previous example is shown below, and the unique solution for MDom is the one corresponding to (8c).</Paragraph> </Section> </Paper>