<?xml version="1.0" standalone="yes"?>
<Paper uid="C00-2151">
  <Title>An Experiment On Incremental Analysis Using Robust Parsing Techniques</Title>
  <Section position="4" start_page="1026" end_page="1026" type="metho">
    <SectionTitle>
2 Robust Parsing in a Dependency Framework
</SectionTitle>
    <Paragraph position="0"> Our grammar models utterances as dependency trees, which consist of pairs of words such that one depends directly on the other. This subordination relation can be qualified by a label (e. g. to distinguish complements from modifiers). Since each word can only depend on one other word, a labeled tree is formed, usually with the finite verb as its root. The decision on which structure to postulate for an utterance is guided by explicit constraints, which are represented as universally quantified logical formulas about features of word forms and partial trees. For instance, one constraint might postulate that a relation labeled as 'Subject' can only occur between a noun and a finite verb to its right, or that two different dependencies of the same verb may not both be labeled 'Subject'. For efficiency reasons, these formulas may constrain individual dependency edges or pairs of edges only. The application of constraints can begin as soon as the first word of an utterance is read; no global information about the utterance is required for analysis of its beginning.</Paragraph>
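As a minimal illustration (the class and function names below are ours, not the grammar's), the two example constraints just mentioned can be sketched as predicates over a single edge or a pair of edges:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Word:
    form: str
    pos: str    # e.g. "NOUN" or "VFIN" (finite verb)
    index: int  # position in the utterance

@dataclass(frozen=True)
class Edge:
    dependent: Word
    governor: Word
    label: str  # e.g. "Subject"

def subject_order(edge):
    """A 'Subject' edge may only link a noun to a finite verb to its right."""
    if edge.label != "Subject":
        return True  # constraint does not apply
    return (edge.dependent.pos == "NOUN"
            and edge.governor.pos == "VFIN"
            and edge.dependent.index < edge.governor.index)

def subject_unique(e1, e2):
    """Two different dependencies of the same verb may not both be 'Subject'."""
    if e1 is not e2 and e1.governor is e2.governor:
        return not (e1.label == "Subject" and e2.label == "Subject")
    return True
```

Because every formula mentions at most one or two edges, checking a structure stays at worst quadratic in the number of edges, which is what makes word-by-word application of the constraints feasible.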
    <Paragraph position="1"> Since natural language input will often exhibit irregularities such as restarts, repairs, hesitations and other grammatical errors, individual errors should not make further analysis impossible. Instead, a robust parser should continue to build a structure for the utterance. Ideally, this structure should be close to that of a similar, but grammatical utterance.</Paragraph>
    <Paragraph position="2"> This goal is attained by annotating the constraints that constitute the grammar with scores ranging from 0 to 1. A structure that violates one or more constraints is annotated with the product of the corresponding scores, and the structure with the highest combined score is defined as the solution of the parsing problem. In general, the higher the scores of the constraints, the more irregular constructions can be analysed. Parsing an utterance with annotated or soft constraints thus amounts to multi-dimensional optimization. Both complete and heuristic search methods can be employed to solve such a problem.</Paragraph>
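A sketch of the resulting optimization, under the simplifying assumption (ours, not the paper's) that a candidate structure is a list of edges and a constraint is a (predicate, score) pair:

```python
def structure_score(edges, unary_constraints, binary_constraints):
    """Product of the scores of all violated constraints (1.0 if none)."""
    score = 1.0
    for e in edges:
        for check, penalty in unary_constraints:
            if not check(e):
                score *= penalty
    for i, e1 in enumerate(edges):
        for e2 in edges[i + 1:]:
            for check, penalty in binary_constraints:
                if not check(e1, e2):
                    score *= penalty
    return score

def best_structure(candidates, unary_constraints, binary_constraints):
    """Complete search: evaluate every candidate structure."""
    return max(candidates,
               key=lambda edges: structure_score(edges, unary_constraints,
                                                 binary_constraints))
```

A complete search enumerates all candidates as above; a heuristic method would instead explore only a promising subset of the same search space.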
    <Paragraph position="3"> Our robust approach also provides an easy way to implement partial parsing. If necessary, e. g., an isolated noun labeled as 'Subject' may form the root of a dependency tree, although this would violate the first constraint mentioned above. If a finite verb is available, however, subordinating the noun under the verb will avoid the error and thus produce a better structure. This capability is crucial for the analysis of incomplete utterances.</Paragraph>
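This trade-off can be made concrete with a toy root-selection rule (the penalty value and helper name are illustrative assumptions): an isolated noun is accepted as a root at some cost, but a finite verb, once present, yields the better-scoring structure:

```python
# Soft constraint: the root of a tree should be a finite verb. Violating
# it is allowed but multiplies the score by a penalty (value assumed here).
ROOT_NOUN_PENALTY = 0.2

def best_root(words):
    """Pick a root from (form, pos) pairs; prefer a finite verb,
    fall back to a penalized noun root for incomplete input."""
    for form, pos in words:
        if pos == "VFIN":
            return form, 1.0          # no constraint violated
    form, pos = words[0]
    return form, ROOT_NOUN_PENALTY    # partial parse, penalized
```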
    <Paragraph position="4"> Different levels of analysis can be defined to model syntactic as well as semantic structures. A dependency tree is constructed for each of these levels.</Paragraph>
    <Paragraph position="5"> Since constraints can relate the edges in parallel dependency trees to each other, having several trees contributes to the robustness of the approach. Altogether, the grammar used in the experiments described comprises 12 levels of analysis and 490 constraints (Schröder et al., 2000).</Paragraph>
  </Section>
  <Section position="5" start_page="1026" end_page="1027" type="metho">
    <SectionTitle>
3 Prefix Parsing with Weighted
Constraints
</SectionTitle>
    <Paragraph position="0"> In general, dependency analysis is well-suited for incremental analysis. Since subordinations always concern two words rather than full constituents, each word can be integrated into the analysis as soon as it is read, although not necessarily in the optimal way.</Paragraph>
    <Paragraph position="1"> Also, the pre-computed dependency links can easily be re-used in incremental analysis (Lombardo, 1992).</Paragraph>
    <Paragraph position="4"> [Figure 1: dependency analysis of the prefix 'dann lassen sie uns doch' (gloss: 'then let you us &lt;part&gt;'), from the utterance 'Let's appoint yet another meeting then.']</Paragraph>
    <Paragraph position="6"> When assigning a dependency structure to incomplete utterances, the problem arises of how to analyse words whose governors or complements still lie beyond the time horizon. Two distinct alternatives are conceivable:  1. A dependency edge can be established between the word and a special node representing a putative word that is assumed to follow in the remaining input. This explicitly models the expectations that would be raised by the prefix.</Paragraph>
    <Paragraph position="7"> However, unifying new words with these under-specified nodes is difficult, particularly when multiple words have been conjectured. Also, many constraints cannot be meaningfully applied to words with unknown features.</Paragraph>
    <Paragraph position="8"> 2. An incomplete prefix can be analyzed directly if a grammar is robust enough to allow partial parsing as discussed in the previous section: If the constraint that forbids multiple trees receives a severe but non-zero penalty, missing governors or complements are acceptable as long as no better structure is possible. Experiments in prefix parsing using a dependency grammar of German have shown that even complex utterances with nested subclauses can be analysed in the second way. Figure 1a provides an example of this: Because the infinitive verb 'ausmachen' is not yet visible, its complement 'Termin' is analysed as an isolated subtree, and the main verb 'lassen' is lacking a complement. After the missing verb has been read, two additional dependency edges suffice to build the correct structure from the partial parse. This method allows direct comparison between incremental and non-incremental parser runs, since both methods use the same grammar. Therefore, we will follow up on the second alternative only and construct extended structures guided by the structures of prefixes, without explicitly modeling missing words.</Paragraph>
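The 'severe but non-zero' penalty can be sketched as follows (the concrete value is an assumption of ours): each tree beyond the first multiplies the score of an analysis by a small factor, so fragmented prefix analyses survive exactly until a connected structure becomes possible:

```python
# Assumed score of the soft constraint forbidding multiple trees.
FRAGMENT_PENALTY = 0.05

def fragmentation_factor(num_roots):
    """Score factor for an analysis consisting of `num_roots` trees."""
    return FRAGMENT_PENALTY ** max(0, num_roots - 1)

def analysis_score(constraint_score, num_roots):
    """Combine the ordinary constraint scores with the fragmentation penalty."""
    return constraint_score * fragmentation_factor(num_roots)
```

Under this scheme, a two-tree prefix analysis (with 'Termin' isolated) can still be optimal while the governing verb is unseen, but any otherwise comparable connected structure outscores it as soon as one exists.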
  </Section>
  <Section position="6" start_page="1027" end_page="1027" type="metho">
    <SectionTitle>
4 Re-Use of Partial Results
</SectionTitle>
    <Paragraph position="0"> While a prefix analysis can produce partial parses and diagnoses, so far this information has not been used in subsequent iterations. In fact, after a new word has been read, another search is conducted on all words already available. To reduce this duplication of work, we wish to narrow down the problem space for these words. Therefore, at each iteration, the set of hypotheses has to be updated: * By deciding which old dependency hypotheses should be kept.</Paragraph>
    <Paragraph position="1"> * By deciding which new dependency hypotheses should be added to the search space in order to accommodate the incoming word.</Paragraph>
    <Paragraph position="2"> For that purpose, several heuristics have been devised, based on the following principles: Prediction strength. Restrict the search space as much as possible, while maintaining correctness. Economy. Keep as much of the previous structure as possible.</Paragraph>
    <Paragraph position="3"> Rightmost attachment. Attach the incoming word to the most recent words.</Paragraph>
    <Paragraph position="4"> The heuristics are presented here in increasing order of the size of the problem space they produce: A. Keep all dependency edges from the previous optimal solution. Add all dependency edges where the incoming word modifies, or is modified by, another word.</Paragraph>
    <Paragraph position="5"> B. As A, but also keep all links that differ from the previous optimal solution only in their lexical readings.</Paragraph>
    <Paragraph position="6"> C. As B, but also keep all links that differ from the previous optimal solution only in the subordination of its last word.</Paragraph>
    <Paragraph position="7"> D. As C, but also keep all links that differ from the previous optimal solution only in the subordination of all the words lying on the path from the last word to the root of the solution tree. E. As D, but for all trees in the previous solution.</Paragraph>
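Heuristics A and C might be sketched like this (the edge representation and helper names are ours; B, D and E extend the same pattern to lexical readings and to the path towards the root):

```python
from collections import namedtuple

Edge = namedtuple("Edge", "dependent governor label")

def hypotheses_A(prev_solution, all_edges, new_word):
    """A: keep the previous optimal edges and add every edge that
    attaches the incoming word in either direction."""
    kept = set(prev_solution)
    kept |= {e for e in all_edges
             if e.dependent == new_word or e.governor == new_word}
    return kept

def hypotheses_C(prev_solution, all_edges, new_word, last_word):
    """C: as A, but also re-open the subordination of the previous
    last word, keeping every edge that attaches it as a dependent."""
    kept = hypotheses_A(prev_solution, all_edges, new_word)
    kept |= {e for e in all_edges if e.dependent == last_word}
    return kept
```

The ordering A to E then corresponds directly to monotonically growing hypothesis sets, i.e. to the increasing problem-space sizes described above.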
  </Section>
</Paper>