<?xml version="1.0" standalone="yes"?>
<Paper uid="P97-1062">
  <Title>Learning Parse and Translation Decisions</Title>
  <Section position="8" start_page="487" end_page="488" type="relat">
    <SectionTitle>
7 Related Work
</SectionTitle>
    <Paragraph position="0"> Our basic parsing and interactive training paradigm is based on (Simmons and Yu, 1992). We have extended their work by significantly increasing the expressiveness of the parse action and feature languages, in particular by moving far beyond their few simple, syntax-only features, by adding more background knowledge, and by introducing a sophisticated machine learning component.</Paragraph>
    <Paragraph position="1"> (Magerman, 1995) uses a decision tree model similar to ours, training his system SPATTER with parse action sequences for 40,000 Wall Street Journal sentences derived from the Penn Treebank (Marcus et al., 1993). Questioning the traditional n-grams, Magerman already advocates a heavier reliance on contextual information. Going beyond Magerman's still relatively rigid set of 36 features, we propose a yet richer, essentially unlimited feature language.</Paragraph>
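The idea shared by both systems, a classifier that maps features of the current parse state to the next parse action, can be illustrated with a small, hypothetical sketch. The feature names, the actions, and the use of scikit-learn below are illustrative assumptions, not details taken from either paper.

```python
# Hypothetical sketch: learn a decision tree that maps parse-state features
# to the next parse action.  Features and actions are invented examples,
# standing in for a much richer feature language.
from sklearn.feature_extraction import DictVectorizer
from sklearn.tree import DecisionTreeClassifier

# Each training example pairs a feature snapshot of the parser state with
# the parse action chosen by the training supervisor in that state.
training_states = [
    {"top_cat": "DET",  "next_cat": "NOUN", "top_sem": "-"},
    {"top_cat": "NOUN", "next_cat": "VERB", "top_sem": "thing"},
    {"top_cat": "NP",   "next_cat": "VERB", "top_sem": "thing"},
]
training_actions = ["shift", "reduce_NP", "shift"]

vectorizer = DictVectorizer(sparse=False)
X = vectorizer.fit_transform(training_states)

tree = DecisionTreeClassifier()
tree.fit(X, training_actions)

# Predict the action for a new parse state.
state = {"top_cat": "DET", "next_cat": "NOUN", "top_sem": "-"}
print(tree.predict(vectorizer.transform([state]))[0])
```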
    <Paragraph position="2"> Our parse action sequences are too complex to be derived from a treebank like Penn's. Not only do our parse trees contain semantic annotations, roles, and more syntactic detail; we also rely on the more informative parse action sequence. While this necessitates the involvement of a parsing supervisor for training, we are able to perform deterministic parsing and already obtain very good test results with only 256 training sentences.</Paragraph>
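Deterministic parsing here means that the single action proposed by the learned classifier is applied at every step, with no search and no backtracking. The following minimal shift-reduce loop is a sketch of that control structure only; the action names and the choose_action callback are assumptions, not the paper's actual operators.

```python
# Hypothetical sketch of deterministic, classifier-driven shift-reduce parsing.
def deterministic_parse(tokens, choose_action):
    stack, buffer = [], list(tokens)
    while buffer or len(stack) > 1:
        # One action per step, e.g. from the decision tree sketched above.
        action = choose_action(stack, buffer)
        if action == "shift" and buffer:
            stack.append(buffer.pop(0))
        elif action.startswith("reduce_") and stack:
            label = action[len("reduce_"):]          # e.g. "NP"
            right = stack.pop()
            # Attach the popped constituent(s) under a new labelled node.
            node = (label, [stack.pop(), right]) if stack else (label, [right])
            stack.append(node)
        else:
            break  # no applicable action: stop rather than loop forever
    return stack
```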
    <Paragraph position="3"> (Collins, 1996) focuses on bigram lexical dependencies (BLD). Trained on the same 40,000 sentences as SPATTER, Collins' parser relies on a much more limited type of context than our system and needs little background knowledge.</Paragraph>
    <Paragraph position="4"> [Table 6 caption; the table itself is not recoverable from the extracted text:] Comparison of parsing results of our system with Magerman's SPATTER and Collins' BLD; results for SPATTER and BLD are for sentences of up to 40 words.</Paragraph>
    <Paragraph position="5"> Table 6 compares our results with SPATTER and BLD. The results have to be interpreted cautiously, since they are not based on exactly the same sentences or the same detail of bracketing. Due to lexical restrictions, our average sentence length (17.1 words) is below that used for SPATTER and BLD (22.3 words), but some of our test sentences have more than 40 words; and while the Penn Treebank leaves many phrases such as "the New York Stock Exchange" without internal structure, our system performs a complete bracketing, thereby increasing the risk of crossing brackets.</Paragraph>
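The crossing-brackets risk mentioned above can be made concrete: a predicted constituent crosses a gold constituent when the two spans overlap without either containing the other. The sketch below counts such crossings over toy spans invented for illustration, not data from Table 6.

```python
# Hedged sketch of the crossing-brackets count.  Constituents are (start, end)
# word-index spans; a predicted span "crosses" a gold span when they overlap
# without one containing the other.
def crossing_brackets(predicted, gold):
    def crosses(a, b):
        (i, j), (k, l) = a, b
        return (i < k < j < l) or (k < i < l < j)
    return sum(any(crosses(p, g) for g in gold) for p in predicted)

# A fully bracketed phrase adds internal spans that may cross gold brackets;
# a flat, treebank-style phrase cannot.  Toy example:
predicted = [(0, 5), (1, 3), (3, 5)]
gold = [(0, 5), (2, 4)]
print(crossing_brackets(predicted, gold))  # -> 2
```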
  </Section>
</Paper>