<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-1604">
  <Title>Detecting Parser Errors Using Web-based Semantic Filters</Title>
  <Section position="3" start_page="0" end_page="27" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Semantic processing of text in applications such as question answering or information extraction frequently relies on statistical parsers. Unfortunately, the efficacy of state-of-the-art parsers can be disappointingly low. For example, we found that the Collins parser correctly parsed just 42% of the list and factoid questions from TREC 2004 (that is, 42% of the parses had 100% precision and 100% recall on labeled constituents). Similarly, this parser produced 45% correct parses on a sub-set of 100 sentences from section 23 of the Penn Treebank.</Paragraph>
    <Paragraph position="1"> Although statistical parsers continue to improve their efficacy over time, progress is slow, particularly for Web applications where training the parsers on a &amp;quot;representative&amp;quot; corpus of hand-tagged sentences is not an option. Because of the heterogeneous nature of text on the Web, such a corpus would be exceedingly difficult to generate.</Paragraph>
    <Paragraph position="2"> In response, this paper investigates the possibility of detecting parser errors by using semantic information obtained from the Web. Our fundamental hypothesis is that incorrect parses often result in wildly implausible semantic interpretations of sentences, which can be detected automatically in certain circumstances. Consider, for example, the following sentence from the Wall Street Journal: &amp;quot;That compares with per-share earnings from continuing operations of 69 cents.&amp;quot; The Collins parser yields a parse that attaches &amp;quot;of 69 cents&amp;quot; to &amp;quot;operations,&amp;quot; rather than &amp;quot;earnings.&amp;quot; By computing the mutual information between &amp;quot;operations&amp;quot; and &amp;quot;cents&amp;quot; on the Web, we can detect that this attachment is unlikely to be correct.</Paragraph>
    <Paragraph position="3"> Our WOODWARD system detects parser errors as follows. First, it maps the tree produced by a parser to a relational conjunction (RC), a logic-based representation language that we describe in Section 2.1. Second, WOODWARD employs four distinct methods for analyzing whether a conjunct in the RC is likely to be &amp;quot;reasonable&amp;quot; as described in Section 2.</Paragraph>
    <Paragraph position="4"> Our approach makes several assumptions. First, if the sentence is absurd to begin with, then a correct parse could be deemed incorrect. Second, we require a corpus whose content overlaps at least in part with the content of the sentences to be parsed.</Paragraph>
    <Paragraph position="5"> Otherwise, much of our semantic analysis is impossible. null In applications such as Web-based question answering, these assumptions are quite natural. The  questions are about topics that are covered extensively on the Web, and we can assume that most questions link verbs to nouns in reasonable combinations. Likewise, when using parsing for information extraction, we would expect our assumptions to hold as well.</Paragraph>
    <Paragraph position="6"> Our contributions are as follows:  parses from bad on TREC 2004 questions for a reduction of 67% in error rate. On a harder set of sentences from the Penn Treebank, the reduction in error rate is 20%.</Paragraph>
    <Paragraph position="7"> The remainder of this paper is organized as follows. We give an overview of related work in Section 1.1. Section 2 describes semantic filtering, including our RC representation and the four Web-based filters that constitute the WOODWARD system. Section 3 presents our experiments and results, and section 4 concludes and gives ideas for future work.</Paragraph>
    <Section position="1" start_page="27" end_page="27" type="sub_section">
      <SectionTitle>
1.1 Related Work
</SectionTitle>
      <Paragraph position="0"> The problem of detecting parse errors is most similar to the idea of parse reranking. Collins (2000) describes statistical techniques for reranking alternative parses for a sentence. Implicitly, a reranking method detects parser errors, in that if the reranking method picks a new parse over the original one, it is classifying the original one as less likely to be correct. Collins uses syntactic and lexical features and trains on the Penn Treebank; in contrast, WOODWARD uses semantic features derived from the web. See section 3 for a comparison of our results with Collins'.</Paragraph>
      <Paragraph position="1"> Several systems produce a semantic interpretation of a sentence on top of a parser. For example, Bos et al. (2004) build semantic representations from the parse derivations of a CCG parser, and the English Resource Grammar (ERG) (Toutanova et al., 2005) provides a semantic representation using minimal recursion semantics. Toutanova et al.</Paragraph>
      <Paragraph position="2"> also include semantic features in their parse selection mechanism, although it is mostly syntaxdriven. The ERG is a hand-built grammar and thus does not have the same coverage as the grammar we use. We also use the semantic interpretations in a novel way, checking them against semantic information on the Web to decide if they are plausible. null NLP literature is replete with examples of systems that produce semantic interpretations and use semantics to improve understanding. Several systems in the 1970s and 1980s used hand-built augmented transition networks or semantic networks to prune bad semantic interpretations.</Paragraph>
      <Paragraph position="3"> More recently, people have tried incorporating large lexical and semantic resources like WordNet, FrameNet, and PropBank into the disambiguation process. Allen (1995) provides an overview of some of this work and contains many references.</Paragraph>
      <Paragraph position="4"> Our work focuses on using statistical techniques over large corpora, reducing the need for hand-built resources and making the system more robust to changes in domain.</Paragraph>
      <Paragraph position="5"> Numerous systems, including Question-Answering systems like MULDER (Kwok et al., 2001), PiQASso (Attardi et al., 2001), and Moldovan et al.'s QA system (2003), use parsing technology as a key component in their analysis of sentences. In part to overcome incorrect parses, Moldovan et al.'s QA system requires a complex set of relaxation techniques. These systems would greatly benefit from knowing when parses are correct or incorrect. Our system is the first to suggest using the output of a QA system to classify the input parse as good or bad.</Paragraph>
      <Paragraph position="6"> Several researchers have used pointwise mutual information (PMI) over the Web to help make syntactic and semantic judgments in NLP tasks.</Paragraph>
      <Paragraph position="7"> Volk (2001) uses PMI to resolve preposition attachments in German. Lapata and Keller (2005) use web counts to resolve preposition attachments, compound noun interpretation, and noun countability detection, among other things. And Markert et al. (2003) use PMI to resolve certain types of anaphora. We use PMI as just one of several techniques for acquiring information from the Web.</Paragraph>
    </Section>
  </Section>
</Paper>