<?xml version="1.0" standalone="yes"?>
<Paper uid="P92-1006">
  <Title>Parsing*</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1. Introduction
</SectionTitle>
    <Paragraph position="0"> This paper addresses the question: Why should we use probabilistic models in natural language understanding? There are many answers to this question, only a few of which are regularly addressed in the literature.</Paragraph>
    <Paragraph position="1"> The first and most common answer concerns ambiguity resolution. A probabilistic model provides a clearly defined preference rule for selecting among grammatical alternatives (i.e. the highest probability interpretation is selected). However, this use of probabilistic models assumes that we already have efficient methods for generating the alternatives in the first place. While we have O(n³) algorithms for determining the grammaticality of a sentence, parsing, as a component of a natural language understanding tool, involves more than simply determining all of the grammatical interpretations of an input. In order for a natural language system to process input efficiently and robustly, it must process all intelligible sentences, grammatical or not, while not significantly reducing the system's efficiency.</Paragraph>
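The preference rule described above can be sketched in a few lines: given several grammatical interpretations of one sentence, simply return the one the model assigns the highest probability. The parse strings and probability values below are hypothetical illustrations, not the paper's model.

```python
# A minimal sketch of probabilistic ambiguity resolution: the preference
# rule selects the interpretation with the highest model probability.
# Parses and probabilities are made-up examples.

def resolve_ambiguity(interpretations):
    """interpretations: list of (parse, probability) pairs."""
    return max(interpretations, key=lambda pair: pair[1])

# Two readings of a PP-attachment ambiguity, with hypothetical scores.
candidates = [
    ("(S (NP I) (VP (V saw) (NP the man) (PP with a telescope)))", 0.004),
    ("(S (NP I) (VP (V saw) (NP the man (PP with a telescope))))", 0.007),
]

best_parse, best_prob = resolve_ambiguity(candidates)
```

Note that this only ranks alternatives that have already been generated; as the paragraph above observes, the harder question is generating (and limiting) those alternatives efficiently in the first place.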
    <Paragraph position="2"> This observation suggests two other answers to the central question of this paper. Probabilistic models offer a convenient scoring method for partial interpretations in a well-formed substring table. High probability constituents in the parser's chart can be used to interpret ungrammatical sentences. Probabilistic models can also be used for efficiency by providing a best-first search heuristic to order the parsing agenda. *Special thanks to Jerry Hobbs and Bob Moore at SRI for providing access to their computers, and to Salim Roukos, Peter Brown, and Vincent and Steven Della Pietra at IBM for their instructive lessons on probabilistic modelling of natural language.</Paragraph>
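A best-first parsing agenda of the kind mentioned above is naturally realized as a priority queue keyed on constituent probability, so the most likely partial interpretation is always extended first. This is a generic sketch of that idea, not the paper's implementation; the edge labels and scores are invented.

```python
import heapq

# A sketch of a probability-ordered parsing agenda: edges are popped in
# decreasing order of probability (best-first), rather than FIFO/LIFO.

class Agenda:
    def __init__(self):
        self._heap = []

    def push(self, edge, prob):
        # heapq is a min-heap, so negate the probability for max-first order.
        heapq.heappush(self._heap, (-prob, edge))

    def pop(self):
        neg_prob, edge = heapq.heappop(self._heap)
        return edge, -neg_prob

    def __len__(self):
        return len(self._heap)

# Hypothetical chart edges "Category[start,end]" with model scores.
agenda = Agenda()
agenda.push("NP[0,2]", 0.30)
agenda.push("VP[2,5]", 0.65)
agenda.push("PP[3,5]", 0.10)

edge, prob = agenda.pop()  # most likely edge comes off first
```

The same scores double as the robustness mechanism the paragraph describes: if no complete grammatical parse is found, the highest-probability constituents remaining in the chart can still support an interpretation.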
    <Paragraph position="3"> This paper proposes an agenda-based probabilistic chart parsing algorithm which is both robust and efficient. The algorithm, Picky¹, is considered robust because it will potentially generate all constituents produced by a pure bottom-up parser and rank these constituents by likelihood. The efficiency of the algorithm is achieved through a technique called probabilistic prediction, which helps the algorithm avoid worst-case behavior. Probabilistic prediction is a trainable technique for modelling where edges are likely to occur in the chart-parsing process.² Once the predicted edges are added to the chart using probabilistic prediction, they are processed in a style similar to agenda-based chart parsing algorithms. By limiting the edges in the chart to those which are predicted by this model, the parser can process a sentence while generating only the most likely constituents given the input.</Paragraph>
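The gating step described above, admitting only edges a trained model predicts as likely, can be sketched as a threshold filter over a learned prediction table. The table, trigger words, and threshold below are hypothetical stand-ins for Picky's trained model, shown only to illustrate the control flow.

```python
# A hedged sketch of probabilistic prediction: a trained model scores each
# candidate edge, and only edges above a threshold enter the chart before
# ordinary agenda-based processing. The table here is invented, standing in
# for probabilities estimated from training data.

PREDICTION_TABLE = {
    # (trigger word, predicted category) -> estimated probability
    ("the", "NP"): 0.90,
    ("the", "VP"): 0.05,
    ("runs", "VP"): 0.80,
}

def predict_edges(words, threshold=0.1):
    """Return (category, start_position, probability) edges worth pursuing."""
    chart = []
    for i, word in enumerate(words):
        for (trigger, category), prob in PREDICTION_TABLE.items():
            if trigger == word and prob >= threshold:
                chart.append((category, i, prob))
    return chart

edges = predict_edges(["the", "dog", "runs"])
```

Pruning at prediction time is what bounds the work: low-probability edges are never added, so the subsequent agenda-based processing touches only the most likely constituents for the given input.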
    <Paragraph position="4"> In this paper, we will present the Picky parsing algorithm, describing both the original features of the parser and those adapted from previous work. Then, we will compare the implementation of Picky with existing probabilistic and non-probabilistic parsers. Finally, we will report the results of experiments exploring how Picky's algorithm copes with the tradeoffs of efficiency, robustness, and accuracy.³</Paragraph>
  </Section>
</Paper>