<?xml version="1.0" standalone="yes"?>
<Paper uid="H92-1025">
  <Title>Probabilistic Prediction and Picky Chart Parsing*</Title>
  <Section position="5" start_page="131" end_page="132" type="evalu">
    <SectionTitle>
4. Results of Experiments
</SectionTitle>
    <Paragraph position="0"> The Picky parser was tested on 3 sets of 100 sentences which were held out from the rest of the corpus during training. The training corpus consisted of 982 sentences which were parsed using the same grammar that Picky used. The training and test corpora are samples from MIT's Voyager direction-finding system. Our experiments explored the accuracy, efficiency, and robustness of the Picky algorithm.</Paragraph>
    <Paragraph position="1"> However, we do not anticipate a significant improvement in accuracy, since the two parsers use similar language models. On the other hand, Picky should outperform Pearl in terms of robustness and efficiency.</Paragraph>
    <Section position="1" start_page="131" end_page="132" type="sub_section">
      <SectionTitle>
4.1. Robustness
</SectionTitle>
      <Paragraph position="0"> Since our test sets did not contain many ungrammatical sentences, it was difficult to analyze Picky's robustness.</Paragraph>
      <Paragraph position="1"> It is undeniable that Picky will produce a fuller chart than will Pearl, making partial parsing of ungrammatical sentences possible. We leave it to future experiments to explore empirically the effectiveness of semantic interpretation using Picky's probabilistic well-formed substring table.</Paragraph>
      <Paragraph position="2"> One interesting example did occur in one test set. The sentence "How do I how do I get to MIT?" is an ungrammatical but interpretable sentence which begins with a restart. Pearl would have generated no analysis for the latter part of the sentence, and the corresponding sections of the chart would be empty. Using bidirectional probabilistic prediction, Picky produced a correct partial interpretation of the last 6 words of the sentence, "how do I get to MIT?" One sentence does not make for conclusive evidence, but it represents the type of improvement that bidirectional prediction makes possible.</Paragraph>
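The partial interpretation of a restart can be sketched as a lookup over the chart's complete edges. This is a minimal illustration, not the paper's implementation: the chart API (edges keyed by span and label) and the goal label "S" are assumptions.

```python
# Sketch of recovering a partial interpretation from a chart, assuming
# a hypothetical chart representation: a set of complete edges keyed by
# (start, end, label). A probabilistic well-formed-substring table lets
# a system interpret the grammatical suffix of a restart such as
# "How do I how do I get to MIT?".

def best_partial_parse(chart_edges, n_words, goal_label="S"):
    """Return the longest sentence-final span covered by a complete
    goal-label edge, or None if no such edge exists."""
    for start in range(n_words):  # earliest start = longest suffix
        if (start, n_words, goal_label) in chart_edges:
            return (start, n_words)
    return None

# Toy chart for a 9-word input whose last 6 words form a sentence,
# mirroring the restart example in the text.
edges = {(3, 9, "S"), (3, 5, "NP")}
print(best_partial_parse(edges, 9))  # the span (3, 9)
```

Because a purely left-to-right top-down parser would stop predicting after the restart, those sentence-final edges would never exist in its chart; bidirectional prediction is what populates them.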
      <Paragraph position="3"> As we expected, Picky's parsing accuracy compares favorably to Pearl's performance. As shown in Figure 1, Picky parsed the test sentences with an 89.3% accuracy rate. This is a slight improvement over Pearl's 87.5% accuracy rate reported in [10].</Paragraph>
      <Paragraph position="4"> But note the accuracy results for phases I and II. These phases include sentences which are parsed successfully by the probabilistic prediction mechanism. Almost 80% of the test sentences fall into this category, and 97% of these sentences are parsed correctly. This result is very significant because it provides a reliable measure of the confidence the parser has in its interpretation. If incorrect interpretations are worse than no interpretation at all, a natural language system might consider only parses which are generated in phases I and II. This would limit coverage, but would allow the system to have a high degree of confidence in the parser output.</Paragraph>
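The phase-gating strategy described above can be sketched as a simple filter. This is a hypothetical illustration of the policy, not code from the paper; the phase numbering (1-3) and function name are assumptions, while the accuracy figures in the comments are those reported in the text.

```python
# Sketch of a phase-gated confidence filter. Parses completed in
# phases I and II (driven by probabilistic prediction) were correct
# 97% of the time in the reported experiments, covering roughly 80%
# of test sentences, so a system that prefers silence over a wrong
# answer can keep only those parses.

def accept_parse(parse_phase, high_confidence_only=True):
    """Return True if the parse should be trusted.

    parse_phase -- the phase (1, 2, or 3) at which the parser
    produced its analysis; phase 3 uses the weaker heuristic model.
    """
    if not high_confidence_only:
        return True
    return parse_phase in (1, 2)

print(accept_parse(2))  # True: high-confidence phase
print(accept_parse(3))  # False: heuristic phase, rejected
```

Gating on the completion phase trades coverage for reliability: phase-III parses are discarded even when some of them are correct.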
      <Paragraph position="5"> [Figure: results categorized by the phase which the parser reached in processing the test sentences.]</Paragraph>
      <Paragraph position="6"> The effectiveness of the prediction model also leads to increased efficiency. Figure 2 shows the average number of edges predicted and completed per sentence, again partitioned by phase of parse completion. Also included in the table is the average number of constituents in the "correct" parse.</Paragraph>
      <Paragraph position="7"> A measure of the efficiency provided by the probabilistic prediction mechanism is the parser's prediction ratio, the ratio of edges predicted to edges necessary for a correct parse. A perfect prediction ratio is 1:1, i.e., every edge predicted is used in the eventual parse. However, since there is ambiguity in the input sentences, a 1:1 prediction ratio is not likely to be achieved. Picky's prediction ratio is less than 3:1, and its ratio of predicted edges to completed edges is nearly 1:1. Thus, although the prediction ratio is not perfect, on average one completed constituent results for every edge that is predicted.</Paragraph>
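The two efficiency metrics defined above are straightforward to compute. The helpers below are a hypothetical sketch (the function names and the illustrative edge counts are assumptions, not figures from the paper):

```python
# Efficiency metrics described in the text: the prediction ratio
# (edges predicted : edges in the correct parse) and the ratio of
# predicted edges to completed edges. 1.0 is the ideal value for both.

def prediction_ratio(edges_predicted, edges_in_correct_parse):
    """Ratio of predicted edges to edges needed for the correct parse.

    A perfect parser scores 1.0: every prediction survives into the
    final parse. Ambiguity in real input pushes the value above 1.0.
    """
    return edges_predicted / edges_in_correct_parse

def completion_ratio(edges_predicted, edges_completed):
    """Ratio of predicted edges to completed constituents."""
    return edges_predicted / edges_completed

# Illustrative numbers only: a parse needing 10 constituents, for
# which the parser predicted 28 edges and completed 27 of them.
print(prediction_ratio(28, 10))   # 2.8, under the reported 3:1 bound
print(completion_ratio(28, 27))   # close to the reported 1:1
```

Computed per phase, these ratios reproduce the contrast the next paragraph draws between the probabilistic phases and the heuristic one.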
      <Paragraph position="8"> Note that the prediction ratio is much lower in phase I (1.5:1) and phase II (2.2:1) than in phase III (7:1).</Paragraph>
      <Paragraph position="9"> This is due to the accuracy of the probabilistic prediction model used in the first two phases, and the deficiencies of the heuristic model used in the final phase. Further efficiency can be gained either by limiting the amount of search which is performed in phase III before a sentence is deemed ungrammatical or by improving the heuristic prediction model.</Paragraph>
      <Paragraph position="10"> Since Picky has the power of a pure bottom-up parser, it would be useful to compare its performance and efficiency to that of a probabilistic bottom-up parser. However, an implementation of a probabilistic bottom-up parser using the same grammar produces on average over 1000 constituents for each sentence, generating over 15,000 edges without generating a parse at all! This supports our claim that exhaustive CKY-like parsing algorithms are not feasible when probabilistic models are applied to them.</Paragraph>
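The edge explosion reported for the exhaustive bottom-up parser follows from a simple count. The sketch below is a back-of-the-envelope bound, not the paper's grammar: the nonterminal count and sentence length are illustrative assumptions.

```python
# Why exhaustive bottom-up parsing explodes: a CKY-style parser that
# applies no pruning may propose an edge for every (span, nonterminal)
# pair, independent of probability.

def exhaustive_edge_bound(n_words, n_nonterminals):
    """Upper bound on complete edges an exhaustive parser may build:
    number of spans over n_words, times the nonterminal inventory."""
    n_spans = n_words * (n_words + 1) // 2  # contiguous spans
    return n_spans * n_nonterminals

# Illustrative only: with a modest grammar of 60 nonterminals, a
# 20-word sentence already admits over 12,000 candidate edges, the
# same order of magnitude as the >15,000 edges the text reports for
# the unpruned probabilistic bottom-up parser.
print(exhaustive_edge_bound(20, 60))  # 12600
```

The quadratic growth in spans, multiplied by the grammar size, is what probabilistic prediction prunes: Picky's sub-3:1 prediction ratio means it builds a small constant factor more edges than the correct parse needs, rather than a factor proportional to the grammar.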
    </Section>
  </Section>
</Paper>