<?xml version="1.0" standalone="yes"?>
<Paper uid="H94-1048">
  <Title>A Maximum Entropy Model for Prepositional Phrase Attachment</Title>
  <Section position="5" start_page="252" end_page="253" type="evalu">
    <SectionTitle>
4. Results
</SectionTitle>
    <Paragraph position="0"> We applied the Maximum Entropy model to sentences from two corpora: the IBM Computer Manuals data, annotated by the University of Lancaster, and the Wall Street Journal data, annotated by the University of Pennsylvania. The sizes of the training and test sets, together with the results, are shown in Tables 1 &amp; 2.</Paragraph>
    <Paragraph position="1"> The experiments in Table 2 differ as follows: &quot;Words Only&quot;: The search space P begins with all possible n-gram word features, with n being 1, 2, 3, or 4; this feature set does not grow during the feature search.</Paragraph>
    <Paragraph position="2"> &quot;Classes Only&quot;: The search space P begins with only unigram class features and grows by dynamically constructing class n-gram questions, as described earlier.</Paragraph>
    <Paragraph position="3"> &quot;Word and Classes&quot;: The search space P begins with all possible n-gram word features and unigram class features, and grows by adding class questions (as described earlier).</Paragraph>
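The "Words Only" search space can be illustrated with a minimal sketch. It assumes that an n-gram word feature constrains any n of the four head-word positions, and that the positions are called verb, noun, preposition, and object; both the slot names and the tuple representation are assumptions made for illustration, not the authors' implementation.

```python
from itertools import combinations

# Assumed slot names for the four head words; this section does not name them.
SLOTS = ("verb", "noun", "preposition", "object")


def word_ngram_features(event):
    """Return every word n-gram feature (n = 1..4) that fires on one event.

    `event` maps each slot name to its head word. A feature is represented
    as a tuple of (slot, word) pairs, one pair per slot it constrains.
    """
    features = []
    for n in range(1, len(SLOTS) + 1):
        for slots in combinations(SLOTS, n):
            features.append(tuple((slot, event[slot]) for slot in slots))
    return features


# One head-word event; the assignment of words to slots is assumed.
event = {"verb": "leads", "noun": "Pepsi", "preposition": "in", "object": "share"}
features = word_ngram_features(event)
print(len(features))  # 4 + 6 + 4 + 1 = 15
```

Under this reading each event activates 15 word features, so the initial space is fixed by the training data, which is consistent with the statement that it does not grow during the search.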
    <Paragraph position="4"> The results in Table 2 are achieved with about 200 features. As can be seen in Figure 1, performance improves quickly as features are added and only very slowly after the 60th feature. The performance of the various feature sets is fairly close once a sufficient number of features has been added. We also compared these results to a decision tree grown on the same 4 head-word events. The same mutual information bits were used for growing the decision trees. Table 3 gives the results on the same training and test data. The ME models are slightly better than the decision tree models.</Paragraph>
    <Paragraph position="5"> For comparison, we obtained the PP-attachment performance of 3 treebanking experts on a set of 300 randomly selected test events from the WSJ corpus. In the first trial, they were given only the four head words to make the attachment decision; in the second, they were given the head words along with the sentence in which they occurred. Figure 3 shows an example of the head words test (footnote 4). The results of the treebankers and the performance of the ME model on that same set are shown in Table 5. We also identified the set of 274 events on which the treebankers, given the sentence, unanimously agreed, and defined this to be the truth set. Table 6 shows the agreement on PP-attachment with this consensus set for the original WSJ treebank parses, the average performance of the 3 human experts with head words only, and the ME model.</Paragraph>
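The construction of the truth set and the scoring against it can be sketched as follows. This is a minimal illustration only: the function names are hypothetical and the labels are made-up toy data, not the judgments collected in the experiment.

```python
def unanimous_events(judgments):
    """Indices of events on which every annotator chose the same attachment.

    `judgments` is a list of per-annotator label lists ("N" or "V"),
    aligned by event.
    """
    return [i for i, labels in enumerate(zip(*judgments)) if len(set(labels)) == 1]


def accuracy_on(indices, predicted, reference):
    """Fraction of the selected events where `predicted` matches `reference`."""
    if not indices:
        raise ValueError("no events selected")
    correct = sum(1 for i in indices if predicted[i] == reference[i])
    return correct / len(indices)


# Made-up toy labels for five events; not data from the paper.
annotators = [
    ["N", "V", "N", "N", "V"],
    ["N", "V", "V", "N", "V"],
    ["N", "V", "N", "N", "V"],
]
model = ["N", "N", "N", "N", "V"]

truth_idx = unanimous_events(annotators)
truth = annotators[0]  # on unanimous events, any annotator's label is the consensus
print(truth_idx)                             # [0, 1, 3, 4]
print(accuracy_on(truth_idx, model, truth))  # 0.75
```

The same `accuracy_on` call can score any other source of decisions against the consensus labels, which is how the rows of a comparison such as Table 6 relate to one another.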
    <Paragraph position="6"> The WSJ treebank indicates the accuracy rate of our training data, the human performance indicates how much information is in the head words, and the ME model is still a good 12 percentage points behind. Footnote 4: The key is N,V,N,N,V,N,N,N,N,V,V,N,V,N,N,N,V,N,V.</Paragraph>
    <Paragraph position="7"> Selection Order  Feature
(1)  Preposition == &quot;of&quot;
(2)  Bit 2 of Head Noun == 0
(3)  Preposition is &quot;to&quot;
(4)  Bit 12 of Head Noun == 1
(9)  Head Noun == &quot;million&quot;, Preposition == &quot;in&quot;
(30)  Preposition == &quot;to&quot;, Bit 8 of Object == 1
(47)  Preposition == &quot;in&quot;, Object == &quot;months&quot;

report million for quarter
reflecting settlement of contracts
carried all but one
were injuries among workers
had damage to building
be damage to some
uses variation of design
cited example of district
leads Pepsi in share
trails Pepsi in sales
risk conflict with U.S.
risk conflict over plan
oppose seating as delegate
save some of plants
introduced versions of cars
lowered bids in anticipation
oversees trading on Nasdaq
gained 1 to 19
</Paragraph>
    <Paragraph position="8"> We also obtained the performance of 3 non-experts on a set of 200 randomly selected test events from the Computer Manuals corpus. In this trial, the participants made attachment decisions given only the four head words. The results are shown in Table 7.</Paragraph>
  </Section>
</Paper>