<?xml version="1.0" standalone="yes"?> <Paper uid="P06-1087"> <Title>Noun Phrase Chunking in Hebrew: Influence of Lexical and Morphological Features</Title> <Section position="9" start_page="693" end_page="695" type="evalu"> <SectionTitle> 5.5 Results </SectionTitle> <Paragraph position="0"> We discuss the results of the WP and WPNC experiments in detail, and also report the results of the WPG (using the Gender feature), ALL (using all available morphological features), and P (using only PoS tags) experiments.</Paragraph> <Paragraph position="1"> As can be seen in Table 4, lexical information is very important: augmenting the PoS tag with lexical information boosted the F-measure from 77.88 to 92.44. Adding the morphological features Construct and Number yields a further increase in performance, resulting in a final F-measure of 93.20. Note that the effect of these morphological features on the overall accuracy (the number of BIO tags assigned correctly) is minimal (Table 5), yet the effect on precision and recall is much more significant. It is also interesting to note that the Gender feature hurts performance, even though Hebrew has agreement on both Number and Gender. We do not have a good explanation for this observation, but we are currently verifying the consistency of the gender annotation in the corpus (in particular, the effect of the unmarked gender tag). We performed the WP and WPNC experiments on two forms of the corpus: (1) WP and WPNC, using the manually tagged morphological features included in the TreeBank, and (2) WPE and WPNCE, using the output of our automatic morphological analyzer, which contains about 10% errors (both in PoS and in morphological features). With the manual morphology tags, the final F-measure is 93.20, while it is 91.40 with noise. 
Interestingly, the improvement brought by adding morphological features to chunking in the noisy case (WPNCE) is almost 3.0 F-measure points (as opposed to 0.758 for the &quot;clean&quot; morphology case WPNC).</Paragraph> <Section position="1" start_page="694" end_page="695" type="sub_section"> <SectionTitle> 5.6 Error Analysis and the Effect of Morphological Features </SectionTitle> <Paragraph position="0"> We performed a detailed error analysis of the WPNC results for the entire corpus. At the individual token level, Nouns and Conjunctions caused the most confusion, followed by Adverbs and Adjectives. Table 6 presents the confusion matrix for all PoS tags with a substantial number of errors. IO means that the correct chunk tag was I, but the system classified it as O. By examining the errors at the chunk level, we identified 7 common classes of errors. Conjunction-related errors: bracketing &quot;[a] and [b]&quot; instead of &quot;[a and b]&quot;, and vice versa. Split errors: bracketing [a][b] instead of [a b]. Merge errors: bracketing [a b] instead of [a][b]. Short errors: bracketing &quot;a [b]&quot; or &quot;[a] b&quot; instead of [a b]. Long errors: bracketing &quot;[a b]&quot; instead of &quot;[a] b&quot; or &quot;a [b]&quot;. Whole Chunk errors: either missing a whole chunk, or bracketing something that does not overlap with a chunk at all (extra chunk).</Paragraph> <Paragraph position="1"> Missing/Extra Token errors: a generalized form of conjunction errors, either &quot;[a] T [b]&quot; instead of &quot;[a T b]&quot; or vice versa, where T is a single token. The most frequent such token (other than the conjuncts) was the possessive '$el'.</Paragraph> <Paragraph position="2"> Table 6. 
WPNC Confusion Matrix. The data in Table 6 suggest that Adverb- and Adjective-related errors are mostly of the &quot;short&quot; or &quot;long&quot; types, while Noun-related errors (including proper names and pronouns) are of the &quot;split&quot; or &quot;merge&quot; types. The most frequent error type was conjunction related, closely followed by split and merge.</Paragraph> <Paragraph position="3"> Much less significant were cases of extra Adverbs or Adjectives at the end of a chunk, and missing Adverbs before or after a chunk.</Paragraph> <Paragraph position="4"> Conjunctions are a major source of errors for English chunking as well (Ramshaw and Marcus, 1995; Cardie and Pierce, 1998; see footnote 9), and we plan to address them in future work. The split and merge errors are related to argument structure, which can be more complicated in Hebrew than in English because of possible null equatives. The too-long and too-short errors were mostly attachment related. Most of the errors are related to linguistic phenomena that cannot be inferred from the localized context used in our SVM encoding. We examine the types of errors that the addition of Number and Construct features fixed. Table 7 summarizes this information.</Paragraph> <Paragraph position="5"> Footnote 9: Although base-NPs are by definition non-recursive, they may still contain CCs when the coordinators are 'trapped': &quot;[securities and exchange commission]&quot; or conjunctions of adjectives.</Paragraph> <Paragraph position="6"> Table 7. [...]tion on most frequent error classes. The error classes most affected by the Number and Construct information were split and merge. WPNC has a tendency to split chunks, which results in some unjustified splits, but it compensates for this by fixing over a third of the merging mistakes. This result makes sense: construct and local agreement information can aid in the identification of predicate boundaries. 
This confirms our original intuition that morphological features do help in identifying boundaries of NP chunks.</Paragraph> </Section> </Section> </Paper>
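<!--
The chunk-level error classes of Section 5.6 can be made concrete with a minimal Python sketch. The paper does not describe how its error analysis was implemented, so everything here is an assumption made for illustration: the function names bio_to_spans and classify_errors, the span representation (token offsets with an exclusive end), and the label strings. Only the split, merge, short, long, and whole-chunk (missing or extra) classes are covered; conjunction and missing/extra token errors depend on token identities and are left out.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # hypothetical representation: (start, end), end exclusive


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Turn a B/I/O tag sequence into chunk spans, in left-to-right order."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag in ("B", "O") and start is not None:
            spans.append((start, i))  # B or O closes the chunk that was open
            start = None
        if tag == "B" or (tag == "I" and start is None):
            start = i  # lenient: an I with no open chunk starts a new one
    if start is not None:
        spans.append((start, len(tags)))
    return spans


def overlaps(a: Span, b: Span) -> bool:
    return a[0] < b[1] and b[0] < a[1]


def tiles(parts: List[Span], whole: Span) -> bool:
    """True if two or more adjacent spans cover `whole` exactly, with no gaps."""
    if len(parts) < 2 or parts[0][0] != whole[0] or parts[-1][1] != whole[1]:
        return False
    return all(left[1] == right[0] for left, right in zip(parts, parts[1:]))


def classify_errors(gold: List[Span], pred: List[Span]) -> List[Tuple[str, Span]]:
    """Label mismatches between gold and predicted chunks.

    Assumes both lists are sorted and non-overlapping. Mismatches that fit
    none of the five covered classes are silently skipped.
    """
    errors: List[Tuple[str, Span]] = []
    gold_set, pred_set = set(gold), set(pred)
    for g in gold:
        if g in pred_set:
            continue  # exact match, no error
        hits = [p for p in pred if overlaps(p, g)]
        if not hits:
            errors.append(("missing", g))      # whole chunk missed
        elif tiles(hits, g):
            errors.append(("split", g))        # [a][b] instead of [a b]
        elif len(hits) == 1 and g[0] <= hits[0][0] and hits[0][1] <= g[1]:
            errors.append(("short", g))        # a [b] or [a] b instead of [a b]
    for p in pred:
        if p in gold_set:
            continue
        hits = [g for g in gold if overlaps(g, p)]
        if not hits:
            errors.append(("extra", p))        # predicted chunk overlaps nothing
        elif tiles(hits, p):
            errors.append(("merge", p))        # [a b] instead of [a][b]
        elif len(hits) == 1 and p[0] <= hits[0][0] and hits[0][1] <= p[1]:
            errors.append(("long", p))         # [a b] instead of [a] b or a [b]
    return errors
```

Errors found from the gold side (missing, split, short) carry the gold span, and errors found from the predicted side (extra, merge, long) carry the predicted span, so each mismatch is reported once.
-->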