<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-2925">
  <Title>Projective Dependency Parsing with Perceptron</Title>
  <Section position="7" start_page="183" end_page="184" type="concl">
    <SectionTitle>
5 Analysis and Conclusions
</SectionTitle>
    <Paragraph position="0"> It is difficult to explain the difference in performance across languages. Nevertheless, we have identified four generic factors that we believe caused the most errors across all languages. [Table caption fragment: "...accuracy is caused by the projectivity assumption made by the parser. UAS: unlabeled attachment score. LAS: labeled attachment score, the measure used to compare systems in CoNLL-X. Bulgarian is excluded from overall scores."]</Paragraph>
    <Paragraph position="1"> [Table caption fragment: "...configurations. ph1 uses only phtoken at the head and modifier. ph2 extends ph1 with phdep. ph3 incorporates context features, namely phtctx at the head and modifier, and phdctx. ph4 extends ph3 with phdist. The final feature extraction function ph extends ph4 with phruntime."]</Paragraph>
    <Paragraph position="2"> Size of training sets: the relation between the amount of training data and performance is strongly supported in learning theory, and we saw the same relation in this evaluation: for Turkish, Arabic, and Slovene, languages with a limited number of training sentences, our system obtains accuracies below 70%. However, training size cannot be the only cause of errors: Czech has the largest training set, and our accuracy there is also below 70%.</Paragraph>
    <Paragraph position="3"> Modeling large-distance dependencies: even though we include features to model the distance between the two tokens in a dependency (phdist), our analysis indicates that these features fail to capture all the intricacies that exist in large-distance dependencies. Table 7 shows that, for the two languages analyzed, system performance decreases sharply as the distance between dependency tokens increases.</Paragraph>
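Distance features in the spirit of phdist are typically bucketed indicator features over the head-modifier offset. The bucket boundaries and feature strings below are illustrative assumptions; the paper does not list its exact thresholds:

```python
def distance_features(head_idx, mod_idx):
    """Bucketed distance/direction indicators, in the spirit of phdist.

    The buckets (1, 2, 3, 4-7, 8+) are an illustrative assumption.
    """
    dist = abs(head_idx - mod_idx)
    direction = "L" if mod_idx < head_idx else "R"
    if dist <= 3:
        bucket = str(dist)
    elif dist <= 7:
        bucket = "4-7"
    else:
        bucket = "8+"
    # Emit distance, direction, and their conjunction as separate features.
    return [f"dist={bucket}", f"dir={direction}", f"dist={bucket},dir={direction}"]
```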
    <Paragraph position="4"> Modeling context: many attachment decisions, e.g. prepositional attachment, depend on additional context beyond the two dependency tokens. To address this issue, we have included in our model features that capture context, both static (phdctx and phtctx) and dynamic (phruntime). Nevertheless, our error analysis indicates that our model is not rich enough to capture the context required to address complex dependencies. The top 5 focus words with the majority of errors ("y", "de", "a", "en", and "que" for Spanish; "em", "de", "a", "e", and "para" for Portuguese) all indicate complex dependencies such as prepositional attachment or coordination.</Paragraph>
    <Paragraph position="6"> Projectivity assumption: Dutch is the language with the most crossing dependencies in this evaluation, and the accuracy we obtain for it is below 70%.</Paragraph>
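The projectivity assumption rules out trees with crossing arcs. A small sketch (assuming 0-based head indices with -1 marking the root, an illustrative encoding) that counts the crossing arc pairs a projective parser can never produce:

```python
def crossing_pairs(heads):
    """Count crossing (non-projective) arc pairs in a dependency tree.

    heads[i] is the head index of token i; heads[i] == -1 marks the root.
    A parser restricted to projective trees always yields a count of 0.
    """
    # Normalize each arc to a (left, right) interval over token positions.
    arcs = [(min(i, h), max(i, h)) for i, h in enumerate(heads) if h >= 0]
    n = 0
    for a, (l1, r1) in enumerate(arcs):
        for l2, r2 in arcs[a + 1:]:
            # Two arcs cross iff their intervals properly interleave.
            if l1 < l2 < r1 < r2 or l2 < l1 < r2 < r1:
                n += 1
    return n
```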
    <Paragraph position="7"> On the Degree of Lexicalization We conclude the error analysis of our model with a look at its degree of lexicalization. A quick analysis of our model on the test data indicates that only 34.80% of the dependencies for Spanish and 42.94% of the dependencies for Portuguese are fully lexicalized, i.e. both the head and modifier words appear in the model feature set (see Table 8). Two factors cause our model to be largely unlexicalized: (a) in order to keep training times reasonable we performed heavy filtering of all features based on their frequency, which eliminated many lexicalized features from the final model, and (b) due to the small size of most of the training corpora, most lexicalized features simply do not appear in the testing section. Considering these results, a reasonable question to ask is: how much are we losing because of this lack of lexical information? We give an approximate answer by analyzing the percentage of fully-lexicalized dependencies that are correctly parsed by our model. Assuming that our model scales well, the accuracy on fully-lexicalized dependencies is an indication of the gain (or loss) to be had from lexicalization. Our model parses fully-lexicalized dependencies with an accuracy of 74.81% LAS for Spanish (2.35% lower than the overall score) and of 83.77% LAS for Portuguese (0.40% higher than the overall score). This analysis indicates that our model has limited gains (if any) from lexicalization.</Paragraph>
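The fully-lexicalized fraction and its LAS discussed above can be computed with a short helper. The input layout (triples of head word, modifier word, and a correctness flag) and the function name are illustrative assumptions, not the paper's actual data structures:

```python
def lexicalized_accuracy(deps, model_vocab):
    """Fraction of fully-lexicalized dependencies, and accuracy on them.

    deps        : list of (head_word, modifier_word, is_correct) triples
    model_vocab : set of word forms that survived feature filtering
    Returns (lexicalized_fraction, accuracy_on_lexicalized_subset).
    """
    if not deps:
        return 0.0, 0.0
    # A dependency is fully lexicalized iff both words are in the model.
    lex = [(h, m, ok) for h, m, ok in deps
           if h in model_vocab and m in model_vocab]
    frac = len(lex) / len(deps)
    acc = sum(ok for _, _, ok in lex) / len(lex) if lex else 0.0
    return frac, acc
```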
    <Paragraph position="8"> In order to improve the quality of our dependency parser we will focus on previously reported issues that can be addressed by a parsing model: large-distance dependencies, better modeling of context, and non-projective parsing algorithms.</Paragraph>
  </Section>
</Paper>