<?xml version="1.0" standalone="yes"?> <Paper uid="P06-2066"> <Title>Mildly Non-Projective Dependency Structures</Title> <Section position="8" start_page="511" end_page="512" type="evalu"> <SectionTitle> 4.2 Results </SectionTitle> <Paragraph position="0"> The results of our experiments are given in Table 1.</Paragraph> <Paragraph position="1"> For the binary constraints (planarity, well-nestedness), we simply report the number and percentage of structures in each data set that satisfy the constraint. For the parametric constraints (gap degree, edge degree), we report the number and percentage of structures having degree d (d ≥ 0), where degree 0 is equivalent (for both gap degree and edge degree) to projectivity.</Paragraph> <Paragraph position="2"> For DDT, we see that about 15% of all analyses are non-projective. The minimal degree of non-projectivity required to cover all of the data is 2 in the case of gap degree and 4 in the case of edge degree.</Paragraph> <Paragraph position="3"> For both measures, the number of structures drops quickly as the degree increases. (As an example, only 7 or 0.17% of the analyses in DDT have gap degree 2.) [Footnote 4: A total number of 17 analyses in DDT were excluded because they either had more than one root node, or violated the indegree constraint. (Both cases are annotation errors.)] Regarding the binary constraints, we find that planarity accounts for slightly more than the projective structures (86.41% of the data is planar), while almost all structures in DDT (99.89%) meet the well-nestedness constraint. 
The difference between the two constraints becomes clearer when we base the figures on the set of non-projective structures only: out of these, less than 10% are planar, while more than 99% are well-nested.</Paragraph> <Paragraph position="4"> For PDT, both the number of non-projective structures (around 23%) and the minimal degrees of non-projectivity required to cover the full data (gap degree 4 and edge degree 6) are higher than in DDT. The proportion of planar analyses is smaller than in DDT if we base it on the set of all structures (82.16%), but significantly larger when based on the set of non-projective structures only (22.93%).</Paragraph> <Paragraph position="5"> However, this is still very far from the well-nestedness constraint, which has almost perfect coverage on both data sets.</Paragraph> <Section position="1" start_page="511" end_page="512" type="sub_section"> <SectionTitle> 4.3 Discussion </SectionTitle> <Paragraph position="0"> As a general result, our experiments confirm previous studies on non-projective dependency parsing (Nivre and Nilsson, 2005; Hall and Novak, 2005; McDonald and Pereira, 2006): The phenomenon of non-projectivity cannot be ignored without also ignoring a significant portion of real-world data (around 15% for DDT, and 23% for PDT). At the same time, already a small step beyond projectivity accounts for almost all of the structures occurring in these treebanks.</Paragraph> <Paragraph position="1"> More specifically, we find that already an edge degree restriction of d ≤ 1 covers 98.24% of DDT and 99.54% of PDT, while the same restriction on the gap degree scale achieves a coverage of 99.84% (DDT) and 99.57% (PDT). Together with the previous evidence that both measures also have computational advantages, this provides a strong indication for the usefulness of these constraints in the context of non-projective dependency parsing. 
When we compare the two graded constraints to each other, we find that the gap degree measure partitions the data into fewer and larger clusters than the edge degree, which may be an advantage in the context of using the degree constraints as features in a data-driven approach towards parsing. However, our purely quantitative experiments cannot answer the question of which of the two measures yields the more informative clusters.</Paragraph> <Paragraph position="2"> The planarity constraint appears to be of little use as a generalization of projectivity: enforcing it excludes more than 75% of the non-projective data in PDT, and 90% of that in DDT. The relatively large difference in coverage between the two treebanks may at least partially be explained by their different annotation schemes for sentence-final punctuation. In DDT, sentence-final punctuation marks are annotated as dependents of the main verb of a dependency nexus. This, as we have discussed above, places severe restrictions on permitted forms of non-projectivity in the remaining sentence, as every discontinuity that includes the main verb must also include the dependent punctuation marks. On the other hand, in PDT, a sentence-final punctuation mark is annotated as a separate root node with no dependents. This scheme does not restrict the remaining discontinuities at all.</Paragraph> <Paragraph position="3"> In contrast to planarity, the well-nestedness constraint appears to constitute a very attractive extension of projectivity. 
For one thing, the almost perfect coverage of well-nestedness on DDT and PDT (99.89%) could by no means be expected on purely combinatorial grounds: only 7% of all possible dependency structures for sentences of length 17 (the average sentence length in PDT), and only slightly more than 5% of all possible dependency structures for sentences of length 18 (the average sentence length in DDT) are well-nested.5 Moreover, a cursory inspection of the few problematic cases in DDT indicates that violations of the well-nestedness constraint may, at least in part, be due to properties of the annotation scheme, such as the analysis of punctuation in quotations. However, a more detailed analysis of the data from both treebanks is needed before any stronger conclusions can be drawn concerning well-nestedness.</Paragraph> </Section> </Section> </Paper>
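<!--
Editorial sketch, not part of the original paper. The constraints compared above can be checked mechanically on a single dependency structure. The Python below is a minimal, brute-force illustration under stated assumptions: a structure is encoded as a list `heads`, where `heads[i]` is the position of the head of word i and None marks a root; all function names are hypothetical; and the definitions used (gaps counted in each node's projection, crossing edges for planarity, interleaving disjoint subtrees for well-nestedness) follow the usual formulations in the literature rather than anything spelled out in this section. Edge degree is omitted, and the input is assumed to be acyclic.

```python
from itertools import combinations


def projections(heads):
    """For each node, the sorted positions it dominates (itself included)."""
    proj = [{i} for i in range(len(heads))]
    for i in range(len(heads)):
        h = heads[i]
        while h is not None:  # walk up to the root; assumes no cycles
            proj[h].add(i)
            h = heads[h]
    return [sorted(p) for p in proj]


def gap_degree(heads):
    """Largest number of gaps in any projection; 0 means projective."""
    def gaps(p):
        return sum(1 for a, b in zip(p, p[1:]) if b - a > 1)
    return max((gaps(p) for p in projections(heads)), default=0)


def is_planar(heads):
    """True if no two edges cross when drawn above the sentence."""
    edges = [(min(i, h), max(i, h)) for i, h in enumerate(heads) if h is not None]
    return not any(a < c < b < d or c < a < d < b
                   for (a, b), (c, d) in combinations(edges, 2))


def interleave(p, q):
    """True if l1 < l2 < r1 < r2 for some l1, r1 in p and l2, r2 in q."""
    return any(l1 < l2 < r1 < r2
               for l1 in p for l2 in q for r1 in p for r2 in q)


def is_well_nested(heads):
    """True if no two disjoint subtrees interleave."""
    for p, q in combinations(projections(heads), 2):
        if set(p) & set(q):
            continue  # nested subtrees are not compared, only disjoint ones
        if interleave(p, q) or interleave(q, p):
            return False
    return True


# Three toy structures (invented for illustration, not taken from DDT or PDT).
projective = [1, None, 1]           # word 1 governs words 0 and 2
one_gap = [None, 3, 0, 0]           # word 3 governs word 1 across word 2
ill_nested = [None, 0, 0, 1, 2]     # subtrees {1, 3} and {2, 4} interleave

for name, heads in [("projective", projective),
                    ("one_gap", one_gap),
                    ("ill_nested", ill_nested)]:
    print(name, gap_degree(heads), is_planar(heads), is_well_nested(heads))
```

On these toy inputs, `one_gap` has gap degree 1 and is non-planar yet well-nested, which mirrors the pattern reported above: most non-projective analyses fail planarity while still satisfying well-nestedness. The quartic `interleave` check favors clarity over speed and would need replacing for treebank-scale counts.
-->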