<?xml version="1.0" standalone="yes"?> <Paper uid="P06-2029"> <Title>The Benefit of Stochastic PP Attachment to a Rule-Based Parser</Title> <Section position="8" start_page="228" end_page="229" type="concl"> <SectionTitle> 6 Conclusions and future work </SectionTitle> <Paragraph position="0"> Corpus-based data has been shown to provide a significant benefit when used to guide a rule-based dependency parser of German, reducing the error rate for situated PP attachment by one third.</Paragraph> <Paragraph position="1"> Prepositions remain the largest source of attachment errors. Individual errors can be traced to many causes, such as faulty POS tagging, misinterpreted global sentence structure, genuinely ambiguous constructions, failure of the attraction heuristics, or simply lack of processing time. However, considering that even human arbiters often agree on only 90% of PP attachments, the results appear promising. In particular, many attachment errors that strongly disagree with human intuition (such as in the example sentence) were in fact prevented. Thus, adding a corpus-based knowledge source to the system yielded a much greater benefit than could have been achieved with the same effort by writing individual constraints.</Paragraph> <Paragraph position="2"> One obvious further task is to improve our simple-minded model of lexical attraction. For instance, some remaining errors suggest that taking the kernel noun into account would yield a higher attachment precision; this will require a redesign of the extraction tools to keep the parameter space manageable. 
Also, subordination types other than 'PP' may benefit from similar knowledge; e.g., in many German sentences the roles of subject and object are syntactically ambiguous and can only be understood correctly through world knowledge.</Paragraph> <Paragraph position="3"> This is another area in which synergy between lexical attraction estimates and general symbolic rules appears possible.</Paragraph> </Section></Paper>