<?xml version="1.0" standalone="yes"?> <Paper uid="P89-1015"> <Title>ACQUIRING DISAMBIGUATION RULES FROM TEXT</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> One of the most serious obstacles to developing parsers to effectively analyze unrestricted English is the difficulty of creating sufficiently comprehensive grammars. While it is possible to develop toy grammars for particular theoretically interesting problems, the sheer variety of forms in English, together with the complexity of interaction that arises in a typical syntactic analyzer, makes each enhancement of parser coverage increasingly difficult. There is no question that we are still quite far from syntactic analyzers that even begin to adequately model the grammatical variety of English. To go beyond the current generation of hand-built grammars for syntactic analysis, it will be necessary to develop means of acquiring some of the needed grammatical information from the regularities that appear in large corpora of naturally occurring text.</Paragraph> <Paragraph position="1"> This paper describes an implemented training procedure for automatically acquiring symbolic rules for a deterministic parser on the basis of unrestricted textual input. In particular, I describe experiments in automatically acquiring a set of rules for disambiguation of lexical category (part of speech). Performance of the acquired rule set is much better than that of the set of rules for lexical disambiguation written for the parser by hand over a period of several years; the error rate is approximately half that of the hand-written rules. Furthermore, the error rate is comparable to that of recent probabilistic approaches such as Church (1987) and Garside, Leech and Sampson (1987). 
The current approach has the added advantage that, since the rules acquired depend on the parser's grammar in general, independent improvements in other modules of the parser can lead to improvement in the performance of the disambiguation component.</Paragraph> </Section> </Paper>