<?xml version="1.0" standalone="yes"?> <Paper uid="E93-1046"> <Title>Ambiguity resolution in a reductionistic parser *</Title> <Section position="3" start_page="0" end_page="394" type="intro"> <SectionTitle> FULLSTOP </SectionTitle> <Paragraph position="0"> *The development of ENGCG was supported by TEKES, the Finnish Technological Development Center, and a part of the work on Finite-state syntax has been supported by the Academy of Finland.</Paragraph> <Paragraph position="1"> In this type of analysis, each word gets a morphosyntactic analysis¹.</Paragraph> <Paragraph position="2"> The present work is closely connected with two parsing formalisms, Constraint Grammar [Karlsson, 1990; Karlsson et al., 1991; Voutilainen et al., 1992; Karlsson et al., 1993] and Finite-state syntax as advocated by [Koskenniemi, 1990; Tapanainen, 1991; Koskenniemi et al., 1992]. The Constraint Grammar parser of English is a sequential modular system that assigns a shallow surface-true dependency-oriented functional analysis to running text, annotating each word with morphological and syntactic tags. The finite-state parser assigns a similar type of analysis, but it operates on all levels of ambiguity² in parallel rather than sequentially, enabling the grammarian to refer to all levels of structural description in a single uniform rule component. ENGCG, a wide-coverage English Constraint Grammar and lexicon, was written in 1989-1992, and the system is currently available³. The Constraint Grammar framework was proposed by Fred Karlsson, and the English Constraint Grammar was developed by Atro Voutilainen (lexicon, morphological disambiguation), Juha Heikkilä (lexicon) and Arto Anttila (syntax). ¹ It consists of a base form, a morphological reading (part-of-speech, inflectional and other morphosyntactic features) and a syntactic-functional tag, flanked by '@'. ² Morphological, clause boundary, and syntactic. ³ ... automatically via E-mail by sending texts of up to 300 words to engcg@ling.Helsinki.FI. The reply will contain the analysis as well as information on usage and availability. Questions can also be sent directly to avoutila@ling.Helsinki.FI or to ptapanai@ling.Helsinki.FI.</Paragraph> <Paragraph position="3"> There are a few implementations of the parser, and the latest, written in C by Pasi Tapanainen, analyses more than 1000 words per second on a Sun SparcStation 10, using a disambiguation grammar of some 1300 constraints.</Paragraph> <Paragraph position="4"> Intensive work within the finite-state framework was started by Tapanainen [1991] in 1990, and an operational parser was in existence the year after. The first nontrivial finite-state descriptions [Koskenniemi et al., 1992] were written by Voutilainen in 1991-1992, and currently he is working on a comprehensive English grammar which is expected to reach a considerable degree of maturity by the end of 1994. Much of this emerging work is based on the ENGCG description (e.g. the ENGTWOL lexicon is used as such); however, the design of the grammar has changed considerably, as will be seen below.</Paragraph> <Paragraph position="5"> We have two main theses. Firstly, knowledge-based reductionistic grammatical analysis will be facilitated rather than hindered by the introduction of (new) linguistically motivated and structurally resolvable distinctions into the parsing scheme, although this policy will increase the amount of ambiguity in the parser's input. 
Secondly, the amount of ambiguity in the input does not predict the speed of analysis, so the introduction of new ambiguities in the input is not necessarily something to be avoided.</Paragraph> <Paragraph position="6"> Next, we present some observations about the ENGCG parser: the linguistic description would become more economical and accurate if all levels of structural description were available at the outset of reductionistic parsing (or disambiguation of alternative readings). In Section 3 we report on some early experiments with finite-state parsing. In Section 4 we sketch a more satisfactory functional dependency-oriented description. A more expressive representation implies more ambiguity in the input; in Section 5 it is shown, however, that even massive ambiguity need be no major problem for the parser.</Paragraph> </Section> </Paper>