<?xml version="1.0" standalone="yes"?> <Paper uid="E95-1031"> <Title>A Robust Parser Based on Syntactic Information</Title> <Section position="3" start_page="0" end_page="223" type="intro"> <SectionTitle> (NP The/dr </SectionTitle> <Paragraph position="0"/> <Paragraph position="2"> A robust parser is one that can analyze these extragrammatical sentences without failure. However, if we try to preserve robustness by adding such rules whenever we encounter an extragrammatical sentence, the rule base will grow rapidly, and processing and maintaining the excessive number of rules will become inefficient and impractical. Therefore, extragrammatical sentences should be handled by some recovery mechanism(s) rather than by a set of additional rules.</Paragraph> <Paragraph position="3"> Many researchers have proposed techniques for dealing with extragrammatical sentences, such as the Augmented Transition Network (ATN) (Kwasny and Sondheimer, 1981), network-based semantic grammar (Hendrix, 1977), partial pattern matching (Hayes and Mouradian, 1981), conceptual case frames (Schank et al., 1980), and multiple cooperating methods (Hayes and Carbonell, 1981). These techniques take into account various semantic factors that depend on the specific domain in question when recovering extragrammatical sentences. Although they can intrinsically provide better solutions, they are usually ad hoc and lack extensibility. Therefore, it is important to recover extragrammatical sentences using syntactic factors only, which are independent of any particular system and any particular domain.</Paragraph> <Paragraph position="4"> Mellish (1989) introduced chart-based techniques that use only syntactic information to handle extragrammatical sentences. An advantage of this technique is that no work is repeated: the chart prevents the parser from generating an edge identical to one that already exists. 
Also, because the recovery process runs only after the normal parser terminates unsuccessfully, the performance of the normal parser does not degrade on grammatical sentences. However, his experiment was based not on errors found in running text but on artificial errors randomly generated by humans. Moreover, only single-word errors were considered, although several word errors can occur simultaneously in running text. The general algorithm for least-errors recognition proposed by Lyon (1974) finds the least number of errors that must be assumed for a successful parse and recovers from them. Because this algorithm is also syntactically oriented and based on a chart, it has the same advantage as Mellish's parser. When the original parsing algorithm terminates unsuccessfully, the algorithm begins to assume errors of insertion, deletion, and mutation of a word. For any input, whether grammatical or extragrammatical, this algorithm can generate a resultant parse tree. The cost of this complete robustness, however, is that the algorithm degrades parsing efficiency and generates many intermediate edges.</Paragraph> <Paragraph position="5"> In this paper, we present a robust parser with a recovery mechanism. We extend the general algorithm for least-errors recognition and adopt it as the recovery mechanism in our robust parser. Because our robust parser handles extragrammatical sentences with this syntax-oriented recovery mechanism, it is independent of any particular system or domain. We also present heuristics that reduce the number of edges and thereby improve the performance of our parser.</Paragraph> <Paragraph position="6"> This paper is organized as follows. We first review the general algorithm for least-errors recognition. Then we present the extension of this algorithm and the heuristics adopted by the robust parser. 
Next, we describe the implementation of the system and the results of an experiment in parsing real sentences. Finally, we conclude with future directions.</Paragraph> </Section> </Paper>