File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/90/c90-3088_abstr.xml
Size: 2,876 bytes
Last Modified: 2025-10-06 13:47:00
<?xml version="1.0" standalone="yes"?> <Paper uid="C90-3088"> <Title>Parsing Long English Sentences with Pattern Rules</Title> <Section position="1" start_page="0" end_page="0" type="abstr"> <SectionTitle> Abstract </SectionTitle> <Paragraph position="0"> In machine translation, parsing long English sentences still causes problems, whereas for short sentences a good machine translation system can usually generate readable translations. In this paper a practical method is presented for parsing long English sentences that match certain patterns. The rules for the patterns are treated separately from the augmented context-free grammar, in which each context-free grammar rule is augmented by syntactic and semantic functions. The pattern rules and the augmented context-free grammar are complementary to each other. In this way long English sentences covered by the patterns can be parsed efficiently.</Paragraph> <Paragraph position="1"> 1. Introduction A long English sentence, from the parsing point of view, is defined as a sentence that has a complicated syntactic structure or contains too many words. Factors that may contribute to the syntactic complexity include words with multiple parts of speech, conjunctions, prepositional phrases, and commas. Since the number of possible syntactic structures of a sentence grows with these factors, it is not easy for a machine translation system to pick the right syntactic structure, based on syntactic knowledge and a little semantic knowledge [1], from the large set of possible syntactic structures generated by the parser. The parsing time increases as well, due to the construction of many possible syntactic structures.</Paragraph> <Paragraph position="2"> To put it another way, sentence parsing is a search problem. The parsing time increases exponentially as the branching factor, which reflects the complexity of the syntactic structure, and the search depth, which reflects the actual sentence length, increase. 
In some-path bottom-up parsing [2], reducing the branching factor, or using a beam search method [3][4] to restrict its value, may decrease parsing time. However, the parsing mechanism is still basically exponential.</Paragraph> <Paragraph position="3"> As an example, when the English-Japanese machine translation system ATLAS II [5] translated a corpus of 220 sentences selected from software manuals and papers, the English sentences with usable translations averaged 33.5 words. For sentences whose translations need some postediting and sentences whose translations cannot be used, the average sentence lengths are 45.7 and 46.8 words respectively. To perform better, a machine translation system should be able to translate sentences of reasonable length.</Paragraph> </Section> </Paper>