<?xml version="1.0" standalone="yes"?>
<Paper uid="C00-1081">
<Title>A Stochastic Parser Based on a Structural Word Prediction Model Shinsuke MORI, Masafumi NISHIMURA, Nobuyasu ITOH,</Title>
<Section position="6" start_page="562" end_page="563" type="relat">
<SectionTitle>5 Related Works</SectionTitle>
<Paragraph position="0">Historically, structures of natural languages have been described by a context-free grammar, and ambiguities have been resolved by parsers based on a context-free grammar (Fujisaki et al., 1989). In recent years, some attempts have been made at parsing with a finite state model (Oflazer, 1999). Our parser is also based on a finite state model.</Paragraph>
<Paragraph position="1">Unlike these models, we focused on reports of a limit on language structure caused by the capacity of our memory (Yngve, 1960; Miller, 1956). Thus our model is psycholinguistically more appropriate.</Paragraph>
<Paragraph position="2">Recently, in the area of parsers based on a stochastic context-free grammar (SCFG), some researchers have pointed out the importance of the lexicon and proposed lexicalized models (Charniak, 1997; Collins, 1997). In these papers, they reported significant improvements in parsing accuracy. Taking these reports into account, we introduced a method of partial lexicalization and reported a significant improvement in parsing accuracy. Our lexicalization method is also applicable to an SCFG-based parser and improves its parsing accuracy.</Paragraph>
<Paragraph position="3">The model we present in this paper is a generative stochastic language model. Chelba and Jelinek (1998) presented a similar model. In their model, each word is predicted from the two right-most head words, regardless of the dependency relation between these head words and the word. Eisner (1996) also presented a stochastic structural language model, in which each word is predicted from its head word and the nearest one. This model is very similar to the parser presented by Collins (1996). The greatest difference between our model and these models is that our model predicts the next word from the head words, or partial parse trees, that depend on it. Clearly, it is not always the two right-most head words that have a dependency relation with the next word. It follows that our model is linguistically more appropriate.</Paragraph>
<Paragraph position="4">There have been some attempts at stochastic Japanese parsers (Haruno et al., 1998; Fujio and Matsumoto, 1998; Mori and Nagao, 1998). These Japanese parsers are based on a unit called a bunsetsu, a sequence of one or more content words followed by zero or more function words. The parsers take a sequence of such units and output the dependency relations between them. Unlike these parsers, our model describes dependencies between words; thus our model can easily be extended to other languages. As for accuracy, although a direct comparison is not easy between our parser (89.9%; 1,072 sentences) and these parsers (82%-85%; 50,000-190,000 sentences) because of the difference in units and corpora, our parser is one of the state-of-the-art parsers for the Japanese language. It should be noted that our model describes relations among three or more units (case frames, consecutive dependency relations, etc.); thus our model benefits more from an increase in corpus size.</Paragraph>
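<!-- A minimal sketch of the contrast drawn in the paragraph above, in assumed notation
     that does not appear in the paper: let h_1 and h_2 denote the two right-most exposed
     head words before the next word w_i, and let D_i denote the set of partial parse
     trees whose head words depend on w_i. Chelba and Jelinek's model predicts roughly

         P(w_i \mid h_1, h_2)

     while the model described here conditions on exactly the trees that depend on w_i:

         P(w_i \mid D_i)

     Since the heads that depend on w_i are not always the two right-most ones, the two
     conditioning sets differ, which is the linguistic advantage claimed in the text. -->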
</Section>
</Paper>