File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/92/c92-1030_intro.xml
Size: 3,574 bytes
Last Modified: 2025-10-06 14:05:12
<?xml version="1.0" standalone="yes"?> <Paper uid="C92-1030"> <Title>An Empirical Study on Rule Granularity and Unification Interleaving Toward an Efficient Unification-Based Parsing System</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> The unification-based framework has been an area of active research in natural language processing. Unification, which is the primary operation of this framework, provides a kind of constraint-checking mechanism for merging various information sources, such as syntax, semantics, and pragmatics. The computational inefficiency of unification, however, precludes the development of large practical NLP systems, although the framework has many attractive theoretical properties.</Paragraph> <Paragraph position="1"> The efforts made to improve the efficiency of a unification-based parsing system can be classified into four categories.</Paragraph> <Paragraph position="2"> There have been well-known efficient CFG parsing algorithms such as CKY \[Aho and Ullman, 77\], Earley \[Earley, 70\], CHART \[Kay, 80\], and LR \[Aho and Ullman, 77\] \[Tomita, 86\]. There have also been several recent in-depth studies into efficient graph unification algorithms, whose main concerns have been either avoiding irrelevant copies of DAGs \[Karttunen and Kay, 85\] \[Pereira, 85\] \[Karttunen, 86\] \[Wroblewski, 87\] \[Godden, 90\] \[Kogure, 90\] \[Tomabechi, 91\] \[Emele, 91\], or the exhaustive expansion of disjunctions into their disjunctive normal forms \[Kasper, 87\] \[Eisele and Dörre, 88\] \[Maxwell and Kaplan, 89\] \[Dörre and Eisele, 90\] ickier, 901 \[Nat ...... 91\]. There has, however, been little discussion regarding the optimal representation of a grammar, or linguistic knowledge, in the unification-based framework, from the engineering point of view. 
Grammar organization is highly flexible, as the unification-based framework uses two different forms of knowledge representation: atomic phrase structure rules and feature structure descriptions. Method selection greatly affects both the computational efficiency and the maintenance cost of the system. There has also been little discussion regarding optimal interaction between the CFG parsing process and the unification process in unification-based parsing, which also greatly affects overall performance.</Paragraph> <Paragraph position="3"> Here we introduce the notion of granularity, and suggest medium-grained phrase structure rules, in which morpho-syntactic specifications in the feature descriptions are expanded into phrase structure rules.</Paragraph> <Paragraph position="4"> We claim that this reduces the computational load of unification without intractably increasing the number of rules, and that it is optimal in the sense that it satisfies both efficiency and maintainability. We also suggest late unification as another solution to the copying problem, as it avoids unnecessary copies of irrelevant subparses by delaying unification until a complete CFG parse is found.</Paragraph> <Paragraph position="5"> In the following sections, the design and implementation of the medium-grained phrase structure rules are explained, then the implementation of late unification is illustrated, and finally the effectiveness of the proposed methods is demonstrated in experiments.</Paragraph> <Paragraph position="6"> constraints in the phrase structure rules and the feature descriptions</Paragraph> </Section></Paper>