<?xml version="1.0" standalone="yes"?> <Paper uid="P85-1017"> <Title>A Structure-Sharing Representation for Unification-Based Grammar Formalisms</Title> <Section position="4" start_page="0" end_page="137" type="intro"> <SectionTitle> 2 Grammars with Unification </SectionTitle> <Paragraph position="0"> The data representation discussed in this paper is applicable, with but minor changes, to a variety of grammar formalisms based on unification, such as definite-clause grammars [6], functional-unification grammar [4], lexical-functional grammar [2] and PATR-II [8]. For the sake of concreteness, however, our discussion will be in terms of the PATR-II formalism. The basic idea of unification-based grammar formalisms is very simple. As with context-free grammars, grammar rules state how phrase types combine to yield other phrase types. But where a context-free grammar allows only a finite number of predefined atomic phrase types or nonterminals, a unification-based grammar will in general define implicitly an infinity of phrase types.</Paragraph> <Paragraph position="1"> A phrase type is defined by a set of constraints. A grammar rule is a set of constraints between the type X0 of a phrase and the types X1, ..., Xn of its constituents. The rule may be applied to the analysis of a string s0 as the concatenation of constituents s1, ..., sn if and only if the types of the si are compatible with the types Xi and the constraints in the rule.</Paragraph> <Paragraph position="2"> Unification is the operation that determines whether two types are compatible by building the most general type compatible with both.</Paragraph> <Paragraph position="3"> If the constraints are equations between attributes of phrase types, as is the case in PATR-II, two phrase types can be unified whenever they do not assign distinct values to the same attribute. 
The unification is then just the conjunction (set union) of the corresponding sets of constraints [5].</Paragraph> <Paragraph position="4"> Here is a sample rule, in a simplified version of the PATR-II notation:</Paragraph> <Paragraph position="6"> This rule may be read as stating that a phrase of type X0 can be the concatenation of a phrase of type X1 and a phrase of type X2, provided that the attribute equations of the rule are satisfied if the phrases are substituted for their types.</Paragraph> <Paragraph position="7"> The equations state that phrases of types X0, X1, and X2 have categories S, NP, and VP, respectively, that types X1 and X2 have the same agreement value, that types X0 and X2 have the same translation, and that the first argument of X0's translation is the translation of X1.</Paragraph> <Paragraph position="8"> Formally, the expressions of the form (l1 ... lm) used in attribute equations are paths and each li is a label.</Paragraph> <Paragraph position="9"> When all the phrase types in a rule are given constant cat (category) values by the rule, we can use an abbreviated notation in which the phrase type variables Xi are replaced by their category values and the category-setting equations are omitted. For example, rule (1) may be written as</Paragraph> <Paragraph position="11"> In existing PATR-II implementations, phrase types are not actually represented by their sets of defining equations.</Paragraph> <Paragraph position="12"> Instead, they are represented by symbolic solutions of the equations in the form of directed acyclic graphs (dags) with arcs labeled by the attributes used in the equations. 
Dag nodes represent the values of attributes, and an arc labeled by l goes from node m to node n if and only if, according to the equations, the value represented by m has n as the value of its l attribute [~].</Paragraph> <Paragraph position="13"> A dag node (and by extension a dag) is said to be atomic if it represents a constant value; complex if it has some outgoing arcs; and a leaf if it is neither atomic nor complex, that is, if it represents an as yet completely undetermined value.</Paragraph> <Paragraph position="14"> The domain dom(d) of a complex dag d is the set of labels on arcs leaving the top node of d. Given a dag d and a label l ∈ dom(d), we denote by d/l the subdag of d at the end of the arc labeled l from the top node of d. By extension, for any path p whose labels are in the domains of the appropriate subdags, d/p represents the subdag of d at the end of path p from the root of d.</Paragraph> <Paragraph position="15"> For uniformity, lexical entries and grammar rules are also represented by appropriate dags. For example, the dag for rule (1) is shown in Figure 1.</Paragraph> </Section> </Paper>