File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/88/c88-1056_metho.xml

Size: 17,432 bytes

Last Modified: 2025-10-06 14:12:07

<?xml version="1.0" standalone="yes"?>
<Paper uid="C88-1056">
  <Title>Locally Governed Trees and Dependency Parsing</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
LOCALLY GOVERNED TREES AND DEPENDENCY PARSING
</SectionTitle>
    <Paragraph position="0"/>
  </Section>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
SITRA Foundation
</SectionTitle>
    <Paragraph position="0"> .P.Oo Plox 329~ 00121 Helsinki~ VINIAND 'BJ-::~ paper desc\[J.~s the notion of \]pcall.y gove~:ned t~:ees as a n&lt;x\]el of sttuCtu17ally \[estrfcted dependency st~:uctures of sentenc.eSo 2~n abstract umchine and its supporting softwa~:e to*. the building of local ly goqerned t~:ees is intr_(~iuced. The rest of the paper dis~:usse,q \[.ew uuaM\]iguous~ ~&lt;-'\].\]-for,~d local\]y governed i:~:ees can be parsed ill l.i\[~ea~: tia~ ~{~en cxertain ~'tructural ~x~nstr~int's a~e in fuzce o 'i~le phrase st~.ucture 1.ttle is a widely used primitive m)tation in literature when synt~Ictic structures of ~euteuces d~dege discussed in a rJ.gorous ~f~nner o A n~ajority o;5 syntactic ~;:~rsJ.ng pr~jratt~ also utilize phrase st~.0cture rules in one vary or anotherdeg Phrase: structure rules reflect the i~Kl-diate {79nstihuent analysis of sentencesdeg Fach :1lle names a ~x)nstituent 6rod its specified ordered e\].~-\[~nts on the lower levc\].o ~ts primitive ,~\]ations are {~erefo~'e part.~of-a--.~ho\]e and ooncatenationo In parsing, phrase structure rules arc used to s~u.'ch a hierarchical c~nstituen~ orgsnizatiop of the word string of a sentence o Phrase structure ~\]es discover the hierarchicsl organization of a sentence~ but they do not tell. whdch words are the heads of the phrases (save file X-trot theory /Jackendoff 1977/) nor de they further: sf?e&lt;cify %~le ty\[x~s of the structural relationsdeg Dependlency grau~-/rs C/ in cxontrast ~ indicate the bi1~ary re).atious that hold between the words in sentences /~{@ys \].964 ~ C/-/aifamn \] 965, Robinson 1970, 2~0m\]erson 1971~ Hellwig 1986, St~trostm 1986/. Neither non. 4~=~rmim!l sy~i~ol.s nor phrase struct\[~e rules have any ;cole to play because constituents are not looked for,, A p~ser which ~0ploys dependency rules (rather than pbras,; structure rules) nmkes the beads and the \[types of binding celations explicit, but does not indicate the hierarchical constituent tx\]nfigurations of ~sentences explicitlydeg We argue that dependency gra,~nars suit ~yet|-er than phrase structure rnles to non-oonfigt~ational~ free-word~rder languages~ \]\[nsofi~r as defxmdency r'elations are local (that is~ they hold bet~.en adjacent words or trees) and @~trttctive (that .i.s~ a recxx\]nized dependant is rex~oved p~o~,~)'tly from processing) deten~inistic parsing in 'line.ar ti~,e often resultsdeg Fig= la illustrates this point for a si~zple intransitive-verb Finnish sentence ~q, ienen ~-~jan &amp;quot;~iti t~uroi&amp;quot; (A/the ~sll boy's ~K)ther  Linear time is preserved in parsing also in many typical Finnish non-monotonous dependency trees if t ~s a default control ruler a word attenK0ts first to govern its left neigh|~ro ~is strategy is natural for Finnish as most iC/odifieL's are of prepositional typedeg This rule was already ini01J.cit Jn Figdeg la. (There are of cx)urse exceptions which must overrule this default strategy.</Paragraph>
    <Paragraph position="1"> For exan~gle, prepositions have their dependants, not their regents~ on their right side. ) Finnish sentences have typically SVO structure.</Paragraph>
    <Paragraph position="2"> Fig~ ib shows the l~rsing of an ordinary transitiveverb sentence &amp;quot;Pienen pojan ~\[iti lauloi hauskan laulu~l&amp;quot; (A/the small boy's r~ather sang a merry song). Parsing steps are indicated by the numbers between the words.</Paragraph>
    <Paragraph position="3"> This |roper elaborates the locality principle in de~endency parsing. First, we specify the ideas of local goverm~ent and locally governed treesdeg Then we describe a ~msic machine and its supporting software as an i~Kolementation of the locality principle for \]~rsing arbitrary locally governed trees. ~\]e parsing syste~l has been itKole/~ented for Finnish. Occasio~lly our parser invokes expensive search because no prerequisites restrict trees ( save locality ). We discuss how parsing can be speeded up into linear time if certain rmtural structural constraints are in force.</Paragraph>
  </Section>
  <Section position="3" start_page="0" end_page="275" type="metho">
    <SectionTitle>
LOCAL GOVERNMENT, LOCALLY GOVERNED TREES, AND
DEPENDENCY PARSING
</SectionTitle>
    <Paragraph position="0"> The ideological mlderpinning of local dependency parsing is to focus on adjacent word pairs and see if a binary dependency relation holds between them. The words of a sentence l~ve various attributes in our parserdeg Some of the attributes have been extracted by a Iro~l~bological preprocessor /J~ippinen and Ylilammi 1986/~ while others are tagged during the parsing process o Local C~)ver m~nt Let &lt;w I w2.. own&gt; be an ordercml list of wordsdeg We say that a ~K)rd wj locally governs another word wj if |j = i-1 or i+l anti w i R wa where R is a binary dependency relatlon such that w i is the governor (or the regent) of the pair and .wj is the de~dantdeg In other words, a word locally governs another one if they are adjacent (at the ~mt of the testing) and a dependency relation holds between them.</Paragraph>
    <Paragraph position="1">  The governor alone represents its government: once a local government has been established between two adjacent words the dependant is linked with the governor and disappears thereafter from sight. An elementary destructive processing step takes place, reducing the number of visible werds by one (shown by arrows in Fig. i).</Paragraph>
    <Paragraph position="2"> Government is transitive. If w i locally governs wj, and wj locally governs Wk, then w i governs w k. G6vernment-is also antisy~retric and irreflexive. Locally Governed Trees Due to the destructive processing step explained above, a governor gets a new neighbor immediately after a local government has been built. This new nelghbor qualifies for a local government as well. A single word may therefore locally govern a number of other words, and two initially distant words may later on establish a local government between themselves. If a word is the governor of several words simultaneously, we say that it governs a locally governed tree of depth one (LGT-I). In fact, we can view a (binary) local government as a LGT-I having just a single branch.</Paragraph>
    <Paragraph position="3"> LGT-I ' s are elementary trees. Relational trees which preserve the locality principle and can reach arbitrary depth are called locally governed trees (LGT). LGT's are defined recursively as follows: i. Any LGT-I is a LGT.</Paragraph>
    <Paragraph position="4"> ii. A tree formed by a word which locally governs LGT's is itself a LGT.</Paragraph>
    <Paragraph position="5"> \[~t &lt;w I w 2 ... Wn&gt; be a sequence of words. If there exists a LGT which governs all the words, the LGT is a parse tree of the words. Figure 1 portrays two parse trees.</Paragraph>
    <Section position="1" start_page="275" end_page="275" type="sub_section">
      <SectionTitle>
Parsing Strategy
</SectionTitle>
      <Paragraph position="0"> In the implemented parser the parsing strategy is based on the following two control principles: (i) parsing focuses first on the leftmost word (the initial word principle); (2) the parser always tries first to establish a focused word as a governor to its left neighbor and then shifts focus to the right neighbor (preferred direction principle).</Paragraph>
      <Paragraph position="1"> The resulting parsing strategy is a left-corner-up strategy. The strategy is tuned to efficiently bind prepositional attributes as dependants.</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="275" end_page="275" type="metho">
    <SectionTitle>
THE MACHINE
</SectionTitle>
    <Paragraph position="0"> We have designed and implemented a parsing system for LGT's. The underlying abstract machine has one focus register and two stacks which bold the left and the right contexts of the focused word, respectively /Nelimarkka et al. 1985/. Locality is enforced by permitting a focused word to bind dependants only from top of either stack - the left stack being preferred.</Paragraph>
    <Paragraph position="1"> The machine has also instructions for contextual testing. These tests may penetrate the stacks.</Paragraph>
  </Section>
  <Section position="5" start_page="275" end_page="276" type="metho">
    <SectionTitle>
THE SOFTWARE
</SectionTitle>
    <Paragraph position="0"> A high-level language FUNDPL (Functional Dependency Parsing Language) was designed for parsing locally governed trees /J~ppinen et al. 1986b/. A precursor w~s a more procedural language DPL /Nelimarkka et al.</Paragraph>
    <Paragraph position="1"> 1985/. The system includes a con~piler and supporting progranming environment for the developmental work /Lehtola et al. 1985/.</Paragraph>
    <Paragraph position="2"> A grammar description has three parts in FUhDPL.</Paragraph>
    <Paragraph position="3"> The initial part declares data types. The second part describes valid binary dependency relations. For each named binary relation the user specifies valid word pairs using morphological and/or lexical attribute values. The notation permits concise use of boolean operations on attributes.</Paragraph>
    <Paragraph position="4">  The third part of a grammar description defines a set of functional schemata. Functional schemata have beth declarative and procedural readings. From the declarative point of view, functional schemata define a set of valid LGT-I's. Each schema describes a regent and its possible local governmentsdeg A local government is either mandatory or optional, and an optional one may recl~ By default the surface ordering of local gover~nents is free.</Paragraph>
    <Paragraph position="5"> Sometimes stringent ordering constraints exist l~t~een local governments; sometimes it is advantageous t~ give probabilistic information about the ordering of positionally free governments. Such structural information may be written in a schema.</Paragraph>
    <Paragraph position="6"> Schemata have also procedural reading which is yet another distinguishing feature from phrase structure rules. A schema actively controls the build-~up of the LGT-I it represents. From the preoedural viewpoint a schema monitors function calls of local governments using blackboard control regime /Valkonen et al. 1987/.</Paragraph>
  </Section>
  <Section position="6" start_page="276" end_page="276" type="metho">
    <SectionTitle>
THE SEARCH PROBLEM OF PARSING ARBITRARY LGT'S
</SectionTitle>
    <Paragraph position="0"> To discover a parse tree for an arbitrary If&amp;T is a complicated search process even in a bettom-up strategy (in top-down problems would be worse)deg The basic problem is this: how does an algorit~n know on which level in the hierarchy a given word belongs to? That is, when parsing proceeds from left to right and an attempt is made to establish the right neighbor of a governor as a dependant, the link is possible only if that word is not a governor of a yet incomplete T~To Our left-corner-up strategy occasionally has to invoke coraplex search for this reason.</Paragraph>
    <Paragraph position="1"> If a language constrains the structures of its possible LGT's, LGT's become computationally much more economical devices. The problem discussed above does not arise with constituent grammars and phrase structure rules because these rules indicate hierarchy implicitly through the naming of the constituents.</Paragraph>
  </Section>
  <Section position="7" start_page="276" end_page="276" type="metho">
    <SectionTitle>
CONSTRAINED LGT'S
</SectionTitle>
    <Paragraph position="0"> Finnish is a highly inflectional, agglutinating language. Both verbs and nominals have numerous distinct surface forms which distinguish between different syntactic functions the words can have in sentences. Word forms carry, among other things, such syntactic information which in configurational languages is indicated by the precedence relation. Word order in Finnish is relatively free.</Paragraph>
    <Paragraph position="1"> The basic Finnish sentence configuration is SVO. a subject LGT is followed by a verb, an object LGT, and possible adverbial LGT's. Topicalization, wh-movement, and other movements create variations to this basic configuration.</Paragraph>
    <Paragraph position="2"> The shape of nominal LGT~s is markedly distorted.</Paragraph>
    <Paragraph position="3"> They have almost all modifiers on their left hand side forcing them to lean to the right. The most important modifiers are adjectival and genitivedeg Adjectival attributes modify the head noun iterativelyf as in the phrase (i).</Paragraph>
    <Paragraph position="4"> (i) Nuori pitk~ vieh~tt~v~ tytt6 Young tall charming girl Genitive attributes, themselves nominals~ im~dify head nominals recursively, as in the phrase (2).</Paragraph>
    <Paragraph position="5"> (2) X~tSn is~n tySnantajan auto Girl (gen) father (gen) employex (gem) car (A/the girl's father's esloloyer'S car) Other prepositional modifiers for nouns are quantifiers and demonstrative pronouns. Prepositional modifier types can be mixed (under certain restrictions) as in the phrase ( 3).</Paragraph>
    <Paragraph position="6"> (3) T~m~n nuoren vieh~tt~v~n tytb'n vanha kiero is~ This young charming girl (gen) old crooked father (This young charming girl's old crooked father) Prepesitionality of Finnish is also demonstrated by the fact that postpositions are common but preposition.3 rare. Nouns have also occasional postpositio,~ll nondnal modifiers, but these modifiers can be governed only by the maximal nominal heads of a LGT (the governors which fill the valencies of verbs) or by anolJler postpositionally modifying r~T. For example, the nominal phrase (4) (4) suuren ~niehen pieni auto talon takana big man (gen) small car (nora) house (gen) behind (a/the big man's small car behind the house) has the I~.I',. ~ shown in Fig. 2. The postpositionally n~xlifying adverbial LGT &amp;quot;talon takana&amp;quot; (behind the house) cannot modify the genitive attribute: *suuren miehen talon takana pieni auto.</Paragraph>
    <Paragraph position="7"> GonAttr auto AdvAt tr * .~-~-~ i</Paragraph>
  </Section>
  <Section position="8" start_page="276" end_page="276" type="metho">
    <SectionTitle>
AN EFFICIENT PARSING ALGORITHM FOR LGT'S
</SectionTitle>
    <Paragraph position="0"> The basic left-corner-up algorithm can be modified so that it hierarchically first builds nominal LGT' s without post0ositional modifiers, then LGT's governed by prepositions and postpesitions, then mgminal IGT's with postpositional modifying nominal LGT's, and finally the LGT governed by the finite verb. The structural constraints of LGT's prune search, and it can be p~oved that the algorithm 1~en parses unambiguous sentences in linear time. The following restrictions are assumed: i o Adjectives, quantifiers, and adverbs have nnly prepositional modifiers.</Paragraph>
    <Paragraph position="1"> ii. Nouns have postpositional modifiers only on the maximal level. On lower levels they have only prepositional modifiers.</Paragraph>
  </Section>
  <Section position="9" start_page="276" end_page="276" type="metho">
    <SectionTitle>
AMBIGUITY AND WELL-FORMEDNESS
</SectionTitle>
    <Paragraph position="0"> The modified algorithm presumes that LGT' s are unambiguous. None of the bound dependants should not qualify as a dependant to any other governor than the one chosen. Because the algorithm removes dependants after binding, it cannot cope with alternative relations.</Paragraph>
    <Paragraph position="1"> Albeit rich morphology greatly helps to make unique distinctions between different binary relations in Finnish, it leaves some residual ambiguity. The most prominent example is caused by the genitive surface case. That ~mse signals either accusative case, the object of a ~,~entence, or possession. The governor of an adverbial n~i~ also be ambiguous. The basic algorithm solves ambi9%~ity by backtracking.</Paragraph>
    <Paragraph position="2"> In their &amp;quot;pure, form beth algorithms parse only well-formed \]~T's. 'Ilhere are, however, soa~.~ well known syntactic phenomena which cannot be represented by iGT's. TG-theory postulates oertain transformations which result in long-distance dependencies. In modern GB-theory tl~se displacement operations 9~) under the general rubric &amp;quot;move-alpha&amp;quot;.</Paragraph>
    <Paragraph position="3"> For exanple, certain fronting n~vements (wh-movement and topicalization) remove an element and may transport it across clause boundaries onto a landing site in the beginning of the main sentence. A I~T which originally was governed locally becomes distant to its governor and is no more within its reach.</Paragraph>
    <Paragraph position="4"> The algorithm can be augmented to handle long-distance fronting movements. At one point the algorithm has built nominal and adverbial LGT's. The valencies for a verb are filled first locally and, if a filler cannot be found, a search is made from the fronted LGT's. The resulting LGT is not well-formed.</Paragraph>
  </Section>
class="xml-element"></Paper>