File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/00/p00-1058_intro.xml

Size: 4,954 bytes

Last Modified: 2025-10-06 14:00:52

<?xml version="1.0" standalone="yes"?>
<Paper uid="P00-1058">
  <Title>Statistical parsing with an automatically-extracted tree adjoining grammar</Title>
  <Section position="4" start_page="0" end_page="3" type="intro">
    <SectionTitle>
2 The formalism
</SectionTitle>
    <Paragraph position="0"> The formalism we use is a variant of lexicalized tree-insertion grammar (LTIG), which is in turn a restriction of LTAG (Schabes and Waters, 1995). In this variant there are three kinds of elementary tree: initial, (predicative) auxiliary, and modifier, and three composition operations: substitution, adjunction, and sister-adjunction.</Paragraph>
    <Paragraph position="1"> Auxiliary trees and adjunction are restricted as in TIG: essentially, no wrapping adjunction or anything equivalent to wrapping adjunction is allowed. Sister-adjunction is not an operation found in standard definitions of TAG, but is borrowed from D-Tree Grammar (Rambow et al., 1995). In sister-adjunction the root of a modifier tree is added as a new daughter to any other node. (Note that as it stands sister-adjunction is completely unconstrained; it will be constrained by the probability model.) We introduce this operation simply so we can derive the flat structures found in the Penn Treebank. Following (Schabes and Shieber, 1994), multiple modifier trees can be sister-adjoined at a single site, but only one auxiliary tree may be adjoined at a single node.</Paragraph>
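    <Paragraph> As a minimal sketch of sister-adjunction as described here, the following Python fragment adds the root of a modifier tree as a new daughter of a node. The Node class, the function names, and the example trees are assumptions made for illustration; they are not taken from the parser itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A tree node; `label` and `children` are illustrative names."""
    label: str
    children: List["Node"] = field(default_factory=list)


def sister_adjoin(site: Node, modifier_root: Node, i: int) -> None:
    """Insert modifier_root between the i-th and (i+1)-th children of site.

    Children are counted from 1, with imaginary children at positions 0 and
    len(children) + 1, so i may range from 0 to len(children).
    """
    if i not in range(len(site.children) + 1):
        raise ValueError("sister-adjunction position out of range")
    # list.insert(i, x) places x after the first i children.
    site.children.insert(i, modifier_root)


def bracketed(node: Node) -> str:
    """Render a tree in Penn Treebank-style bracketing."""
    if not node.children:
        return node.label
    inner = " ".join(bracketed(child) for child in node.children)
    return "(" + node.label + " " + inner + ")"


# Hypothetical example: a modifier tree sister-adjoined after the only child.
vp = Node("VP", [Node("VB", [Node("leave")])])
sister_adjoin(vp, Node("NP", [Node("NN", [Node("tomorrow")])]), 1)
print(bracketed(vp))
```

    Because the modifier root becomes a direct daughter of the site, repeated sister-adjunction yields a flat structure rather than the nested structure that ordinary adjunction would produce.</Paragraph>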
    <Paragraph position="2"> Figure 1 shows an example grammar and the derivation of the sentence "John should leave tomorrow." The derivation tree encodes this process, with each arc corresponding to a composition operation. Arcs corresponding to substitution and adjunction are labeled with the Gorn address of the substitution or adjunction site. (A Gorn address is a list of integers: the root of a tree has address ε, and the jth child of the node with address i has address i · j.) An arc corresponding to the sister-adjunction of a tree between the ith and (i+1)th children of η (allowing for two imaginary children beyond the leftmost and rightmost children) is labeled η, i.</Paragraph>
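    <Paragraph> The Gorn addressing scheme can be sketched in a few lines of Python. The (label, children) tuple encoding and the example tree are assumptions of this sketch; the empty tuple stands for the root address.

```python
from typing import Dict, List, Tuple

# A tree is a (label, children) pair; this encoding is an assumption.
Tree = Tuple[str, List["Tree"]]
Address = Tuple[int, ...]


def gorn_addresses(tree: Tree, address: Address = ()) -> Dict[Address, str]:
    """Map every Gorn address in the tree to the label found there.

    The root has the empty address, and the j-th child (counting from 1)
    of the node at address i has address i followed by j.
    """
    label, children = tree
    result = {address: label}
    for j, child in enumerate(children, start=1):
        result.update(gorn_addresses(child, address + (j,)))
    return result


# Hypothetical elementary tree, used only to show the numbering.
s = ("S", [("NP", []), ("VP", [("VB", [("leave", [])])])])
addresses = gorn_addresses(s)
for addr, label in addresses.items():
    print(".".join(str(k) for k in addr) or "(root)", label)
```

    In this example the NP substitution site receives address 1 and the VP node address 2, which is the kind of label an arc of the derivation tree would carry.</Paragraph>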
    <Paragraph position="3"> This grammar, as well as the grammar used by the parser, is lexicalized in the sense that every elementary tree has exactly one terminal node, its lexical anchor.</Paragraph>
    <Paragraph position="4"> Since sister-adjunction can be simulated by ordinary adjunction, this variant is, like TIG (and CFG), weakly context-free and O(n^3)-time parsable. Rather than coin a new acronym for this particular variant, we will simply refer to it as "TAG" and trust that no confusion will arise.</Paragraph>
    <Paragraph position="5"> The parameters of a probabilistic TAG</Paragraph>
    <Paragraph position="7"> where α ranges over initial trees, β over auxiliary trees, γ over modifier trees, and η over nodes. P_a(NONE | η) is the probability of nothing adjoining at η. (Carroll and Weir, 1997) suggest other parameterizations worth exploring as well.</Paragraph>
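    <Paragraph> In the usual formulation of probabilistic TAG, each parameter set is a distribution that sums to one. A sketch of those constraints in the notation used here follows; the subscripts i, s, and a (for initial-tree choice, substitution, and adjunction) are assumed names rather than notation confirmed by this section.

```latex
\begin{gather*}
\sum_{\alpha} P_i(\alpha) = 1 \\
\sum_{\alpha} P_s(\alpha \mid \eta) = 1 \\
\sum_{\beta} P_a(\beta \mid \eta) + P_a(\mathrm{NONE} \mid \eta) = 1
\end{gather*}
```

    The NONE term in the third constraint reserves probability mass for the event that no auxiliary tree adjoins at η.</Paragraph>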
    <Paragraph position="8"> Our variant adds another set of parameters:</Paragraph>
    <Paragraph position="10"> This is the probability of sister-adjoining γ between the ith and (i+1)th children of η (as before, allowing for two imaginary children beyond the leftmost and rightmost children).</Paragraph>
    <Paragraph position="11"> Since multiple modifier trees can adjoin at the same location, P_sa(γ) is also conditioned on a flag f which indicates whether γ is the first modifier tree (i.e., the one closest to the head) to adjoin at that location.</Paragraph>
    <Paragraph position="12"> The probability of a derivation can then be expressed as a product of the probabilities of the individual operations of the derivation.</Paragraph>
    <Paragraph position="13"> Thus the probability of the example derivation of Figure 1 would be</Paragraph>
    <Paragraph position="15"> where α(i) is the node of α with address i.</Paragraph>
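    <Paragraph> The product over operations can be sketched as follows. Every tree name, address, and probability value below is hypothetical and chosen only to make the arithmetic visible. A full computation would presumably also include factors for sites where nothing adjoins, such as P_a(NONE | η); they are left out here for brevity.

```python
import math
from typing import Dict, Hashable, Iterable


def derivation_probability(operations: Iterable[Hashable],
                           params: Dict[Hashable, float]) -> float:
    """Multiply the probabilities of the individual operations.

    Logs are summed and exponentiated so that long derivations do not
    underflow as quickly as a running product would.
    """
    return math.exp(sum(math.log(params[op]) for op in operations))


# Hypothetical parameters, keyed by (operation, tree, conditioning context).
params = {
    ("initial", "alpha_leave", ()): 0.01,
    ("subst", "alpha_John", ("alpha_leave", (1,))): 0.2,
    ("adjoin", "beta_should", ("alpha_leave", (2,))): 0.1,
    ("sister", "gamma_tomorrow", ("alpha_leave", (2,), 1, True)): 0.05,
}

# The derivation here uses each listed operation exactly once.
derivation = list(params)
prob = derivation_probability(derivation, params)
print(prob)
```

    Keying each parameter by its conditioning context (the node, and for sister-adjunction the position and the first-modifier flag) mirrors the way the distributions are conditioned in the text.</Paragraph>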
    <Paragraph position="16"> We want to obtain a maximum-likelihood estimate of these parameters, but cannot estimate them directly from the Treebank, because the sample space of PTAG is the space of TAG derivations, not the derived trees that are found in the Treebank. One approach, taken in (Hwa, 1998), is to choose some grammar general enough to parse the whole corpus and obtain a maximum-likelihood estimate by EM. Another approach, taken in (Magerman, 1995) and others for lexicalized PCFGs and (Neumann, 1998; Xia, 1999; Chen and Vijay-Shanker, 2000) for LTAGs, is to use heuristics to reconstruct the derivations, and directly estimate the PTAG parameters from the reconstructed derivations. We take this approach as well. (One could imagine combining the two approaches, using heuristics to extract a grammar but EM to estimate its parameters.)</Paragraph>
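    <Paragraph> Once derivations have been reconstructed, direct maximum-likelihood estimation of a conditional distribution reduces to relative-frequency counting. A minimal sketch, assuming each observed operation has already been reduced to a (context, outcome) pair; the event encoding and the example events are hypothetical, and no smoothing is applied.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, Iterable, Tuple

Event = Tuple[Hashable, Hashable]  # (conditioning context, outcome)


def relative_frequency(events: Iterable[Event]) -> Dict[Hashable, Dict[Hashable, float]]:
    """Estimate P(outcome | context) as count(context, outcome) / count(context)."""
    events = list(events)  # the events are traversed twice
    joint = Counter(events)
    context_totals = Counter(context for context, _ in events)
    estimates = defaultdict(dict)
    for (context, outcome), count in joint.items():
        estimates[context][outcome] = count / context_totals[context]
    return dict(estimates)


# Hypothetical adjunction events read off reconstructed derivations.
events = [
    ("VP", "beta_should"),
    ("VP", "NONE"),
    ("VP", "NONE"),
    ("S", "NONE"),
]
estimates = relative_frequency(events)
print(estimates)
```

    The same counting scheme would apply to each parameter set in turn, with the context extended by the position and flag for sister-adjunction.</Paragraph>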
  </Section>
</Paper>