<?xml version="1.0" standalone="yes"?>
<Paper uid="P99-1011">
  <Title>A Meta-Level Grammar: Redefining Synchronous TAG for Translation and Paraphrase</Title>
  <Section position="5" start_page="82" end_page="82" type="metho">
    <SectionTitle>
3 S-TAG and Paraphrase
</SectionTitle>
    <Paragraph position="0"> Syntactic paraphrase can also be described with S-TAG (Dras, 1997; Dras, forthcoming). The manner of representing paraphrase in S-TAG is similar to the translation representation described in Section 2. The reason for illustrating both is that syntactic paraphrase, because of its structural complexity, is able to illuminate the nature of the problem with S-TAG. In a specific parallel, a difficulty like that of the clitics occurs here also, for example in paraphrases such as (4).</Paragraph>
    <Paragraph position="1"> (4) a. The jacket which collected the dust was tweed.</Paragraph>
    <Paragraph position="2"> b. The jacket collected the dust. It was tweed.</Paragraph>
    <Paragraph position="3">  Tree pairs which could represent the elements in the mapping between (4a) and (4b) are given in Figure 6. It is clearly the case that the trees in the tree pair α9 are not elementary trees, in the same way that on espère que is not represented by a single elementary tree: in both cases, such single elementary trees would violate the Condition on Elementary Tree Minimality (Frank, 1992). The tree pair α9 is the one that captures the syntactic rearrangement in this paraphrase; such a tree pair will be termed the STRUCTURAL MAPPING PAIR (SMP). Taking as a basic set of trees the XTAG standard grammar of English (XTAG, 1995), the derivation tree pair for (4) would be as in Figure 7 (footnote 3). Apart from α9, each tree in Figure 6 corresponds to an elementary object-level tree, as indicated by its label; the remaining labels, indicated in bold in the meta-level derivation tree in Figure 7, correspond to the elementary object-level trees forming α9, in much the same way that on espère que is represented by a subderivation comprising an on tree substituted into an espère que tree.</Paragraph>
    <Paragraph position="4"> Note that the nodes corresponding to the left tree of the SMP form two discontinuous groups, but these discontinuous groups are clearly related. Dras (forthcoming) describes the conditions under which these discontinuous groupings are acceptable in paraphrase; these discontinuous groupings are treated as a single block with SLOTS connecting the groupings, whose fillers must be of particular types. Fundamentally, however, the structure is the same as for clitics: in one derivation tree the grouped elements are in one branch of the tree, and in the other they are in two separate branches with the possibility of an unbounded amount of intervening material, as described below in Section 4.</Paragraph>
  </Section>
  <Section position="6" start_page="82" end_page="84" type="metho">
    <SectionTitle>
4 Meta-Level Structure
</SectionTitle>
    <Paragraph position="0"> Example (5) illustrates why the paraphrase in (4) has the same difficulty as the clitic example in (3) when represented in S-TAG: because unbounded intervening material can occur when promoting arbitrarily deeply embedded relative clauses to sentence level, as indicated by Figure 8, an isomorphism is not possible between derivation trees representing paraphrases such as (4) and (5). Again, the component trees of the SMP are in bold in Figure 8.</Paragraph>
    <Paragraph position="1">  (5) a. The jacket which collected the dust which covered the floor was tweed.</Paragraph>
    <Paragraph position="2"> b. The jacket which collected the dust
    [Footnote 3] Node labels, the object-level tree names, are given according to the XTAG standard: see Appendix B of XTAG (1995). This is done so that the component trees of the aggregate α9 and their types are obvious. The lexical item to which each is bound is given in square brackets, to make the trees, and the correspondence between, for example, Figure 6 and Figure 7, clearer.</Paragraph>
    <Paragraph position="4"> The paraphrase in (4) and in Figures 6 and 7, and other paraphrase examples, strongly suggest that these more complex mappings are not an aberration that can be dealt with by patching measures such as bounded subderivation. It is clear that the meta level is fundamentally not just for establishing a one-to-one onto mapping between nodes; rather, it is also about defining structures representing, for example, the SMP at this meta level: in an isomorphism between trees in Figure 8, it is necessary to regard the SMP components of each tree as a unitary substructure and map them to each other.</Paragraph>
    <Paragraph position="5"> [Footnote 4] The referring expression that is the subject of this second sentence has changed from it in (4) to the dust so the antecedent is clear. Ensuring it is appropriately coreferent, by using two occurrences of the same diacritic in the same tree, necessitates a change in the properties of the formalism unrelated to the one discussed in this paper; see Dras (forthcoming). Assume, for the purpose of this example, that the referring expression is fixed and given, as is the case with it, rather than determined by coindexed diacritics.</Paragraph>
    <Paragraph position="6"> The discontinuous groupings should form these substructures regardless of intervening material, and this is suggestive of TAG's extended domain of locality (EDL).</Paragraph>
    <Paragraph position="7"> In the TAG definition, the derivation trees are context free (Weir, 1988), and can be expressed by a CFG. The isomorphism in the S-TAG definition of Shieber (1994) reflects this, by effectively adopting the single-level domain of locality (extended slightly in cases of bounded subderivation, but still effectively a single level), in the way that context free trees are fundamentally made from single level components and grown by concatenation of these single levels.</Paragraph>
    <Paragraph position="8"> This is what causes the isomorphism requirement to fail: the inability to express substructures at the meta level in order to map between them, rather than just mapping between (effectively) single nodes.</Paragraph>
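    <Paragraph>As an illustration of the isomorphism requirement, the following minimal Python sketch tests whether two derivation trees have the same shape, that is, whether some bijection of nodes preserves the parent-child relation. This is only a necessary condition for the S-TAG isomorphism of Shieber (1994), since links and labels are ignored. The Node class, the function names and the toy trees are hypothetical and do not reproduce the trees of Figures 7 or 8.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Derivation-tree node; the label is ignored by the shape comparison."""
    label: str
    children: list["Node"] = field(default_factory=list)


def shape(node: Node) -> str:
    """Canonical encoding of an unordered tree: sorted child encodings in brackets."""
    return "(" + "".join(sorted(shape(child) for child in node.children)) + ")"


def same_shape(left: Node, right: Node) -> bool:
    """True when some bijection of nodes preserves the parent-child relation."""
    return shape(left) == shape(right)


# Hypothetical toy trees (assumed for illustration only).
# In tree_one the grouped elements g1 and g2 lie on a single branch.
tree_one = Node("root", [Node("g1", [Node("g2")]), Node("x")])
# In tree_two the grouped elements sit on two separate branches.
tree_two = Node("root", [Node("g1"), Node("g2"), Node("x")])

print(same_shape(tree_one, tree_one))  # True
print(same_shape(tree_one, tree_two))  # False
```

The second comparison fails because the grouped elements lie on one branch in the first tree and on two branches in the second, which is the configuration described for the SMP components.</Paragraph>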
    <Paragraph position="9"> To solve the problem with isomorphism, a meta-level grammar can be defined to specify the necessary substructures prior to mapping, with minimality conditions on what can be considered acceptable discontinuity. Specifically, in this case, a TAG meta-level grammar can be defined, rather than the implicit CFG, because this captures the EDL well. The TAG yield function of Weir (1988) can then be applied to these derivation trees to get derived trees. This, of course, raises questions about effects on generative capacity and other properties; these are dealt with in Section 5.</Paragraph>
    <Paragraph position="10"> A procedure for automatically constructing a TAG meta-grammar is given in Construction 1. The basic idea is that where the node bijection is still appropriate, the grammar retains its context free nature (by using single-level TAG trees composed by substitution, mimicking CFG tree concatenation), but where EDL is required, multi-level TAG initial trees are defined, with TAG auxiliary trees for describing the intervening material. These meta-level trees are then mapped appropriately; this corresponds to a bijection of nodes at the meta-meta level. For (5), the meta-level grammar for the left projection then looks as in Figure 9, and for the right projection as in Figure 10.</Paragraph>
    <Paragraph position="11"> Figure 11 contains the meta-meta-level trees, the tree pair that is the derivation of the meta level, where the mapping is a bijection between nodes. Adding unbounded material would then just be reflected in the meta-meta-level as a list of β nodes depending from the β15/β18 nodes in these trees.</Paragraph>
    <Paragraph position="12"> The question may be asked: why isn't it the case that the same effect will occur at the meta-meta level that required the meta-grammar in the first place, leading perhaps to an infinite (and useless) sequence? The intuition is that it is the meta level, rather than anywhere 'higher', which is fundamentally the place to specify structure: the object level specifies the trees, and the meta level specifies the grouping or structure of these trees. Then the mapping takes place on these structures, rather than the object-level trees; hence the need for a grammar at the meta level but not beyond.</Paragraph>
    <Paragraph position="13"> Construction 1. To build a TAG metagrammar:
    1. An initial tree in the metagrammar is formed for each part of the derivation tree corresponding to the substructure representing an SMP, including the slots so that a contiguous tree is formed. Any node that links these parts of the derivation tree to other subtrees in the derivation tree is also included, and becomes a substitution node in the metagrammar tree.</Paragraph>
    <Paragraph position="14"> 2. Auxiliary trees are formed corresponding to the parts of the derivation trees that are slot fillers along with the nodes in the discontinuous regions adjacent to the slots; one contiguous auxiliary tree is formed for each bounded sequence of slot fillers within each substructure. These trees also satisfy certain minimality conditions.</Paragraph>
    <Paragraph position="15"> 3. The remaining metagrammar trees then come from splitting the derivation tree into single-level trees, with the nodes on these single-level trees in the metagrammar marked for substitution if the corresponding nodes in the derivation tree have subtrees.</Paragraph>
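    <Paragraph>Step 3 of Construction 1 can be sketched in Python as follows. The sketch assumes that the input derivation tree contains no SMP substructure, so that Steps 1 and 2 do not apply. The Node class, the function single_level_trees, the "!" substitution mark and the labels a1, a2, b3 and b4 are hypothetical illustrations, not part of the construction as stated.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Derivation-tree node labelled with an object-level tree name."""
    label: str
    children: list["Node"] = field(default_factory=list)


def single_level_trees(root: Node) -> list[tuple[str, list[str]]]:
    """Split a derivation tree into single-level trees (Step 3 only).

    Each result is (parent label, child labels). A child label carries a
    trailing "!" (an ad hoc substitution mark) when the corresponding node
    has subtrees of its own in the derivation tree.
    """
    result = []
    queue = [root]
    while queue:
        node = queue.pop(0)  # breadth-first traversal
        if not node.children:
            continue  # leaves do not head a single-level tree
        marked = [c.label + ("!" if c.children else "") for c in node.children]
        result.append((node.label, marked))
        queue.extend(node.children)
    return result


# Hypothetical derivation tree with made-up labels.
example = Node("a1", [Node("a2", [Node("b3")]), Node("b4")])
print(single_level_trees(example))
```

Composing the resulting single-level trees by substitution at the marked nodes rebuilds the original derivation tree, which is the sense in which this part of the metagrammar mimics CFG tree concatenation.</Paragraph>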
    <Paragraph position="16"> The minimality conditions in Step 2 of Construction 1 are in keeping with the idea of minimality elsewhere in TAG (for example, Frank, 1992). The key condition is that meta-level auxiliary trees are rooted in α-labelled nodes, and have only β-labelled nodes along the spine.</Paragraph>
    <Paragraph position="17"> The intuition here is that slots (the nodes which meta-level auxiliary trees adjoin into) must be α-labelled: β-labelled trees would not need slots, as the substructure could instead be continuous and the β-labelled trees would just adjoin in. So the meta-level auxiliary trees are rooted in α-labelled trees; but they have only β-labelled trees in the spine, as they aim to represent the minimal amount of recursive material. Notwithstanding these conditions, the construction is quite straightforward.</Paragraph>
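    <Paragraph>The key minimality condition can be checked mechanically. The following is a minimal Python sketch, assuming that the spine of a meta-level auxiliary tree is given as a list of labels from root to foot, that α-labelled and β-labelled nodes are written with the prefixes alpha and beta, and that the foot carries the same label as the root, as is usual for TAG auxiliary trees. The function name and the labels are hypothetical.

```python
def satisfies_minimality(spine: list[str]) -> bool:
    """Check the key condition on a root-to-foot spine.

    The root must be alpha-labelled, the foot must repeat the root label
    (assumed here, following the usual TAG convention), and every node
    strictly between root and foot must be beta-labelled.
    """
    if len(spine) in (0, 1):
        return False  # a spine needs at least a root and a foot
    root, interior, foot = spine[0], spine[1:-1], spine[-1]
    return (
        root.startswith("alpha")
        and foot == root
        and all(label.startswith("beta") for label in interior)
    )


# Hypothetical spines with made-up labels.
print(satisfies_minimality(["alpha_1", "beta_2", "beta_3", "alpha_1"]))  # True
print(satisfies_minimality(["alpha_1", "alpha_2", "alpha_1"]))           # False
```

The second spine is rejected because an α-labelled node occurs strictly between root and foot.</Paragraph>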
  </Section>
  <Section position="7" start_page="84" end_page="84" type="metho">
    <SectionTitle>
5 Generative Capacity
</SectionTitle>
    <Paragraph position="0"> Weir (1988) showed that there is an infinite progression of TAG-related formalisms, in generative capacity between CFGs and indexed grammars. A formalism $\mathcal{F}_i$ in the progression is defined by applying the TAG yield function to a derivation tree defined by a grammar formalism $\mathcal{F}_{i-1}$; the generative capacity of $\mathcal{F}_i$ is a superset of that of $\mathcal{F}_{i-1}$. Thus using a TAG meta-grammar, as described in Section 4, would suggest that the generative capacity of the object-level formalism would necessarily have been increased over that of TAG.</Paragraph>
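    <Paragraph>The progression can be written out as follows. This is a sketch in notation assumed here rather than taken from Weir (1988): D(G) is the derivation tree set of a grammar G, T(G) its derived tree set, and the calligraphic L of a formalism is the class of string languages it generates.

```latex
% Assumed notation: D(G) derivation trees, T(G) derived trees,
% \mathcal{L}(\mathcal{F}) class of string languages of formalism \mathcal{F}.
% \text requires the amsmath package.
\[
  T(G) = \{\, \mathrm{yield}_{\mathrm{TAG}}(t) : t \in D(G) \,\}
  \quad \text{for } G \in \mathcal{F}_i,
  \ \text{where } D(G) \text{ is defined by a grammar of } \mathcal{F}_{i-1},
\]
\[
  \mathcal{L}(\mathcal{F}_{i-1}) \subseteq \mathcal{L}(\mathcal{F}_i).
\]
```

On this reading, a grammar whose derivation trees are defined by a CFG is a TAG, while a grammar whose derivation trees are defined by an unrestricted TAG lies one step further along the progression.</Paragraph>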
    <Paragraph position="1"> However, there is a regular form for TAGs (Rogers, 1994), such that the trees of TAGs in this regular form are local sets; that is, they are context free. The meta-level TAG built by Construction 1 with the appropriate conditions on slots is in this regular form. A proof of this is in Dras (forthcoming); a sketch is as follows. If adjunction may not occur along the spine of another auxiliary tree, the grammar is in regular form. This kind of adjunction does not occur under Construction 1 because all meta-level auxiliary trees are rooted in α-labelled trees (object-level initial trees), while their spines consist only of β-labelled trees (object-level auxiliary trees).</Paragraph>
    <Paragraph position="2"> Since the meta-level grammar is context free, despite being expressed using a TAG grammar, this means that the object-level grammar is still a TAG.</Paragraph>
  </Section>
</Paper>