<?xml version="1.0" standalone="yes"?>
<Paper uid="J99-2003">
  <Title>Tree Adjoining Grammars in a Fragment of the Lambek Calculus</Title>
  <Section position="4" start_page="210" end_page="218" type="intro">
    <SectionTitle>
X
</SectionTitle>
    <Paragraph position="0"> ter labeled X, i.e., -I cannot be part of a tree. This condition is in no way an important constraint, as a grammar can always be transformed to conform to it by substituting a unique node X for the partial tree. However, our logical representation makes use of a trick based on such trees: we replace nodes marked adjoinable by such partial trees (there is no mark at all in our logical representation). We also suppose that the type of each tree is unambiguous: an initial tree has no leaf with the same label as the root node, and an auxiliary tree has exactly one leaf with the same label as the root node.</Paragraph>
    <Paragraph position="1"> To conform with the literature, we will use $\alpha$ to refer to an initial tree, $\beta$ to refer to an auxiliary tree, and $\gamma$ to refer to some derived tree. Examples of initial and auxiliary trees are given in Figure 1. Two TAGs are defined: $G_1 = (\{S\}, \{a, b, c, d, \epsilon\}, S, \{\alpha_1\}, \{\beta_1\})$ ($\epsilon$ is the empty word) and $G_2 = (\{S, VP, NP, N\}, \{\textit{the}, \textit{man}, \textit{walks}\}, S, \{\alpha_2, \alpha_3, \alpha_4\}, \emptyset)$.</Paragraph>
    <Paragraph position="2"> The substitution operation is defined as usual: a nonterminal leaf of a tree may be expanded with a tree whose root node has the same label. We follow a conventional notation: leaves that accept substitution are marked with a down arrow $\downarrow$. This is not to be interpreted as a restriction on substitution, but only as a visual indication of what remains to be substituted to get a complete sentence. The adjunction operation is slightly more complicated. It requires a derived tree with a nonterminal node, say X, possibly internal and not marked NA, and an auxiliary tree with root node X.</Paragraph>
    <Paragraph position="3"> The operation consists in:  * finally, inserting the excised subtree at the foot node (hence labeled X and marked with a star $*$) in the auxiliary tree.</Paragraph>
    <Paragraph position="4"> Examples of these operations are given in Figure 2. To clearly show the adjunction operation, the links of the adjoined tree $\beta_1$ are represented by dashed lines in the derived trees $\gamma_3$ and $\gamma_4$. Obviously, there is only one kind of link. We write $\gamma_1 \Rightarrow_G \gamma_2$ when $\gamma_2$ is the result of an adjunction or a substitution of an elementary tree of a TAG $G$ on the derived tree $\gamma_1$; $\Rightarrow_G^*$ is the reflexive, transitive closure of $\Rightarrow_G$. The set $\{\gamma \mid \exists \alpha \in G \text{ and } \alpha \Rightarrow_G^* \gamma\}$ is represented by $T(G)$. The language $L(G)$ generated by a TAG $G$ is the set of strings, i.e., sequences of leaves of trees in $T(G)$ whose leaves are labeled only with terminal symbols and whose root is the start symbol. Hence, $L(G_1) = \{a^n b^n c^n d^n \mid n &gt; 0\}$ and $L(G_2) = \{\textit{the man walks}\}$.</Paragraph>
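    <Paragraph> The adjunction operation described above can be sketched in a few lines of Python. This is only an illustration and not part of the original article: Figure 1 is not reproduced here, so the two elementary trees below are an assumption (the usual pair of trees for this language), and the names Node, plug_foot, adjoin, and tree_yield, as well as the spelling eps for the empty word, are our own.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)
    na: bool = False     # True if adjunction is forbidden at this node (NA)
    foot: bool = False   # True for the foot node of an auxiliary tree


def copy(t):
    return Node(t.label, [copy(c) for c in t.children], t.na, t.foot)


def plug_foot(aux, subtree):
    """Copy aux, hanging the children of the excised subtree under its foot node."""
    if aux.foot:
        # The foot and the excised root merge; the NA mark of the foot is kept.
        return Node(aux.label, [copy(c) for c in subtree.children], aux.na, False)
    return Node(aux.label, [plug_foot(c, subtree) for c in aux.children], aux.na, False)


def adjoin(tree, path, aux):
    """Adjoin aux at the node of tree reached by following the child indices in path."""
    if not path:
        if tree.na or tree.label != aux.label:
            raise ValueError("adjunction is not allowed at this node")
        return plug_foot(aux, tree)
    kids = list(tree.children)
    kids[path[0]] = adjoin(kids[path[0]], path[1:], aux)
    return Node(tree.label, kids, tree.na, tree.foot)


def tree_yield(t):
    """Left-to-right sequence of leaf labels, with eps read as the empty word."""
    if not t.children:
        return "" if t.label == "eps" else t.label
    return "".join(tree_yield(c) for c in t.children)


# Assumed elementary trees (hypothetical reconstruction of Figure 1).
alpha1 = Node("S", [Node("eps")])
beta1 = Node("S", [Node("a"),
                   Node("S", [Node("b"), Node("S", na=True, foot=True), Node("c")]),
                   Node("d")], na=True)

gamma_a = adjoin(alpha1, [], beta1)     # adjoin at the root of alpha1
gamma_b = adjoin(gamma_a, [1], beta1)   # adjoin again at the inner S node
print(tree_yield(gamma_a), tree_yield(gamma_b))
```

    Under these assumptions, each adjunction at the single remaining adjoinable S node adds one a, b, c, and d in the right places, which is how the counting language above arises.</Paragraph>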
    <Paragraph position="5">  can be used. The first one (Figure 4) is a straightforward use of a Lambek-style parsing, given the two implications and a set of proper axioms corresponding to the words.</Paragraph>
    <Paragraph position="6"> The two other proofs do not use proper axioms at all: rules labeled lex are provable sequents; as these sequents are obviously provable, we omit their proof trees. The second proof (Figure 5) is in the same spirit as the first. However, for this second proof, descriptions of lexical items are included in the sequents. At the same time, it can easily be compared to the third proof: in the second proof, the structural information is located at the head of each structure as one formula; in the third proof, one formula represents a syntactic tree of level 1. The third proof (Figure 6) interprets the Lambek grammar in a derivation style: we only need one implication $\multimapinv$ and the connective times ($\otimes$). The proofs use cuts: they can be eliminated using the cut elimination theorem, but we think the cuts help in understanding the process. The following sections include other examples and emphasize the usefulness of noncommutative linear logic in the linguistic domain.</Paragraph>
    <Paragraph position="7"> A natural way to extend Lambek calculus consists in embedding it in a classical system, in the sense that the connectives "and" and "or" are dual. Indeed, LC is an "intuitionistic" system, as there can be only one conclusion in the sequents; this is not the case with noncommutative linear logic. Allowing multiple conclusions may give valuable benefits from a linguistic point of view, but in this paper we will only consider the geometrical representation available for such a system, i.e., proofnets. In the appendix, we give a brief description of linear logic, and the relations between classical linear and noncommutative linear logics. We hope this will help readers to understand the overall framework.</Paragraph>
    <Paragraph position="8"> 4. The Calculus $\mathcal{A}$ (A Fragment of LC). The formalization of TAG in LC relies mainly on a logical presentation of the two operations, substitution and adjunction, together with a correspondence between proofs and trees. As already shown in the previous section, the substitution operation is nothing but the application of the cut rule restricted to atomic formulas, which we call the atomic cut rule. Interpreting the adjunction operation is really the main difficulty. The adjunction results from two atomic cut rules between the sequent corresponding to the adjunction tree and two suitable sequents corresponding to two subparts of the  Abrusci, Fouqueré, and Vauzeilles Tree Adjoining Grammars</Paragraph>
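    <Paragraph position="10"> $S \multimapinv NP \otimes VP,\ NP \multimapinv \textit{John},\ \textit{John},\ VP \multimapinv V \otimes NP \otimes NP,\ V \multimapinv \textit{gives},\ \textit{gives},\ NP \multimapinv \textit{Mary},\ \textit{Mary},\ NP \multimapinv Det \otimes N,\ Det \multimapinv \textit{a},\ \textit{a},\ N \multimapinv \textit{book},\ \textit{book} \vdash S$. Figure 6: Proof of "John gives Mary a book": one implication and times.</Paragraph>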
    <Paragraph position="11"> This set of trees may be viewed as a subset of the closure $T(\mathcal{G}_2)$ under substitution (possibly with the declaration of adjunction nodes) of the following set of trees of level 1. [Figure omitted: grammar $G_2$, with node labels $a$, $S$, $d$, $b$, $S_{NA}$, $c$.] Note that the result of the adjunction of the second tree of $G_2$ on itself is exactly the result of substitutions on trees of $\mathcal{G}_2$. However, it is obvious that trees resulting from substitution operations on $\mathcal{G}_2$ do not always correspond to results of adjunction operations on $G_2$.</Paragraph>
    <Paragraph position="12"> We logically represent the set of trees $T(\mathcal{G}_2)$ as (the set of provable theorems of) a calculus $\mathcal{A}(\mathcal{G}_2)$: the formulas are built with the alphabet $\{\epsilon, a, b, c, d, S\}$ and the set of connectives $\{\otimes, \multimapinv\}$; the sequent calculus consists of the axioms $s \vdash s$ and the rules (in both axioms and rules, $s$ is a propositional letter):</Paragraph>
    <Paragraph position="14"> The introduction of a left implication ($\multimapinv$) corresponds to the building of a partial tree. Such introductions are then restricted either to the formalization of the trees of the grammar (the first three rules correspond exactly to the trees of $\mathcal{G}_2$), or to the formalization of adjunction nodes (the formula $s \multimapinv s$ "marks" $s$ as being an adjunction node, i.e., the adjunction rule may be applied only on this kind of node, as will become clear below).</Paragraph>
    <Paragraph position="15"> The grammar $G_2$ can then be logically represented as a subset $M(G_2)$ of the set of provable sequents of the calculus $\mathcal{A}(\mathcal{G}_2)$: $M(G_2) = \{\, S \multimapinv a \otimes S \otimes d,\ a,\ S \multimapinv S,\ S \multimapinv b \otimes S \otimes c,\ b,\ S,\ c,\ d \vdash S;\quad S \multimapinv S,\ S \multimapinv \epsilon,\ \epsilon \vdash S \,\}$. In AB-grammars (Bar-Hillel 1953), only one implication is used, without any "and" connective. The grammar would be represented in AB-grammars as two provable sequents (note that "daughters of a node" are explicitly ordered): $((S \multimapinv d) \multimapinv S) \multimapinv a,\ a,\ S \multimapinv S,\ ((S \multimapinv c) \multimapinv S) \multimapinv b,\ b,\ S,\ c,\ d \vdash S$ and $S \multimapinv S,\ S \vdash S$. We will prove later that, besides the cut rule, there exists another derived rule for the calculus $\mathcal{A}(\mathcal{G}_2)$ (and in fact for each calculus of this kind) mimicking the adjunction operation.</Paragraph>
    <Paragraph position="16"> Reducing the calculus, then, to a closure of the substitution and adjunction rules on $M(G_2)$, we get exactly the logical representations of the set of trees under the TAG grammar $G_2$. Figure 7: Summary of the logical interpretation of the TAG formalism.</Paragraph>
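    <Paragraph> As an illustration that is not part of the original article, the passage from a tree to the left side of its sequent can be sketched as follows. The sketch assumes that formulas are listed in prefix order (mark of an adjunction node, then the formula of the depth-1 subtree, then the daughters from left to right), which is the order one reads off the two sequents of $M$ given above. The function names, the tuple encoding of trees, and the ASCII spellings o-, *, |- and eps are our own conventions.

```python
# A tree is (label, children, adjoinable); a leaf has an empty list of children.
# A formula is a letter (str) or a pair (head, daughters) for head o- d1 * ... * dn.

def seq(tree):
    """Return (left, root): the sequent left |- root read off the tree in prefix order."""
    left = []

    def visit(node):
        label, children, adjoinable = node
        if adjoinable:
            left.append((label, (label,)))       # the mark label o- label
        if children:
            daughters = tuple(child[0] for child in children)
            left.append((label, daughters))      # one formula per subtree of depth 1
            for child in children:
                visit(child)
        else:
            left.append(label)                   # one letter per leaf

    visit(tree)
    return left, tree[0]


def show(left, root):
    """ASCII rendering of a sequent."""
    def fmt(f):
        return f if isinstance(f, str) else f[0] + " o- " + " * ".join(f[1])
    return ", ".join(fmt(f) for f in left) + " |- " + root


def leaf(label):
    return (label, [], False)


# Assumed encodings of the two elementary trees behind the sequents of M.
beta = ("S", [leaf("a"),
              ("S", [leaf("b"), leaf("S"), leaf("c")], True),
              leaf("d")], False)
alpha = ("S", [leaf("eps")], True)

print(show(*seq(beta)))
print(show(*seq(alpha)))
```

    With these encodings, the two printed lines are ASCII renderings of the two sequents of $M$ listed in the text.</Paragraph>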
    <Paragraph position="17"> The adjunction rule must be logically justified: there must be only one way to combine the pieces (i.e., provable sequents corresponding to trees of level 1), given the substitution node, such that the order of the elements is as requested.</Paragraph>
    <Paragraph position="18"> To prove this, we show that for a suitable fragment of LC there is a unique way to decompose a sequent $\Gamma, a \multimapinv A, \Delta \vdash B$ into two sequents $\Gamma, a, \Delta_2 \vdash B$ and $\Delta_1 \vdash A$, where $\Delta = \Delta_1, \Delta_2$. In this section, we clarify the calculus $\mathcal{A}$ used to interpret TAG: it includes a cut rule and an adjunction rule that mimic the grammatical operations. As pointed out previously, these two rules are correct with respect to logic. We give the basic properties satisfied by this calculus $\mathcal{A}$. In order to represent TAG in LC, we first construct the set $\mathcal{G}$ of subtrees of depth 1 of trees appearing in a TAG grammar $G'$. The TAG grammar $G'$ is then a subset of the closure $T(\mathcal{G})$ of the set $\mathcal{G}$ under substitution (indicated by subst) and the declaration of nodes where adjunction is not available (indicated by NA). The interpretation of elements of $\mathcal{G}$ as provable sequents of $\mathcal{A}$ is straightforward. This leads to a calculus $\mathcal{A}(\mathcal{G})$ where the operations are restricted with respect to $\mathcal{G}$. The TAG grammar $G'$ is then in correspondence with a subset $M(G')$ of $\mathcal{A}(\mathcal{G})$, and we prove the equivalence between the language $CL(G')$ generated by $G'$ and the set of sequents $CL(M(G'))$ obtained by closure on $M(G')$ by the cut and adjunction rules (we use $M$ instead of $M(G')$ whenever there is no ambiguity). Proofs of propositions are delayed to the appendix. The various components of our approach are summarized in Figure 7.</Paragraph>
    <Paragraph position="19"> Consider the following fragment $\mathcal{A}$ of LC: Definition (The Calculus $\mathcal{A}$). Alphabet of $\mathcal{A}$: propositional letters $a, b, \ldots$; connectives $\otimes$, $\multimapinv$.</Paragraph>
    <Paragraph position="20"> Formulas: usual definition. $A$ is a simple $\otimes$-formula iff $A$ is a propositional letter or $A$ is a formula $b_1 \otimes \cdots \otimes b_n$ where $b_1, \ldots, b_n$ are propositional letters. $B$ is a $\multimapinv$-formula iff $B = a \multimapinv A$ where $a$ is a propositional letter and $A$ is a simple $\otimes$-formula.</Paragraph>
    <Paragraph position="21"> Sequents: $\Gamma \vdash A$, where $\Gamma$ is a finite sequence of formulas and $A$ is a formula.</Paragraph>
    <Paragraph position="22"> * Sequent calculus: Axiom: $a \vdash a$. Rules: $$\frac{\Gamma \vdash A \qquad \Delta \vdash B}{\Gamma, \Delta \vdash A \otimes B}\ (\otimes) \qquad \frac{\Gamma \vdash A \qquad \Gamma_1, C, \Gamma_2 \vdash B}{\Gamma_1, C \multimapinv A, \Gamma, \Gamma_2 \vdash B}\ (\multimapinv)$$ In the following, we only consider sequents such that the formulas on the left side are either propositional letters or $\multimapinv$-formulas. So, in the rule introducing $\multimapinv$, $C$ stands for a propositional letter. As we have only one propositional letter before $\multimapinv$, we model trees: $C$ is the (unique) mother and the $\otimes$-formula $A$ is the sequence of its daughters. Proposition: Main properties of calculus $\mathcal{A}$ (proofs in the appendix).  1. If $\Gamma \vdash A \otimes B$ is provable in $\mathcal{A}$, then $A$ and B are simple $\otimes$-formulas, and there is a unique pair $(\Gamma_1, \Gamma_2)$ s.t. $\Gamma = \Gamma_1, \Gamma_2$ and both the sequents $\Gamma_1 \vdash A$ and $\Gamma_2 \vdash B$ are provable in $\mathcal{A}$.</Paragraph>
    <Paragraph position="23">  2. If $\Gamma, a \multimapinv A, \Delta \vdash B$ is provable in $\mathcal{A}$, then</Paragraph>
    <Paragraph position="25"> * $A$ and $B$ are simple $\otimes$-formulas; * there is a unique pair $(\Delta_1, \Delta_2)$ s.t. $\Delta = \Delta_1, \Delta_2$ and both the sequents $\Delta_1 \vdash A$ and $\Gamma, a, \Delta_2 \vdash B$ are provable in $\mathcal{A}$.</Paragraph>
    <Paragraph position="26"> Such a pair $(\Delta_1, \Delta_2)$ will be called "the splitting pair in $\Gamma, a \multimapinv A, \Delta \vdash B$ for $\Delta$." Note that this pair can be computed easily: the first element $\Delta_1$ of the splitting pair must satisfy a counting condition on each propositional letter occurring in it (see the appendix).</Paragraph>
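    <Paragraph position="27"> The calculus $\mathcal{A}$ is closed under the atomic cut rule (which we also call the substitution rule) $$\frac{\Gamma \vdash a \qquad \Delta_1, a, \Delta_2 \vdash A}{\Delta_1, \Gamma, \Delta_2 \vdash A}\ (\mathrm{cut})$$ i.e., if the sequents $\Gamma \vdash a$ and $\Delta_1, a, \Delta_2 \vdash A$ are provable in $\mathcal{A}$, then the sequent $\Delta_1, \Gamma, \Delta_2 \vdash A$ is also provable in $\mathcal{A}$.</Paragraph>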
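    <Paragraph position="28"> The calculus $\mathcal{A}$ is closed under the adjoining rule $$\frac{\Gamma_1, a, \Gamma_2 \vdash a \qquad \Lambda, a \multimapinv a, \Delta \vdash b}{\Lambda, \Gamma_1, \Delta_1, \Gamma_2, \Delta_2 \vdash b}\ (\mathrm{adj})$$ where $(\Delta_1, \Delta_2)$ is the splitting pair of $\Delta$ in $\Lambda, a \multimapinv a, \Delta \vdash b$. Note that $\Delta_1$ and $\Delta_2$ are uniquely defined from the premises, so the previous deduction is really a logical rule.</Paragraph>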
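    <Paragraph> The two rules can be sketched on left sides of sequents represented as Python lists. This is a hedged illustration, not the article's algorithm: it assumes that both premises are provable and that formulas appear in prefix order, so that the first element of the splitting pair is the block of formulas describing one complete tree. The counting condition of the appendix is not reproduced here, and the helper names (term_end, cut, adj, words) and the spelling eps are hypothetical.

```python
# A formula is a letter (str) or a pair (head, daughters) for head o- d1 * ... * dn.

def term_end(left, i):
    """Index just past the block of formulas that starts at left[i] and describes one tree."""
    formula = left[i]
    j = i + 1
    if not isinstance(formula, str):
        for _ in formula[1]:             # one block per daughter
            j = term_end(left, j)
    return j


def cut(gamma, a, left, i):
    """Atomic cut: from gamma |- a and left |- A with left[i] == a, build the new left side."""
    if left[i] != a:
        raise ValueError("no letter " + a + " at this position")
    return left[:i] + gamma + left[i + 1:]


def adj(aux, foot, a, left, mark):
    """adj rule: aux |- a has the letter a at index foot; left |- b has a o- a at index mark."""
    if aux[foot] != a or left[mark] != (a, (a,)):
        raise ValueError("premises do not have the required shape")
    gamma1, gamma2 = aux[:foot], aux[foot + 1:]
    lam, delta = left[:mark], left[mark + 1:]
    k = term_end(delta, 0)               # delta[:k] is the first element of the splitting pair
    return lam + gamma1 + delta[:k] + gamma2 + delta[k:]


def words(left):
    """Letters of the left side in order, with eps read as the empty word."""
    return "".join(f for f in left if isinstance(f, str) and f != "eps")


beta = [("S", ("a", "S", "d")), "a", ("S", ("S",)), ("S", ("b", "S", "c")), "b", "S", "c", "d"]
alpha = [("S", ("S",)), ("S", ("eps",)), "eps"]

once = adj(beta, 5, "S", alpha, 0)
twice = adj(beta, 5, "S", once, 2)
print(words(once), words(twice))
```

    In this run, the mark $S \multimapinv S$ of the second premise is consumed by each application of adj, and the letters of the two derived left sides spell abcd and aabbccdd.</Paragraph>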
    <Paragraph position="29">  Computational Linguistics Volume 25, Number 2 the closure of $\mathcal{G}$ under the rules: substitution (with or without the declaration of a new, possibly internal, point on which the adjoining operation may be performed) and the adjoining operation.</Paragraph>
    <Paragraph position="30"> $\mathcal{A}(\mathcal{G})$ is the calculus obtained from $\mathcal{A}$ as follows: propositional letters are exactly all the labels of the trees in $\mathcal{G}$, and the rule ($\multimapinv$) is restricted as follows:</Paragraph>
    <Paragraph position="32"> where $A$, $B$ are simple $\otimes$-formulas of the language of $\mathcal{A}(\mathcal{G})$, $a$ is a propositional letter of the language of $\mathcal{A}(\mathcal{G})$ and one of the following  The following propositions state the correspondence between sequents and trees. The first two provide a precise translation between the two notions. Basically, a sequent $\Gamma \vdash a$ (in the previous language) is the logical equivalent of a tree with root $a$, and there is exactly one formula in $\Gamma$ for each leaf, for each subtree (of depth 1), for each adjunction node, and nothing else. Seq() (respectively, Tree()) associates a sequent (respectively, a tree) to each tree (respectively, each sequent), and we prove the two are converse. The last three propositions are properties concerning the logical counterpart of a TAG grammar. The last one is in fact the most important: the closure under (logical) adjunction and substitution of the set of sequents corresponding to a set of elementary trees is exactly the set of sequents corresponding to the closure under (grammatical) adjunction and substitution of this set of elementary trees. In other words, the logical calculus (the restricted logical calculus we defined above) and the grammatical calculus (the TAG calculus) coincide.</Paragraph>
    <Paragraph position="33"> Proposition: Main properties of calculus $\mathcal{A}(\mathcal{G})$ (proofs in the appendix). Properties 1-4 of $\mathcal{A}$ are also properties of $\mathcal{A}(\mathcal{G})$. Moreover, the following properties hold for $\mathcal{A}(\mathcal{G})$: To $T \in T(\mathcal{G})$, we associate a sequent $Seq(T)$ of $\mathcal{A}(\mathcal{G})$ s.t.</Paragraph>
    <Paragraph position="34"> -- if $a$ is the root of $T$, and the terminal points of $T$ (ordered from left to right) are $a_1, \ldots, a_m$, then $Seq(T)$ is $\Gamma \vdash a$, where the sequence of all the propositional variables occurring in $\Gamma$ is $a_1, \ldots, a_m$ and there is a formula $c \multimapinv c$ in $\Gamma$ iff $c$ is a possibly internal point of $T$ on which the adjoining operation may be performed;  -- $Seq(T)$ is provable in $\mathcal{A}(\mathcal{G})$.</Paragraph>
    <Paragraph position="35"> * To every provable sequent $\Gamma \vdash A$ in $\mathcal{A}(\mathcal{G})$, we associate $Tree(\Gamma \vdash A)$ s.t. -- if $A$ is a propositional letter, then $Tree(\Gamma \vdash A) \in T(\mathcal{G})$, where the root is $A$, the terminal points (from left to right) are exactly all the propositional letters occurring in $\Gamma$, in the same order in which they occur in $\Gamma$, and the possibly internal points on which the adjoining operation may be performed are exactly all the propositional letters $c$ s.t. $c \multimapinv c$ occurs in $\Gamma$; -- if $A$ is $b_1 \otimes \cdots \otimes b_n$, and so $\Gamma = \Gamma_1, \ldots, \Gamma_n$ with the sequents $\Gamma_i \vdash b_i$ provable in $\mathcal{A}(\mathcal{G})$ for every $1 \leq i \leq n$, then $Tree(\Gamma \vdash A)$ is a sequence $T_1, \ldots, T_n$ of trees in $T(\mathcal{G})$, s.t. $T_i = Tree(\Gamma_i \vdash b_i)$.  * If $\Gamma \vdash a$ is provable in $\mathcal{A}(\mathcal{G})$, then $Seq(Tree(\Gamma \vdash a)) = \Gamma \vdash a$. If $T$ is a tree of $\mathcal{G}$, then $Tree(Seq(T)) = T$.</Paragraph>
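    <Paragraph position="36"> * Let $M$ be a set of provable sequents in $\mathcal{A}(\mathcal{G})$. Define $CL(M)$ as follows: -- $M \subseteq CL(M)$; -- (closure under atomic cut rule) if $\Gamma \vdash a \in CL(M)$ and $\Delta_1, a, \Delta_2 \vdash B \in CL(M)$, then $\Delta_1, \Gamma, \Delta_2 \vdash B \in CL(M)$; -- (closure under adjoining operation) if $\Gamma_1, a, \Gamma_2 \vdash a \in CL(M)$ and $\Lambda, a \multimapinv a, \Delta \vdash b \in CL(M)$, then $\Lambda, \Gamma_1, \Delta_1, \Gamma_2, \Delta_2 \vdash b \in CL(M)$, where $(\Delta_1, \Delta_2)$ is the splitting pair of $\Delta$ in $\Lambda, a \multimapinv a, \Delta \vdash b$; -- nothing else belongs to $CL(M)$.</Paragraph>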
    <Paragraph position="37"> * If $\Gamma \vdash A \in CL(M)$, then $\Gamma \vdash A$ is provable in $\mathcal{A}(\mathcal{G})$.</Paragraph>
    <Paragraph position="38"> * If $G' \subseteq T(\mathcal{G})$, let $CL(G')$ be the closure of $G'$ under: -- substitution, -- adjoining operation.</Paragraph>
    <Paragraph position="40"> Starting from this last proposition, it is possible to prove that the language accepted by a TAG grammar $G'$ is exactly the language accepted by $M(G')$. We can define the language accepted by such a calculus as follows: take only those sequents in $CL(M(G'))$ whose right part is the propositional variable $S$ (the start symbol of the grammar), and such that the propositional variables of the left part of the sequent correspond to terminal symbols of the grammar, i.e., words of the language. The language accepted by $M(G')$ is then the set of sequences of words in the same order as they appear in the previous sequents.</Paragraph>
  </Section>
</Paper>