<?xml version="1.0" standalone="yes"?> <Paper uid="P04-1084"> <Title>Generalized Multitext Grammars</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 2 Informal Description and Comparisons </SectionTitle> <Paragraph position="0"> GMTG is a generalization of MTG, which is itself a generalization of CFG to the synchronous case.</Paragraph> <Paragraph position="1"> Here we present MTG in a new notation that shows the relation to CFG more clearly. For example, the following MTG productions can generate the multitext [(I fed the cat), (ya kota kormil)] (see footnote 1):</Paragraph> <Paragraph position="3"> Each production in this example has two components, the first modeling English and the second (transliterated) Russian. Nonterminals with the same index must be rewritten together (synchronous rewriting). One strength of MTG, and thus also of GMTG, is shown in Productions (5) and (6). There is a determiner in English but not in Russian, so Production (5) does not have the nonterminal D in the Russian component, and Production (6) applies only to the English component (independent rewriting). Formalisms that do not allow independent rewriting require a corresponding D to appear in the second component on the right-hand side (RHS) of Production (5), and this D would eventually generate the empty string. This approach has the disadvantage that it introduces spurious ambiguity about the position of the &quot;empty&quot; nonterminal with respect to the other nonterminals in its component. Spurious ambiguity leads to wasted effort during parsing.</Paragraph> <Paragraph position="4"> GMTG's implementation of independent rewriting through the empty tuple () serves a very different function from the empty string. Consider the following GMTG:</Paragraph> <Paragraph position="6"> Production (8) asserts that the symbol on its left-hand side (LHS) vanishes in translation. Its application removes both of the nonterminals on the LHS, pre-empting any other production.
In contrast, Production (9) explicitly relaxes the synchronization constraint, so that the two components can be rewritten independently. (Footnote 1: We write production components both side by side and one above another to save space, but each component is always in parentheses.)</Paragraph> <Paragraph position="7"> The other six productions make assertions about only one component and are agnostic about the other component. Incidentally, generating the same language with only fully synchronized productions would raise the number of required productions to 11, so independent rewriting also helps to reduce grammar size.</Paragraph> <Paragraph position="8"> Independent rewriting is also useful for modeling paraphrasing. Take, for example, [(Tim got a pink slip), (Tim got laid off)]. While the two sentences have the same meaning, the objects of their verb phrases are structured very differently. GMTG can express their relationships as follows:</Paragraph> <Paragraph position="10"> As described by Melamed (2003), MTG requires production components to be contiguous, except after binarization. GMTG removes this restriction.</Paragraph> <Paragraph position="11"> Take, for example, the sentence pair [(The doctor treats his teeth), (El médico le examino los dientes)] (Dras and Bleam, 2000). The Spanish clitic le and the NP los dientes should both be paired with the English NP his teeth, giving rise to a discontinuous constituent in the Spanish component. A GMTG fragment for the sentence is shown below: Note the discontinuity between le and los dientes. Such discontinuities are marked by commas on both the LHS and the RHS of the relevant component.</Paragraph> <Paragraph position="12"> GMTG's flexibility allows it to deal with many complex syntactic phenomena. For example, Becker et al.
(1991) point out that TAG does not have the generative capacity to model certain kinds of scrambling in German, when the so-called &quot;cooccurrence constraint&quot; is imposed, requiring the derivational pairing between verbs and their complements. They examine the English/German sentence fragment [(... that the detective has promised the client to indict the suspect of the crime), (...</Paragraph> <Paragraph position="13"> dass des Verbrechens der Detektiv den Verdächtigen dem Klienten zu überführen versprochen hat)]. The verbs versprochen and überführen both have two noun phrases as arguments. In German, these noun phrases can appear to the left of the verbs in any order. The following is a GMTG fragment for the above sentence pair:</Paragraph> <Paragraph position="15"> The discontinuities allow the noun arguments of versprochen to be placed in any order with the noun arguments of überführen. Rambow (1995) gives a similar analysis.</Paragraph> </Section> </Paper>