<?xml version="1.0" standalone="yes"?> <Paper uid="C92-2094"> <Title>Parameterization of the Interlingua in Machine Translation</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 2 Toward a Catalog of Syntactic Divergences </SectionTitle> <Paragraph position="0"> Figure 2 shows a diagram of the UNITRAN syntactic processing component. The parser of this component provides a source-language syntactic structure to the lexical-semantic processor, and, after lexical-semantic processing is completed, the generator of this component provides a target-language syntactic structure. Both the parser and generator of this component have access to the syntactic principles of GB theory. These principles, which act as constraints (i.e., filters) on the syntactic structures produced by the parser and the generator, operate on the basis of parameter settings that supply certain language-specific information; this is where syntactic divergences are factored out from the lexical-semantic representation.</Paragraph> <Paragraph position="1"> The GB principles and parameters are organized into modules whose constraints are applied in the following order: (1) X-bar, (2) Bounding, (3) Case, (4) Trace, (5) Binding, and (6) θ. A detailed description of these modules is provided in Dorr (1987).</Paragraph> <Paragraph position="2"> We will look briefly at a number of these, focusing on how syntactic divergences are accounted for by this approach. Figure 3 summarizes the syntactic divergences that are revealed by the parametric variations presented here.1</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.1 Principles and Parameters of the X-bar Module </SectionTitle> <Paragraph position="0"> The X-bar constraint module of the syntactic component provides the phrase-structure representation of sentences. 
In particular, the fundamental principle of the X-bar module is that each phrase of a sentence has a maximal projection, X-MAX, for a head of category X (see figure 4).2 In addition to the head X, a phrasal projection potentially contains satellites α1, α2, β1, β2, γ1, and γ2, where α1 and α2 are any number of maximally adjoined adjuncts positioned according to the adjunction parameter, β1 and β2 are arguments (subjects and objects) ordered according to the constituent order parameter, and γ1 and γ2 are any number of minimally adjoined adjuncts positioned according to the adjunction parameter.3 1 The syntactic divergences are enumerated with respect to the relevant parameters and modules of the syntactic component. The figure illustrates the effect of syntactic parameter settings on the constituent structure for each language. (In this figure, E stands for English, G for German, S for Spanish, and I for Icelandic.) 2 The possibilities for the category X are: (V)erb, (N)oun, (A)djective, (P)reposition, (C)omplementizer, and (I)nflection. The Complementizer corresponds to relative pronouns such as that in the man that I saw.</Paragraph> <Paragraph position="1"> The Inflectional category corresponds to modals such as would in I would eat cake.</Paragraph> <Paragraph position="2"> 3 This is a revised version of the X-bar Theory presented in Chomsky (1981). The adjunction parameter will not be discussed here, but see Dorr (1987) for details.</Paragraph> <Paragraph position="3"> ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992 625 PROC. OF COLING-92, NANTES, AUG. 
23-28, 1992
Syntactic Divergence Examples | Parameter | GB Module
E, S: V precedes object; G: V follows object | constituent order | X-bar
E: P stranding allowed; S, G: No P stranding allowed | proper governors | Gov't
E, G: Fronted question word beyond single sentence level not allowed; S: Fronted question word beyond single sentence level allowed | bounding nodes | Bounding
E, G: P not required before verbal object associated with clitic; S: P required before verbal object associated with clitic | type of government |
E, G: Subject required in matrix clause; S: Subject not required in matrix clause | null subject |
E, S, G: Anaphor (e.g., himself) must have antecedent inside nearest dominating clause; I: Anaphor (e.g., sig) may have antecedent outside nearest dominating clause | governing category | Binding
E: No empty pleonastics allowed; S: Empty pleonastics allowed; G: Empty pleonastics in embedded clauses only | NDP | θ
Given this general phrase-structure representation, we can now &quot;fit&quot; this template onto the phrase structure of each language by providing the appropriate settings for the parameters of the X-bar module. For example, the constituent order parameter characterizes the word order distinctions among English, Spanish and German. Unlike English and Spanish, German is assumed to be a subject-object-verb language that adheres to the verb-second requirement in matrix clauses (see Safir (1985)). Thus, for the sentence I have seen him, we have the following contrasting argument structures: (2) (i) I have seen him (ii) Yo he visto a él 'I have seen (to him)' (iii) Ich habe ihn gesehen 'I have him seen' The X-bar module builds the phrase-structure from the general scheme of figure 4 and the parameter settings described above. The principles and parameters of the remaining modules are then applied as constraints to the phrase-structure representation. 
We will now examine each of the remaining modules in turn.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.2 Principles and Parameters of the Government Module </SectionTitle> <Paragraph position="0"> Government Theory is central to the Case and Trace modules. A familiar example of the government principle in English is that a verb governs its object.4 We will examine the effect of this module in sections 2.4 and 2.5.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.3 Principles and Parameters of the Bounding Module </SectionTitle> <Paragraph position="0"> The Bounding module is concerned with the distance between pairs of co-referring elements (e.g., trace-antecedent pairs). The fundamental principle of the bounding module is that co-referring elements are not allowed to be more than one bounding node apart, where the choice of bounding nodes is allowed to vary across languages.</Paragraph> <Paragraph position="1"> The bounding nodes parameter setting accounts for a syntactic divergence between Spanish and English (and German): (3) (i) * Who_i did you wonder whether t_i went to school?5 (ii) ¿Quién_i crees tú que t_i fue a la escuela? The reason (3)(i) is ruled out is that the word who has moved beyond two bounding nodes. It turns out that the corresponding Spanish sentence (3)(ii) is well-formed since the choice of bounding nodes is different and only one bounding node is crossed. 2.4 Principles and Parameters of the Case Module The Case module is in charge of ensuring that all noun phrases are properly assigned abstract case (e.g., nominative, objective, etc.). The Case Filter rules out any sentence that contains a non-case-marked noun phrase.</Paragraph> <Paragraph position="2"> The notion of government is relevant to case assignment since an element assigns case only if it is a governing case-assigner. 
The setting of the type of government parameter for English, Spanish, and German characterizes the following divergences: government principle.</Paragraph> <Paragraph position="3"> 5 If who is spoken emphatically, this sentence can almost be understood as an echo question corresponding to the statement I wondered whether John went to school.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.5 Principles and Parameters of the Trace Module </SectionTitle> <Paragraph position="0"> After case has been assigned, the Trace module applies the empty category principle (ECP), which checks for proper government of empty elements.</Paragraph> <Paragraph position="1"> The ECP is parameterized by means of the null subject parameter. As discussed in section 1, the null subject parameter accounts for the null subject distinction between Spanish, on the one hand, and English and German on the other: (5) (i) Yo vi el libro / Vi el libro (ii) I saw the book / * Saw the book (iii) Ich sah das Buch / * Sah das Buch An additional parameter that is relevant to the Trace module is the proper governors parameter. The choice of proper governor accounts for preposition-stranding distinctions in the three languages: (6) (i) [N-MAX What store]_i did John go to t_i?7 (ii) * [N-MAX Cuál tienda]_i fue Juan a t_i? (iii) * [N-MAX Welchem Geschäft]_i geht Johann zu t_i?</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.6 Principles and Parameters of the Binding Module </SectionTitle> <Paragraph position="0"> The Binding module is the final module applied before thematic roles are assigned. 
This module is concerned with the coreference relations among noun phrases, and it is dependent on the governing category parameter, which specifies that a governing category for a syntactic constituent is (roughly) the nearest dominating clause that has a subject. This parameter happens to have the same setting for English, Spanish, and German, but see Dorr (1987) for a description of other settings of this parameter (e.g., for Icelandic) based on work by Wexler &amp; Manzini (1986).</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.7 Principles and Parameters of the θ Module </SectionTitle> <Paragraph position="0"> The θ module provides the interface between the syntactic component and the lexical-semantic component. In particular, the assignment of thematic roles (henceforth θ-roles) after parsing leads into the construction of the interlingual form.</Paragraph> <Paragraph position="1"> The fundamental principle of the θ module is the θ-Criterion, which states that a lexical head must assign θ-roles in a unique one-to-one correspondence with the argument positions specified in the lexical entry for the head. 6 As noted in Jaeggli (1981), animate objects (e.g., Guille) are associated with a clitic pronoun (e.g., lo) only in certain dialects such as that of the River Plate area of South America.</Paragraph> <Paragraph position="2"> 7 The t_i constituent is a trace that corresponds to the noun phrase that has been moved to the front of the sentence.</Paragraph> </Section> <Section position="7" start_page="0" end_page="0" type="sub_section"> <SectionTitle> Ilent </SectionTitle> <Paragraph position="0"> 
One of the parameters associated with the θ module is the nom-drop paradigm (NDP) parameter (based on work by Safir (1985)).</Paragraph> <Paragraph position="1"> This parameter accounts for the distinction between English, on the one hand, and Spanish and German, on the other hand, with respect to the subject of an embedded clause: (7) (i) * I know that was dancing (ii) Yo sé que había un baile 'I know that (there) was a dance' (iii) Ich weiß, daß getanzt wurde 'I know that (there) was dancing' Once all θ-roles are assigned, the lexical-semantic component of the translator composes the interlingual representation for the source and target language. The next section will describe the lexical-semantic component, and it will show how this component accounts for a number of divergences outside of the realm of syntax.</Paragraph> </Section> </Section> </Paper>