<?xml version="1.0" standalone="yes"?> <Paper uid="C92-2094"> <Title>Parameterization of the Interlingua in Machine Translation</Title> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 Toward a Catalog of Lexical-Semantic Divergences </SectionTitle> <Paragraph position="0"> Figure 5 shows a diagram of the UNITRAN lexical-semantic processing component. A detailed description of the lexical conceptual structure (LCS), which serves as the interlingua, is not given here, but see Dorr (1990b) for further discussion.8
8. In general, the LCS representation follows the format proposed by Jackendoff (1983, 1990), which views semantic representation as a subset of conceptual structure. Jackendoff's approach includes such notions as Event and State, which are specialized into primitives such as GO, STAY, BE, GO-EXT, and ORIENT. As an example of how the primitive GO is used to represent sentence semantics, consider the following sentence: (8) (i) The ball rolled toward Beth.</Paragraph> <Paragraph position="1"> (ii) [Event GO ([Thing BALL],
[Figure 6: lexical-semantic divergences, listed by divergence type and parameter; E = English, S = Spanish, G = German]
Thematic (:INT, :EXT). E: like: I like Mary. S: gustar: Me gusta Maria. G: gefallen: Marie gefällt mir.
Categorial (:CAT). E: be: I am hungry. S: tener: Yo tengo hambre. G: haben: Ich habe Hunger.
Demotional (:DEMOTE). E: like: I like eating. S: gustar: Me gusta comer. G: gern: Ich esse gern.
Promotional (:PROMOTE). E: usually: John usually goes home. S: soler: Juan suele ir a casa. G: gewöhnlich: Johann geht gewöhnlich nach Hause.
Conflational (:CONFLATED). E: stab: I stabbed John. S: dar: Yo le di puñaladas a Juan. G: erstechen: Ich erstach Johann.
What is important to recognize about this processing component is that, just as the syntactic component relies on parameterization to account for source-to-target divergences, so does the lexical-semantic component. 
The parameterization of this component is specified by means of language-specific lexical override markers associated with the LCS mapping between the syntactic structure and the interlingua. We will look briefly at the principles and parameters of the lexical-semantic component, focusing on how a number of divergences are accounted for by this approach. Figure 6 summarizes the lexical-semantic divergences that are revealed by the parametric variations presented here.9
... the primitives CAUSE and LET. A third dimension is introduced through the notion of field. This dimension extends the semantic coverage of spatially oriented primitives to other domains such as Possessional, Temporal, Identificational, Circumstantial, and Existential.
9. The divergences are enumerated with respect to the relevant principles and parameters of the lexical-semantic component. In contrast to the summary of syntactic divergences in figure 3, which enumerates the effect of syntactic parameter settings on constituent structure, the list of divergences presented here is specified in terms of the effect of LCS parameter settings on the realization of specific lexical items.</Paragraph> <Paragraph position="3">
1. Syntactic specifier (βk ∈ β1 ∪ β2) ⇔ Logical subject (β′k)
2. Syntactic complements (β1 ∪ β2 − βk) ⇔ Logical arguments (β′1 ∪ β′2 − β′k)
3. Syntactic adjuncts (α1 ∪ α2 ∪ γ1 ∪ γ2) ⇔ Logical modifiers (α′1 ∪ α′2 ∪ γ′1 ∪ γ′2)
4. Syntactic head (X) ⇔ Logical head (X′)</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.1 Principles and Parameters of the Lexical-Semantic Component </SectionTitle> <Paragraph position="0"> The algorithm for mapping between the syntactic structure and the interlingua relies on the output of θ-role assignment (in the analysis direction) and feeds into θ-role assignment (in the synthesis direction). 
The θ-roles represent positions in the LCS representations of lexical entries associated with the input words. Thus, the construction of the interlingua is essentially a unification process that is guided by the pointers left behind by θ-role assignment.</Paragraph> <Paragraph position="1"> The mapping, or linking rule, between the syntactic positions and the positions of the LCS representation is shown in figure 7. In terms of θ-role assignment, the phrasal head X assigns θ-roles corresponding to positions in the LCS associated with X′. For example, the syntactic subject βk is assigned the logical subject position β′k in the LCS. Once all these roles have been assigned, the interlingual representation is composed simply by recursively filling the arguments of the predicate into their assigned LCS positions.</Paragraph> <Paragraph position="2"> ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992 628 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992 In addition to the LCS linking rule, there is another general rule associated with the lexical-semantic component: the canonical syntactic representation (CSR) function. This function associates an LCS type (e.g., THING) with a syntactic category (e.g., N-MAX) (see figure 8).</Paragraph> <Paragraph position="3"> The LCS linking rule and the CSR function are the two fundamental principles of the lexical-semantic component. In order to account for lexical-semantic divergences, these principles must be parameterized. In general, translation divergences occur when there is an exception to one (or both) of these principles in one language, but not in the other. Thus, the lexical entries have been constructed to support parametric variation that accounts for such exceptions. The parameters are used in lexical entries as overrides for the LCS linking rule and CSR function. 
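As a minimal illustrative sketch (not UNITRAN's implementation), the two principles and their lexical overrides can be modeled in Python as default tables that are consulted only when a lexical entry supplies no override. The class name, dictionary names, position labels, and the A-MAX default for PROPERTY are assumptions made for illustration; the like/gustar and be/haben entries follow the examples in Figure 6.

```python
from dataclasses import dataclass, field

# Default CSR table: LCS type to syntactic category. THING to N-MAX is the
# example given in the text; the PROPERTY row is an illustrative assumption.
DEFAULT_CSR = {"THING": "N-MAX", "PROPERTY": "A-MAX"}

# Default linking rule: logical position to syntactic position.
DEFAULT_LINK = {"logical-subject": "subject", "logical-argument": "complement"}


@dataclass
class LexicalEntry:
    """Hypothetical lexical entry holding language-specific overrides."""
    word: str
    link_overrides: dict = field(default_factory=dict)  # role of :INT / :EXT
    cat_overrides: dict = field(default_factory=dict)   # role of :CAT

    def syntactic_position(self, logical_position):
        # An override in the entry wins over the general linking rule.
        default = DEFAULT_LINK[logical_position]
        return self.link_overrides.get(logical_position, default)

    def syntactic_category(self, lcs_type):
        # An override in the entry wins over the general CSR function.
        return self.cat_overrides.get(lcs_type, DEFAULT_CSR[lcs_type])


# English entries need no overrides; Spanish gustar swaps subject and
# complement, and German haben realizes PROPERTY as a noun phrase.
like = LexicalEntry("like")
gustar = LexicalEntry(
    "gustar",
    link_overrides={"logical-subject": "complement",
                    "logical-argument": "subject"},
)
be = LexicalEntry("be")
haben = LexicalEntry("haben", cat_overrides={"PROPERTY": "N-MAX"})

print(like.syntactic_position("logical-subject"))
print(gustar.syntactic_position("logical-subject"))
print(be.syntactic_category("PROPERTY"))
print(haben.syntactic_category("PROPERTY"))
```

Keeping the defaults in shared tables and the exceptions in the entries mirrors the idea that a divergence arises only where one language's entry departs from a general principle while the other language's entry does not.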
We will now examine examples of how each parameter is used.</Paragraph> <Paragraph position="4"> The '*' parameter refers to an LCS position that is syntactically realizable in the surface sentence. This parameter accounts for structural divergence: (9) (i) John entered the house (ii) Juan entró en la casa 'John entered (into) the house' Here, the Spanish sentence diverges structurally from the English sentence since the noun phrase (the house) is realized as a prepositional phrase (en la casa). In order to account for this divergence, the lexicon uses the * marker in the LCS representation associated with the lexical entries for enter and entrar. This marker specifies the phrasal level at which an argument will be projected: in the Spanish lexical entry, the marker is associated with an LCS position that is realized at a syntactically higher phrasal level than that of the English lexical entry.</Paragraph> <Paragraph position="5"> The :INT and :EXT parameters allow the LCS linking rule to be overridden by associating a logical subject with a syntactic complement and a logical argument with a syntactic subject. A possible effect of using these parameter settings is that there is a subject-object reversal during translation. Such a reversal is called a thematic divergence: (10) (i) I like Mary (ii) Me gusta Maria 'Mary pleases me' Here, the subject of the source-language sentence, I, is translated into an object position, and the object of the source-language sentence, Mary, is translated into a subject position. In order to account for this divergence, the lexicon uses the :INT and :EXT markers in the LCS representation associated with the lexical entries for gustar. The English lexical entry does not contain these markers since the LCS linking rule does not need to be overridden in this case.</Paragraph> <Paragraph position="6"> The :CAT marker provides a syntactic category for an LCS argument. 
Recall that the CSR function maps an LCS type to a syntactic category (see figure 8). When this mapping is to be overridden by a lexical entry, the language-specific marker :CAT is used. This parameter accounts for categorial divergence: (11) (i) I am hungry (ii) Ich habe Hunger 'I have hunger' Here, not only are the predicates be and haben lexically distinct, but the arguments of these two predicates are categorially divergent: in English, the argument is an adjectival phrase, and, in German, the argument is a noun phrase. The :CAT marker is used in the German definition to force the PROPERTY argument to be realized as a noun rather than an adjective. Thus, the CSR function is overridden during realization of the word Hunger in this example.</Paragraph> </Section> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 3.1.4 :DEMOTE and :PROMOTE Parameters </SectionTitle> <Paragraph position="0"> The :DEMOTE and :PROMOTE markers, like the :INT and :EXT markers, allow the LCS linking rule to be overridden, in this case by associating a logical head with a syntactic adjunct or complement. These parameters account, respectively, for demotional divergence: (12) (i) I like eating (ii) Ich esse gern and promotional divergence: (13) (i) John usually goes home (ii) Juan suele ir a casa 'John tends to go home' In the first case, the English main verb like corresponds to the adjunct gern in German, and the embedded verb eat corresponds to the main verb essen in German. In the second case, the English adjunct usually corresponds to the main verb soler in Spanish. 
These &quot;head switching&quot; divergences are accommodated analogously: the :DEMOTE marker is used in the lexical entry for gern and the :PROMOTE marker is used in the lexical entry for soler.</Paragraph> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 3.1.5 :CONFLATED Parameter </SectionTitle> <Paragraph position="0"> The sixth LCS parameter is the :CONFLATED marker. This marker is used for indicating that a particular argument need not be realized in the surface representation. This parameter accounts for conflational divergence, as in the sentence I stabbed John, which corresponds to the Spanish Yo le di puñaladas a Juan. The argument that is incorporated in the English sentence is the KNIFE-WOUND argument since the verb stab does not realize this argument; by contrast, the Spanish construction dar puñaladas a explicitly realizes this argument as the word puñaladas. Thus, the :CONFLATED marker is associated with the KNIFE-WOUND argument in the case of stab, but not in the case of dar.</Paragraph> </Section> </Paper>