<?xml version="1.0" standalone="yes"?>
<Paper uid="C94-1001">
  <Title>Machine Translation IMPROVEMENT IN CUSTOMIZABILITY USING TRANSLATION TEMPLATES</Title>
  <Section position="3" start_page="0" end_page="26" type="metho">
    <SectionTitle>
2. Machine Translation Using Translation Templates
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 Aim of Translation Templates
</SectionTitle>
      <Paragraph position="0"> If a user wants publication-quality translations, stylistic well-formedness is as important as semantic invariance. Consider translating the Japanese sentence (1). Although its translation (2), which is the result of our current MT system, is correct, (3) sounds more natural than (2); in (3), the verb phrase "using these detectors" is nominalized to function as a subject representing the cause of the 'reduce' event. If the user prefers (3) to (2) as a translation of (1), (2) needs to be post-edited.</Paragraph>
      <Paragraph position="1"> (1) korera-no kenshutsuki-wo tsukau kotoniyori
 these detectors-OBJ use by
 kakaku-ga teigen-shita
 price-SUBJ reduce-PAST
 (2) The price dropped by using these detectors.
 (3) Use of these detectors reduced the price.</Paragraph>
      <Paragraph position="2">  As the above example illustrates, when the source and target languages differ significantly in their linguistic features, the linguistic structures of source sentences must be drastically changed to generate natural translations. In this paper, we call translation which requires complex structural changes 'complex translation.' This type of knowledge is stored in all MT systems, but insufficiently. Therefore, a framework for customizing complex translation should be incorporated into the system. For this purpose, we have introduced a framework which uses 'translation templates' to represent such knowledge.</Paragraph>
      <Paragraph position="3"> Using translation templates, a user can customize the MT system to deal with complex translation without any knowledge of the system's translation process, because translation templates are created once the user specifies corresponding expressions in a source sentence and its expected translation.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="26" type="sub_section">
      <SectionTitle>
2.2 Translation Templates
</SectionTitle>
      <Paragraph position="0"> A 'translation template' contains at least a pair of patterns, namely 'source' and 'target' patterns, each of which consists of 'constants' and 'variables.' A source pattern (SP) is a template to be compared with a source sentence, while a target pattern (TP) is used to generate a target sentence.</Paragraph>
      <Paragraph position="1"> Several reports on machine translation using translation templates suggest that they are useful for translating fixed expressions [4][7][8]. Our translation template is more expressive in the following points:
 * More parts of speech can be specified for variables.</Paragraph>
      <Paragraph position="2"> * Conditions on translating expressions matched with variables can be specified.</Paragraph>
      <Paragraph position="3"> These points will be explained below.</Paragraph>
      <Paragraph position="4"> Fig. 1 shows an example of a translation template. '$1' and '$2', which appear in both the source and target patterns, are variables, and the remaining elements are constants. All constants in the source pattern should appear in a source sentence in the same order. Strings which match variables should satisfy the parts of speech designated in the 'source condition.' In this example, the strings should be analyzed as 'np' (noun phrases).</Paragraph>
      <Paragraph position="5"> The 'part of speech(POS)' of a template represents a syntactic category of a string matched with a source pattern. Currently, 'sentence' and 'sentence modifier' can be specified.</Paragraph>
      <Paragraph position="6"> The 'source condition(SCND)' represents conditions on variables in the 'source pattern.' The grammatical categories of variables currently in use are noun, noun phrase, number, clause and verb phrase. A string matched with a variable should be parsed as the specified category.</Paragraph>
      <Paragraph position="7"> The 'target condition (TCND)' represents conditions on variables in the 'target pattern.' Two types are available: 'attribute' and 'relation.' Attributes specify information on one variable. For example, variables for nouns can be specified as having a 'default article' and a 'default number' to be used if there are no explicit clues to determine the article and the number. Similarly, the form of verb phrases in generation can be specified as 'to-infinitive' or 'gerund.' Relations represent, for example, the number agreement between a subject and a verb in the target pattern.</Paragraph>
      <Paragraph position="8"> Variables may appear only in the source or target pattern. Variables which appear only in the source pattern are used to represent expressions which have relations with another variable but disappear in the target sentence. Variables which appear only in the target pattern are used to represent a target word which is inflected by the number agreement with the</Paragraph>
      <Paragraph position="10"> contents of other variables.</Paragraph>
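      <Paragraph> The components described above can be sketched as a small data structure with a matcher for the source pattern. This is only an illustrative sketch: the field names, the example template and the use of regular expressions are assumptions made here, not the system's actual representation, and the check that a matched string parses as the category given in the source condition is left to a caller-supplied function (the hypothetical parses_as).

```python
import re
from dataclasses import dataclass, field


@dataclass
class Template:
    """Illustrative translation template; all field names are assumptions."""
    pos: str        # category of the whole match, e.g. "sentence"
    source: list    # source pattern: constants and "$n" variables
    target: list    # target pattern: constants and "$n" variables
    scnd: dict = field(default_factory=dict)  # variable -> category, e.g. "np"
    tcnd: dict = field(default_factory=dict)  # variable -> attributes


def compile_source(template):
    """Build a regex in which constants occur in order and each variable
    captures the non-empty string around them."""
    names, parts = [], []
    for token in template.source:
        if token.startswith("$"):
            names.append(token)
            parts.append("(.+?)")
        else:
            parts.append(re.escape(token))
    return re.compile(r"\s*".join(parts)), names


def match_source(template, sentence, parses_as=lambda text, category: True):
    """Return variable bindings, or None if the template is not applicable."""
    pattern, names = compile_source(template)
    found = pattern.fullmatch(sentence)
    if found is None:
        return None
    bindings = dict(zip(names, found.groups()))
    for name, text in bindings.items():
        # The source condition: each string must parse as its category.
        if not parses_as(text, template.scnd.get(name, "")):
            return None
    return bindings


# Hypothetical template modelled on sentences (1) and (3).
EXAMPLE = Template(
    pos="sentence",
    source=["$1", "-wo tsukau kotoniyori", "$2", "-ga teigen-shita"],
    target=["Use of", "$1", "reduced", "$2"],
    scnd={"$1": "np", "$2": "np"},
)
```

 Applied to sentence (1), match_source(EXAMPLE, ...) binds '$1' to "korera-no kenshutsuki" and '$2' to "kakaku"; a sentence lacking one of the constants yields None.</Paragraph>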
      <Paragraph position="11"> Fig. 2 shows other examples of translation templates. Fig. 2(a) shows a template which has a variable for a verb phrase. This template is created by referring to sentence (4) and its model translation (5). The target condition specifies that a verb phrase matched with the variable '$2' is generated as a gerund.</Paragraph>
      <Paragraph position="12">  (4) jokyoshuuhasuu-no settei-wa,
 'frequency to be eliminated'-of setting-TOP
 torimakondensa-de C-no atai-wo
 trimmer capacitor-INST C-of value-OBJ
 tyousei-suru kotoniyori okonaeru.
 adjust by can be done</Paragraph>
      <Paragraph position="13"> (5) The frequency to be eliminated can be set by adjusting the value of C by a trimmer capacitor.
 The introduction of variables which match verb phrases improves the flexibility of translation templates. Without these variables, we must create restricted source patterns, in which the word order of postpositional phrases like "-de" and "-wo" is fixed. Fig. 2(b) shows a template which has a variable appearing only in the target pattern. This template is created by referring to sentences (6) and (7) below. The target word (tw) of variable '$3' is specified as 'be' and its surface form is determined according to the 'number' feature of the expression of variable '$1'.</Paragraph>
    </Section>
    <Section position="3" start_page="26" end_page="26" type="sub_section">
      <SectionTitle>
2.3 Translation Process
</SectionTitle>
      <Paragraph position="0"> Fig. 3 shows a conceptual flow of the translation process using translation templates. (The actual implementation differs from this flow.) First, the 'translation template dictionary' is searched for applicable templates. If no applicable template is found, the source sentence is translated using the conventional translation module; if one is found, strings matched with variables are parsed and translated.</Paragraph>
      <Paragraph position="1"> Finally, translations of variables are embedded into the target pattern.</Paragraph>
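      <Paragraph> The conceptual flow can be sketched as follows. This is a minimal sketch of the flow only, which (as noted above) differs from the actual implementation: the template dictionary, the toy phrase lexicon and both translation functions are hypothetical stand-ins, and regular expressions replace the real matching step.

```python
import re

# Hypothetical template dictionary: (source regex, target pattern) pairs.
TEMPLATES = [
    (re.compile(r"(.+)-wo tsukau kotoniyori (.+)-ga teigen-shita"),
     "Use of {0} reduced {1}."),
]

# Toy lexicon standing in for parsing and translating variable strings.
PHRASES = {"korera-no kenshutsuki": "these detectors", "kakaku": "the price"}


def conventional_translate(sentence):
    """Placeholder for the conventional translation module."""
    return "[conventional] " + sentence


def translate_phrase(phrase):
    """Placeholder for translating a string matched with a variable."""
    return PHRASES.get(phrase, conventional_translate(phrase))


def translate(sentence):
    # Step 1: search the template dictionary for an applicable template.
    for source, target in TEMPLATES:
        found = source.fullmatch(sentence)
        if found:
            # Step 2: translate the strings matched with variables.
            parts = [translate_phrase(text) for text in found.groups()]
            # Step 3: embed the translations into the target pattern.
            return target.format(*parts)
    # No applicable template: use the conventional module.
    return conventional_translate(sentence)
```

 With this toy data, sentence (1) is rendered as translation (3), while a sentence that matches no template falls through to the conventional module.</Paragraph>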
      <Paragraph position="2"> This process is implemented in the conventional translation module of our transfer-based MT system [3].</Paragraph>
      <Paragraph position="3"> (a) Morphological Analysis
 The morphological analyzer first constructs a word lattice for an input sentence by referring to the word dictionaries and the Japanese morphological grammar, and then produces a sequence of words from the lattice until the syntactic analyzer parses it successfully.</Paragraph>
      <Paragraph position="4"> Constants in the source pattern of translation templates are stored in the 'template constant dictionary' used in the first phase of morphological analysis to create the word lattice. Fig. 4 shows a simplified example of a word lattice for sentence (1). Constants of translation templates in a word lattice should be selected if and only if all the constants of a particular template are selected simultaneously to form a valid sequence of words. In Fig. 4, we can obtain two valid word sequences from the word lattice.</Paragraph>
      <Paragraph position="5"> The present implementation permits one applicable template for each source sentence. If more than one template is applicable, the priority of each template is calculated based on the total length of its constants and the scope of the source sentence covered by the template, and word sequences are produced in order of priority.</Paragraph>
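      <Paragraph> One way to rank several applicable templates is sketched below. The text names the two factors (total length of constants and scope covered) but not how they are combined, so the ordering used here, constants first and coverage as a tie-breaker, is purely an assumption, as are the record and function names.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """An applicable template for one sentence (hypothetical record)."""
    name: str
    constants: list   # constants of the source pattern
    start: int        # first character of the sentence covered
    end: int          # one past the last character covered


def priority(candidate):
    """Assumed ordering: longer constants first, then wider coverage."""
    constant_length = sum(len(text) for text in candidate.constants)
    scope = candidate.end - candidate.start
    return (constant_length, scope)


def rank(candidates):
    """Return candidates from highest to lowest priority."""
    return sorted(candidates, key=priority, reverse=True)
```

 Word sequences would then be tried in the order returned by rank, stopping at the first one the syntactic analyzer accepts.</Paragraph>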
      <Paragraph position="6">  (b) Syntactic Analysis
 When a translation template is applicable, the syntactic analyzer plays two roles. The first is to analyze the part of the word sequence which should be matched with a variable. The second role is to derive a syntactic structure for the sentence.</Paragraph>
      <Paragraph position="7"> (c) Transfer and Generation
 In the transfer phase, a translation template is transformed into a lexical transfer rule in the conventional form, so that the new matching pattern matches the structure produced by the syntactic analyzer. The result of applying this rule is a target structure; its direct constituents are given the word order and are ready to be output as a target sentence.</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="26" end_page="28" type="metho">
    <SectionTitle>
3. Criteria for Using Translation Templates
</SectionTitle>
    <Paragraph position="0"> In principle, all translations can be described by translation templates. That is, users can make a translation template by substituting corresponding expressions in source and target sentences with variables. The question is the appropriateness of templates.</Paragraph>
    <Paragraph position="1"> The first criterion is 'applicability.' In the following cases, translation templates are inappropriate because the source pattern is too specific to be applied to other sentences.</Paragraph>
    <Paragraph position="2">  (C1) A source sentence is translated into two target sentences or a compound sentence.</Paragraph>
    <Paragraph position="3"> (C2) Two source sentences are translated into one target sentence.</Paragraph>
    <Paragraph position="4"> (C3) A source sentence contains a parenthesis or a gapping.</Paragraph>
    <Paragraph position="5"> In such cases, the source pattern may contain more constants than those of ordinary translation templates.</Paragraph>
    <Paragraph position="6">  example of (C3), where the source sentence contains a gapping. The source pattern created from this sentence will be of low applicability.</Paragraph>
    <Paragraph position="8"> (9) The preamplifiers are decentralized for 8 elements in P1 and for 24 elements in P2.</Paragraph>
    <Paragraph position="9"> Another criterion is the 'contextual independence.' It is often the case in Japanese-to-English translation that a zero-pronoun in a source sentence is resolved from the context and its translation equivalent appears in the target sentence. A translation template created from such translation may generate a contextually inappropriate translation.</Paragraph>
    <Paragraph position="10"> Note that these criteria are not absolute; templates which do not meet these criteria should be used if they lead to correct translation of other sentences. A statistical method could be introduced to objectively determine the appropriateness.</Paragraph>
    <Paragraph position="11"> 4. Conventional Customizing Functions
 This section briefly describes the customizing functions which have been adopted in our MT system [3][5].</Paragraph>
    <Paragraph position="12">  Translation parameters are introduced to give preferences or default interpretations in the translation process. In general, all processing is based on the system's linguistic knowledge, which is not open to users. For example, users cannot change the application order of syntactic rules used by the parser. Therefore, the system derives the same syntactic tree for a given sentence and generates one particular translation. Translation parameters enable users to partially control the translation process.</Paragraph>
    <Paragraph position="13"> One of the parameters used in Japanese-to-English translation treats subjectless sentences, which are a common linguistic phenomenon in Japanese. With this parameter, users can specify the sentence type of a target sentence (imperative or declarative) and, if necessary, the voice and the translation equivalent for the omitted subject (personal pronouns, "it" or a user-defined string). For example, sentence (10) is translated into sentences (11) to (15) according to the specified parameter values.</Paragraph>
    <Paragraph position="14"> (10) sono botan-wo oshimasu
 the button-OBJ press
 (11) Press the button. (imperative)</Paragraph>
    <Paragraph position="15"> (12) The button is pressed. (passive)</Paragraph>
    <Paragraph position="16"> (13) I press the button. (personal pronouns)</Paragraph>
    <Paragraph position="17"> (14) It presses the button. ("it")</Paragraph>
    <Paragraph position="18"> (15) # presses the button. ("#" as user-defined string)</Paragraph>
    <Paragraph position="19"> * User-defined rules
 User-defined rules are used for representing knowledge to determine an appropriate translation equivalent for a source word (or an expression) by referring to its related words. There are three types of user-defined rules available:
 (R1) Rules for verbs
 (R2) Rules for functional phrases
 (R3) Rules for conjunctional phrases
 Rule (R1) determines a translation equivalent of a verb based on its case fillers. A translation for a functional phrase is determined based on its preceding noun and the verb phrase it modifies, whereas a translation for a conjunctional phrase is based on its preceding verb phrase and the verb phrase it modifies. Additionally, rules (R2) and (R3) can specify where translation equivalents for functional and conjunctional phrases are generated.</Paragraph>
    <Paragraph position="20"> Sentences (16) to (18) below show a customization example using a user-defined rule for a functional phrase. In sentence (17), which is the initial output of our system, the functional phrase "ni doukishite" is translated into a verb phrase. Contrast this with the customized sentence (18), in which the phrase is translated into the prepositional phrase "in synchronism with."
 (17) This circuit generates a pulse synchronizing with a signal.</Paragraph>
    <Paragraph position="21"> (18) This circuit generates a pulse in synchronism with a signal.</Paragraph>
    <Paragraph position="22"> User-defined rules are limited in that they cannot represent complex structural changes. However, this limitation is intentional: it prevents mistranslations that could be caused by adding such structural rules to the system's knowledge. The proposed framework has instead been introduced to represent knowledge for more complex translation.</Paragraph>
  </Section>
  <Section position="5" start_page="28" end_page="29" type="metho">
    <SectionTitle>
5. Evaluation of Customizability
5.1 Outline of Analysis
</SectionTitle>
    <Paragraph position="0"> To confirm the effectiveness of translation templates, we analyzed a parallel text, namely a service manual for electronic equipment written in Japanese and its English translation, and estimated the improvement in customizability.</Paragraph>
    <Paragraph position="1"> The analysis was done as follows:
 (i) Translate the source sentences using the MT system, which is in the default state except that undefined words are registered in the user dictionary.</Paragraph>
    <Paragraph position="2"> (ii) Compare the 'sentence structure' of the MT output in (i) and its corresponding sentence in the English manual, and identify sentences that need customization.</Paragraph>
    <Paragraph position="3"> (iii) Categorize the above sentences according to the type of customization needed to translate them into sentences having the same sentence structures as the model translations.</Paragraph>
    <Paragraph position="4"> The 'sentence structure' used for judging the necessity of customization includes the following  Two different case frames are treated as the same as long as the difference can be resolved with a user-defined word and/or a user-defined rule for verbs.</Paragraph>
    <Paragraph position="5"> * Voice of a main clause: active / passive
 If all of the above are identical, the MT output and the model translation are considered to have the same sentence structure. Otherwise, the MT system needs customization. For example, sentences (2) and (3) are different in their sentence structures because they have different case frames. Similarly, sentences (20) and (21), which are the MT output of sentence (19) and the model translation respectively, are different in their sentence structures because of their different clause patterns and case frames.</Paragraph>
    <Paragraph position="6"> (19) FMbu-niwa 2mai-no fureemumemori-ga ari
 FMunit-in 2 framememory-SUBJ exist
 kotonaru 2tsu-no gazou-wo kioku-dekiru
 different 2 image-OBJ can memorize
 (20) Two frame memories are in the FM unit and it can memorize two different images.</Paragraph>
    <Paragraph position="7"> (21) The FM unit has two frame memories that can store two different images.</Paragraph>
    <Section position="1" start_page="28" end_page="29" type="sub_section">
      <SectionTitle>
5.2 Analysis Result
</SectionTitle>
      <Paragraph position="0"> We have analyzed 492 sentences excluding titles and figure captions. The average sentence length was 52 Kanji characters.</Paragraph>
      <Paragraph position="1"> Table 1 shows the overall result. Out of 492 sentences, 42% have the same sentence structures as the model translations, while the remaining 58% have different sentence structures and require customization of the system. The latter is further divided into four categories according to the type of customization needed to improve the MT output, as shown in Table 2. By the conventional customizing functions, namely translation parameters and user-defined rules, 14% are customizable. In addition, translation templates can improve 45%, which suggests that 59% will improve in total. This also means that, using all customizing functions, 76% of the given sentences can be translated as in the English manual, while only 51% can be so translated using the conventional functions. These figures suggest that translation templates are useful for dealing with complex translation.</Paragraph>
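      <Paragraph> The reported totals can be checked with a short calculation. This sketch assumes that the 14% and 45% figures are shares of the 58% of sentences needing customization, which is an interpretation made here rather than something the text states outright; the variable names are illustrative.

```python
same_structure = 0.42        # share already matching the model translation
needs_customization = 0.58   # remaining share
by_conventional = 0.14       # assumed: share of the 58%, conventional functions
by_templates = 0.45          # assumed: share of the 58%, translation templates

improved = by_conventional + by_templates
with_all = same_structure + needs_customization * improved
with_conventional = same_structure + needs_customization * by_conventional

print(f"improved: {improved:.0%}")
print(f"all functions: {with_all:.1%}")
print(f"conventional only: {with_conventional:.1%}")
```

 Under this reading the totals come to about 76.2% and 50.1%. The first agrees with the reported 76%; the second is one point below the reported 51%, which may reflect rounding in the published percentages.</Paragraph>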
      <Paragraph position="2"> Sentences which cannot be customized are divided into four categories:  First, a translation parameter does not work when the condition on its application is not customizable. One example is a translation parameter of sentence types for enumerated items. If the system can recognize such a specific form, its translation can be customized. Otherwise the specified parameter is not used.</Paragraph>
      <Paragraph position="3"> Second, an extended syntax for translation templates is needed to represent more complex translation. An example is to extend the syntax so that conversion of grammatical categories, such as nominalization of verb phrases, can be specified. Third, translation templates are not utilized in light of the criteria explained in Section 3. The statistics of the rejected sentences is as follows.</Paragraph>
      <Paragraph position="4">  A translation template proposed in this paper is more flexible than others due to variables which match 'verb phrases' and 'clauses.' Basically, a pattern matching approach like template-based translation has a disadvantage with respect to word order when it is applied to a language that has relatively free word order, like Japanese. This problem is partially solved by using these variables because the word order of the constituents of verb phrases and clauses is not fixed.
 * Appropriateness of translation templates
 The question about the appropriateness of a translation template is also raised in the case of a translation example in Example-based Machine Translation (EBMT). It is easy to measure the system performance, but it is difficult to evaluate the appropriateness of examples based on their amount and the performance. This issue has been ignored so far. Our criteria will be the first approach to this issue. Although every translation can be described using translation templates, some criteria to determine their appropriateness should be provided, because without them automatic template learning will soon lead to the explosion of the template database.</Paragraph>
    </Section>
  </Section>
</Paper>