File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/96/c96-1070_metho.xml
Size: 24,087 bytes
Last Modified: 2025-10-06 14:14:11
<?xml version="1.0" standalone="yes"?> <Paper uid="C96-1070"> <Title>Incremental Translation Utilizing Constituent Boundary Patterns</Title> <Section position="4" start_page="0" end_page="412" type="metho"> <SectionTitle> 2 Translation strategy </SectionTitle> <Paragraph position="0"> In TDMT, translation is performed by applying stored empirical transfer knowledge, which describes the correspondence between source language expressions and target language expressions at various linguistic levels. The source and target expressions of the transfer knowledge in TDMT are expressed by constituent boundary patterns, which represent meaningful units for linguistic structure and transfer. Efficient application of the source parts of transfer knowledge to an input string plays a key role in achieving quick translation.</Paragraph> <Paragraph position="1"> The procedure for applying constituent boundary patterns is performed after the assignment of morphological information to each word of an input string, and is as follows: (a) Insertion of constituent boundary markers; (b) Derivation of possible structures; (c) Structural disambiguation by semantic distance calculation.</Paragraph> <Paragraph position="2"> In the top-down and breadth-first pattern application, the above procedure is executed in the described order. Because the selection of the best structure might have to be postponed until all possible structures are derived, the costs of translation could be high.</Paragraph> <Paragraph position="3"> In contrast, the incremental method determines the best structure locally and can constrain the number of competing structures for the whole input by performing (b) in parallel with (c); consequently, translation costs are reduced.</Paragraph> <Paragraph position="4"> The structure selected in (c) contains its transferred result and head word information, which is used for semantic distance calculation when combining with other structures.
The output sentence is generated as a translation result from the structure for the whole input, which is composed of best-first substructures.</Paragraph> <Paragraph position="5"> In the three subsequent sections, we will explain (a), (b), and (c), focusing on the bottom-up and best-first translation strategy.</Paragraph> </Section> <Section position="5" start_page="412" end_page="412" type="metho"> <SectionTitle> 3 Constituent boundary pattern </SectionTitle> <Paragraph position="0"> In this section we will briefly explain how constituent boundary patterns are used to describe the structure of an input string in TDMT and what procedures are applied before constituent boundary pattern applications (Furuse, 1994b).</Paragraph> <Paragraph position="1"> We will show bottom-up pattern application by translating the following sample English sentence into Japanese: The bus goes to Chinatown at ten a.m.</Paragraph> <Paragraph position="2"> First, all the words in this sequence are assigned the following parts-of-speech.</Paragraph> <Paragraph position="3"> article, noun, verb, preposition, proper-noun, preposition, numeral, postnominal. A constituent boundary pattern is defined as a sequence that consists of variables and symbols representing constituent boundaries. A variable corresponds to some linguistic constituent and is expressed as a capital letter (e.g. X).</Paragraph> <Paragraph position="4"> A constituent boundary is expressed by either a functional word or a part-of-speech bigram marker (e.g. noun-verb). Variables in the source language expression must be separated by constituent boundaries.</Paragraph> <Paragraph position="5"> For instance, the expression &quot;goes to Chinatown&quot; is divided into two constituents, i.e. &quot;goes&quot; and &quot;Chinatown&quot;. The preposition &quot;to&quot; can be identified as a constituent boundary.
Therefore, in parsing &quot;goes to Chinatown&quot;, we use the pattern &quot;X to Y&quot;, which has two variables X and Y and a constituent boundary &quot;to&quot;.</Paragraph> <Paragraph position="6"> The expression &quot;the bus goes&quot; can be divided into two constituents &quot;the bus&quot; and &quot;goes&quot;. However, there is no functional surface word that divides the expression into two constituents. In such cases, we employ part-of-speech bigrams as boundary markers. &quot;bus&quot; and &quot;goes&quot; are a noun and a verb, respectively. Thus the marker noun-verb can be inserted as a boundary marker into the input &quot;the bus goes&quot;, giving &quot;The bus noun-verb goes&quot;. This sequence will now match the general transfer knowledge pattern &quot;X noun-verb Y&quot;.</Paragraph> <Paragraph position="7"> Of the possible bigrams in the above part-of-speech sequence, only &quot;noun-verb&quot; is an eligible constituent boundary marker (Furuse, 1994b). This marker is inserted into the above sentence: The bus noun-verb goes to Chinatown at ten a.m.</Paragraph> <Paragraph position="8"> Indices to possible patterns are obtained from several words and bigrams in the above marker-inserted string (Table 1).</Paragraph> <Paragraph position="9"> retrieved pattern (linguistic level): the X (compound noun); X noun-verb Y (simple sentence); X to Y (verb phrase, noun phrase); X at Y (verb phrase, noun phrase); X a.m. (compound noun). The procedure explained so far is the part that the top-down and bottom-up pattern application methods have in common.</Paragraph> </Section> <Section position="6" start_page="412" end_page="414" type="metho"> <SectionTitle> 4 Incremental pattern application </SectionTitle> <Paragraph position="0"> In this section, we will show the application of constituent boundary patterns based on the concept of bottom-up chart parsing.</Paragraph> <Section position="1" start_page="412" end_page="413" type="sub_section"> <SectionTitle> 4.1 Linguistic level </SectionTitle> <Paragraph position="0"> In order to limit the combinations of patterns during pattern application, we distinguish pattern levels, and for each linguistic level we specify the linguistic sublevels that are permitted in its variables.</Paragraph> <Paragraph position="1"> Table 2 shows examples of the relationships between linguistic levels. A variable on a given level is instantiated by a string on the linguistic levels in the second column of Table 2.
For instance, in the noun phrase &quot;X of Y&quot;, the variables X and Y cannot be instantiated by a simple sentence, but can be instantiated by a noun phrase, a compound noun, and so on.</Paragraph> <Paragraph position="2"> linguistic level: sublevels of variables. simple sentence: VP, NP, ...</Paragraph> <Paragraph position="3"> verb phrase (VP): VP, NP, verb, ...</Paragraph> <Paragraph position="4"> noun phrase (NP): NP, CN, proper-noun, ...</Paragraph> <Paragraph position="5"> compound noun (CN): CN, noun, ...</Paragraph> <Paragraph position="6"> A marker-inserted string is parsed using the constituent boundary patterns, subject to the relations between linguistic levels shown in Table 2.</Paragraph> </Section> <Section position="2" start_page="413" end_page="413" type="sub_section"> <SectionTitle> 4.2 Active and passive arcs </SectionTitle> <Paragraph position="0"> A chart parsing method (Kay, 1980) can avoid repeatedly recomputing partial results and achieve incremental processing by using a bottom-up and left-to-right strategy. In chart parsing, an input string is parsed by combining active and passive arcs. These can be assigned to a substring of an input string when a pattern is applied to it. If all the variables of the applied pattern are instantiated or a substring can be matched to a pattern whose variables are all instantiated, a passive arc is created for the substring. When a substring can be matched to the left part of a pattern and the right variables of the pattern are not instantiated, an active arc is created for the substring.</Paragraph> <Paragraph position="1"> In conventional chart parsing, many arcs can be created because every word can create active and passive arcs based on its part-of-speech.</Paragraph> <Paragraph position="2"> Also, many arcs can be chained via non-terminal symbols such as a part-of-speech and NP (noun phrase).
For instance, the pronoun &quot;I&quot; can create many active arcs relevant to the rules &quot;Pronoun → I&quot;, &quot;NP → Pronoun&quot; and &quot;S → NP VP&quot;, which can be chained. Therefore, a lot of computation is required in conventional chart parsing.</Paragraph> <Paragraph position="3"> In contrast, chart parsing with constituent boundary patterns can constrain the number of arc creations because only a constituent boundary creates active arcs while a variable (e.g. X) never creates an arc. We obtain indices to patterns from each word of the sentence. With these indices, patterns are retrieved and checked to determine whether each of them can create an arc.</Paragraph> </Section> <Section position="3" start_page="413" end_page="414" type="sub_section"> <SectionTitle> 4.3 Pattern application algorithm </SectionTitle> <Paragraph position="0"> Our algorithm for bottom-up application of patterns is as follows. If the whole input string can be covered with a passive arc, the parsing will succeed and the derivation of the passive arc will be the parsed result.</Paragraph> <Paragraph position="1"> 1. If the processed string is a content word (e.g. noun, verb), create a passive arc.</Paragraph> <Paragraph position="2"> 2. If the processed string is a constituent boundary &quot;α&quot;, create each kind of arc as follows, according to the pattern (footnote 1) retrieved from the constituent boundary.</Paragraph> <Paragraph position="3"> 2a. If the retrieved pattern is of the type &quot;X α Y&quot; and a left-neighboring passive arc can satisfy the condition for X's instantiation, create an active arc for &quot;X α Y&quot;, in which Y has not yet been instantiated.</Paragraph> <Paragraph position="4"> 2b. If the retrieved pattern is of the type &quot;X α&quot; and a left-neighboring passive arc can satisfy the condition for X's instantiation, create a passive arc for &quot;X α&quot;.</Paragraph> <Paragraph position="5"> 2c.
If the retrieved pattern is of the type &quot;α X&quot;, create an active arc for &quot;α X&quot;.</Paragraph> <Paragraph position="6"> 3. If the created passive arc satisfies the leftmost part of an uninstantiated variable in the pattern of neighboring active arcs, the variable is instantiated with the passive arc, and a new passive or active arc is created. If a passive arc is generated in this operation, repeat the procedure until a new arc can no longer be created.</Paragraph> <Paragraph position="7"> Figure 1 shows how an input string is parsed using our bottom-up chart method. A solid line denotes a passive arc that covers a substring of the input below, while a dotted line denotes an active arc.</Paragraph> <Paragraph position="8"> The content words &quot;bus&quot;, &quot;goes&quot;, &quot;Chinatown&quot; and &quot;ten&quot; create passive arcs. The functional word &quot;the&quot;, which is relevant to the pattern &quot;α X&quot;, creates an active arc. The assignment of the functional word &quot;a.m.&quot; to the pattern &quot;X α&quot; creates a passive arc by combining with another passive arc. The boundary markers &quot;noun-verb&quot;, &quot;to&quot; and &quot;at&quot;, which are relevant to the pattern &quot;X α Y&quot;, create active arcs by combining with left-neighboring passive arcs.</Paragraph> <Paragraph position="9"> First, &quot;the&quot; creates the active arc (1) relevant to the pattern &quot;the X&quot;. &quot;bus&quot; creates the passive arc (2). The passive arc (3) is created by combining (1) and (2). &quot;noun-verb&quot; creates the active arc (4), whereby the variable X of &quot;X noun-verb Y&quot; is matched against (3). &quot;goes&quot; creates the passive arc (5), and the passive arc (6) is created by combining (4) and (5).
&quot;to&quot; creates the active arc (7), whereby the variable X of &quot;X to Y&quot; at the verb phrase level is matched against (5).</Paragraph> <Paragraph position="10"> Footnote 1: There are other types of patterns, such as &quot;X α Y β Z&quot;, where α and β are constituent boundaries. They can be easily processed by slightly extending the algorithm.</Paragraph> <Paragraph position="11"> We continue the procedure incrementally.</Paragraph> <Paragraph position="12"> When the rightmost word has been processed, the derivation of the passive arc of the whole input gives the parsed result, in our example the derivation of the passive arc (20), which is the combination of (4) and (19).</Paragraph> </Section> </Section> <Section position="7" start_page="414" end_page="415" type="metho"> <SectionTitle> 5 Preference of substructure </SectionTitle> <Paragraph position="0"> The passive arc (19), which is relevant to &quot;goes to Chinatown at ten a.m.&quot;, has two competing results. One is the combination of (7) and (18), where &quot;X at Y&quot; is a noun phrase. The other is the combination of (12) and (17), where &quot;X at Y&quot; is a verb phrase. Thus, (19) has two possible structures by the application of &quot;X at Y&quot;. &quot;X to Y&quot; at the verb phrase level and &quot;X a.m.&quot; at the compound noun level are also applied.</Paragraph> <Paragraph position="1"> Substructure preference is obtained by determining the best substructure at the moment the relevant passive arc is created. Only the best substructure is retained and combined with other arcs.</Paragraph> <Section position="1" start_page="414" end_page="414" type="sub_section"> <SectionTitle> 5.1 Semantic distance </SectionTitle> <Paragraph position="0"> The most appropriate structure is selected by computing the total sum of all possible combinations of partial semantic distance values.
The structure with the least total distance is judged most consistent with empirical knowledge and is chosen as the most plausible structure.</Paragraph> <Paragraph position="1"> The semantic distance between words is calculated according to the relationship of the positions of the words' semantic attributes in the thesaurus.</Paragraph> <Paragraph position="2"> The distance between expressions is the sum of the distances between the words comprising the expressions, multiplied by some weights (Sumita, 1992).</Paragraph> </Section> <Section position="2" start_page="414" end_page="414" type="sub_section"> <SectionTitle> 5.2 Head word information </SectionTitle> <Paragraph position="0"> The input for distance calculation consists of the head words of the expressions bound to variables. The head part is designated in each pattern. Table 3 shows the head parts of the possible substructures for &quot;goes to Chinatown at ten a.m.&quot;, which corresponds to the passive arc (19).</Paragraph> <Paragraph position="2"> In &quot;X at Y&quot; for the substring &quot;goes to Chinatown at ten a.m.&quot; combined with (12) and (17), the variables X and Y are bound to the compound expressions &quot;goes to Chinatown&quot; and &quot;ten a.m.&quot;, respectively. Thus, in &quot;X at Y&quot; for the structure in (19), the input for distance calculation is &quot;goes&quot; for &quot;X&quot; and &quot;a.m.&quot; for &quot;Y&quot;. Since the head of &quot;X at Y&quot; is designated as &quot;X&quot;, &quot;goes&quot; becomes the head word for (19). This information is used when (19) is combined with another substring.</Paragraph> </Section> <Section position="3" start_page="414" end_page="415" type="sub_section"> <SectionTitle> 5.3 Structure selection </SectionTitle> <Paragraph position="0"> The difference in total distance value between the two possible structures is due only to the distance value of &quot;X at Y&quot;.
Table 4 shows the results of the distance calculation in &quot;X at Y&quot; for the combination of (7) and (18), and for that of (12) and (17).</Paragraph> <Paragraph position="1"> (goes, a.m.) expresses the bindings for variables X and Y, where X = &quot;goes&quot; and Y = &quot;a.m.&quot;. &quot;X'&quot; is the target expression corresponding to &quot;X&quot;.</Paragraph> <Paragraph position="2"> noun phrase: (Chinatown, a.m.), (morning, a.m.), Y' no X', 0.50; verb phrase: (goes, a.m.), (depart, a.m.), Y' ni X', 0.21. According to the distance calculation in the combination of (7) and (18), &quot;Y' no X'&quot;, with the distance value 0.50, is selected as a target expression. In the combination of (12) and (17), &quot;Y' ni X'&quot;, with the distance value 0.21, is selected as a target expression. Thus, the combination of (12) and (17) is selected as the structure of the passive arc (19). Based on the results of distance calculations, the other partial source patterns for (19), &quot;X to Y&quot; and &quot;X a.m.&quot;, are transferred to &quot;Y' ni X'&quot; with the distance value 0.12, and &quot;gozen X' ji&quot; with the distance value 0.00. Thus, the passive arc (19) has its source and target structure through the combination of (12) and (17), the total distance value 0.33, and the head word &quot;goes&quot;. Then, the structure of the whole input string, which corresponds to (20), is constructed by combining (19) with (4). In this combination, &quot;X noun-verb Y&quot; matches the input string and is transferred to &quot;X' wa Y'&quot; based on the result of distance calculation. From the combined structure for (20), the sentence below is generated after the adjustment necessary for Japanese grammar. The words &quot;bus&quot;, &quot;goes&quot;, and &quot;Chinatown&quot; are transferred to &quot;basu&quot;, &quot;iku&quot;, and &quot;Chainataun&quot; (footnote 2), respectively.
Basu wa gozen 10 ji ni Chainataun ni iki masu. &quot;iki&quot; is the conjugated form of &quot;iku&quot; followed by &quot;masu&quot;, a polite sentence-final form.</Paragraph> </Section> </Section> <Section position="8" start_page="415" end_page="416" type="metho"> <SectionTitle> 6 Preliminary Experiment </SectionTitle> <Paragraph position="0"> In this section, we perform English-to-Japanese translation to compare the efficiency of the top-down pattern application with that of our new method, based on the bottom-up application and substructure preference in the TDMT prototype system.</Paragraph> <Section position="1" start_page="415" end_page="415" type="sub_section"> <SectionTitle> 6.1 TDMT prototype system </SectionTitle> <Paragraph position="0"> The TDMT prototype system, whose domain is travel conversations, is designed to achieve multi-lingual spoken-language translation (Furuse, 1995). While language-oriented modules, such as morphological analysis and generation, are provided to treat multi-lingual translation, the transfer module, which is a central component, is a common part of the translation system for every language pair. The system is written in LISP and runs on a UNIX machine. Presently, the prototype system can translate in both directions between Japanese and English and between Japanese and Korean. In English-to-Japanese translation, the present vocabulary size is about 3,000 words (footnote 3) and the number of training sentences is about 2,000.</Paragraph> <Paragraph position="1"> Footnote 2: The prototype system assigns a default target expression to a surface source expression. Another target expression is selected when a specific example in the transfer knowledge is closest to the input.</Paragraph> </Section> <Section position="2" start_page="415" end_page="416" type="sub_section"> <SectionTitle> 6.2 Experimental results </SectionTitle> <Paragraph position="0"> We have compared translation times in the TDMT prototype system for two cases.
One case utilizes top-down application; the other case utilizes the new application method presented in this paper, which adopts bottom-up pattern application and retains only one substructure using semantic distance calculation. The translation times are measured using a Sparc10 workstation.</Paragraph> <Paragraph position="1"> We have experimented with the translation times of some English sentences into Japanese.</Paragraph> <Paragraph position="2"> The following sentences cause only minor structural ambiguity. Note that a comma is not used in the input sentence, because it is assumed to be a spoken-language input such as the output of speech recognition.</Paragraph> <Paragraph position="3"> (1) I have a reservation for tomorrow.</Paragraph> <Paragraph position="4"> (2) Will my laundry be ready by tomorrow? (3) You can walk there in about three minutes.</Paragraph> <Paragraph position="5"> (4) Then may I have your credit card number please? Table 5 shows the translation time of the above sentences. For these translations, not much difference could be seen between the new bottom-up method and the top-down method. For such inputs TDMT can quickly produce the same translation results with either method.</Paragraph> <Paragraph position="6"> The following sentences cause much structural ambiguity because of PP-attachment, relative clauses, conjunctions, etc.</Paragraph> <Paragraph position="7"> (5) This sales clerk doesn't understand anything I say and I'm wondering if you would help me explain what I want.</Paragraph> <Paragraph position="8"> (6) Could I please have your name the date of arrival and the number of persons in your party?
(7) Tell someone at the front desk what game you want to see and what type of seat you want and they'll get the tickets for you.</Paragraph> <Paragraph position="9"> (8) I left some laundry to be cleaned but I can't remember where the cleaners is and I was wondering if you could help me.</Paragraph> <Paragraph position="10"> Table 6 shows the translation time of the above sentences. In the above translations the same translation results could again be obtained for both methods. However, the new method can achieve a far more efficient translation than the top-down method. Average translation times in the top-down method were 1.15 seconds for a 10-word input and 10.87 seconds for a 20-word input. Average translation times in the bottom-up method were 0.55 seconds for a 10-word input and 2.04 seconds for a 20-word input. The translation time in the top-down method is considered to be closely related to the number of possible structures, while the translation time in our new method does not directly reflect this number. The increase in the number of substructures retained with the new method is much smaller than that of the number of possible structures in the top-down method. Therefore, our new method can efficiently translate a longer input string having many competing structures. Also, we have performed a small translation-quality experiment on the two pattern application methods with 95 untrained sentences within the system's vocabulary. Both the top-down method and the proposed bottom-up method gave the correct translation for the same 60 sentences, a success rate of 63.2%. For only two sentences, different structures were produced by the two methods; however, all of them were incorrect translations.
This experimental result shows that our new translation strategy maintains translation quality.</Paragraph> <Paragraph position="11"> Similar results, which show the usefulness of the new TDMT for spoken-language translation, were obtained in other types of translation such as Japanese-to-English (or -Korean) translation.</Paragraph> </Section> </Section> </Paper>
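<!-- Illustrative note on Section 5.3. The selection of the best substructure for the passive arc (19) can be sketched in a few lines of Python. This is not the TDMT implementation (the prototype is written in LISP); the names CANDIDATES, total_distance and select_best_structure, and the dictionary layout, are assumptions made only for illustration. The distance values 0.50, 0.21, 0.12 and 0.00 are the ones reported in Section 5.3, and the noun-phrase total relies on the statement that the two readings differ only in the distance of "X at Y".

```python
# Hypothetical sketch of best-structure selection for passive arc (19).
# Each candidate maps the patterns it applies to their semantic distance
# values; the numbers are those quoted in Section 5.3 of the paper.
CANDIDATES = {
    "(12)+(17), X at Y as verb phrase": {
        "X at Y": 0.21,
        "X to Y": 0.12,
        "X a.m.": 0.00,
    },
    "(7)+(18), X at Y as noun phrase": {
        "X at Y": 0.50,
        "X to Y": 0.12,
        "X a.m.": 0.00,
    },
}


def total_distance(pattern_distances):
    """Sum the partial distance values of one candidate structure."""
    return sum(pattern_distances.values())


def select_best_structure(candidates):
    """Return (name, total) for the candidate with the least total distance."""
    best_name = min(candidates, key=lambda name: total_distance(candidates[name]))
    return best_name, total_distance(candidates[best_name])


if __name__ == "__main__":
    name, total = select_best_structure(CANDIDATES)
    # Round for display only; the raw float sum may carry rounding noise.
    print(name, round(total, 2))
```

Under these assumptions the verb-phrase combination of (12) and (17) is chosen with a total of about 0.33, against about 0.62 for the noun-phrase combination, which agrees with the structure the paper retains for arc (19). -->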