<?xml version="1.0" standalone="yes"?> <Paper uid="W02-0712"> <Title>Automatic Interpretation System Integrating Free-style Sentence Translation and Parallel Text Based Translation</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 The Integration Model </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.1 User Interface </SectionTitle> <Paragraph position="0"> Although parallel text based translation provides a correct result, the registered parallel bilingual sentences cannot cover all possible utterances by the user in the target domain. Free-style sentence translation, on the contrary, accepts free-style input sentences but provides no guarantee as to the quality of results.</Paragraph> <Paragraph position="1"> For many routine situations, users will clearly benefit from using parallel text based translation.</Paragraph> <Paragraph position="2"> In such cases, the system will probably include a sentence that totally or partially fits what they want to say. To ensure high translation reliability, users should use free-style sentence translation only for utterances not covered by the registered sentences.</Paragraph> <Paragraph position="3"> However, users usually will not know what sentences are registered in the system and will have to search for an appropriate sentence before they can use parallel text based translation. In some cases, the user will be forced to use free-style sentence translation if unable to find an appropriate sentence.</Paragraph> <Paragraph position="4"> A seamless user interface that allows the user to easily switch between free-style sentence translation and parallel text based translation is therefore needed in a system integrating these two forms of translation. Two conditions in particular had to be met to make the system easy to use.</Paragraph> <Paragraph position="5"> 1. The user should be able to use an input sentence seamlessly as both a source sentence for free-style sentence translation and a key sentence for registered sentence retrieval.</Paragraph> <Paragraph position="6"> 2. The user should be able to use each sentence included in the results of the registered sentence retrieval and the input sentence as a source sentence for translation. (The former would be used for parallel text based translation, and the latter would be used for free-style sentence translation.)</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.2 Content of Registered Sentences </SectionTitle> <Paragraph position="0"> Registered sentences must cover the utterances necessary for accomplishing typical tasks in the target domain to provide correct translation for minimal communication. In a translation system for overseas travelers, some typical tasks are changing money, checking in at a hotel, and ordering at a restaurant.</Paragraph> <Paragraph position="1"> We adopted a three-tier model that consists of scenes, tasks, and subtasks to prepare a sufficient set of necessary sentences to be registered in the system. A scene comprised a place or situation that corresponds to where a traveler is likely to be (e.g., a hotel) and a problem that could arise. 
We made a list of typical travelers' tasks that would be necessary in various travel scenes, divided each task into smaller primitive tasks (subtasks), and assigned a sentence template to each subtask based on the model.</Paragraph> <Paragraph position="2"> In general, more than one round of conversation is necessary to accomplish each task. We assumed that a task would consist of smaller subtasks, each of which would correspond to one round of conversation consisting simply of an utterance from a traveler to a respondent and a response from the respondent to the traveler. For example, the task of checking in to a hotel consists of subtasks such as giving your name, confirming your departure date, and so on. Each subtask should be the smallest unit of a task because users cannot use a registered sentence effectively if it includes more than what they want to say.</Paragraph> <Paragraph position="3"> In this way, only one sentence template is needed for each subtask with regard to an utterance from a traveler to a respondent. For example, we can assign the sentence template &quot;I'd like to have ....&quot; to the subtask of ordering a dish in a restaurant. We can provide a sufficient number of sentences by enabling the user to fill in the part denoted as &quot;...&quot; (referred to as a slot) with words applicable to the situation.</Paragraph> <Paragraph position="4"> Table 1 shows examples of scenes, tasks, subtasks, and sentence templates. An underlined part represents a slot. We define a list of words individually for each slot.</Paragraph> <Paragraph position="5"> For each task, both the utterances from a traveler to a respondent and the responses from a respondent to a traveler are significant. Responses should also be supported by parallel text based translation to ensure reliable communication. However, inputting the response and retrieving a registered sentence that matches it would be difficult and time-consuming for the respondent, who is unlikely to be familiar with the translation system.</Paragraph> <Paragraph position="6"> We therefore use a system that presents a menu of responses for the respondent to choose from. The system keeps typical responses in parallel bilingual form for each registered sentence that the traveler can use and displays these as candidate responses when the traveler uses the sentence. The system then shows the traveler the translation of the response selected by the respondent.</Paragraph> <Paragraph position="7"> This approach enables travelers to obtain a reliable response and also enables respondents to easily select an appropriate response.</Paragraph>
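The following is a minimal sketch (in Python) of how the scene/task/subtask model, the slot-based sentence templates, and the linked response candidates described above could be represented and expanded into registered sentence pairs. It is not the authors' implementation; all class, field, and example names (Template, Subtask, expand, the sample slot words) are illustrative assumptions.

# Illustrative sketch of the three-tier model with slot-based templates.
# Not the authors' code; names and sample data are assumptions.
from dataclasses import dataclass, field
from itertools import product
from typing import Dict, List, Tuple

@dataclass
class Template:
    text_ja: str                                                # e.g. "{dish} o onegai shimasu."
    text_en: str                                                # e.g. "I'd like to have {dish}."
    slots: Dict[str, List[str]] = field(default_factory=dict)   # slot name -> word list
    responses: List["Template"] = field(default_factory=list)   # linked response candidates

    def expand(self) -> List[Tuple[str, str]]:
        """Generate every registered (Japanese, English) pair by filling all slots."""
        if not self.slots:
            return [(self.text_ja, self.text_en)]
        names = list(self.slots)
        pairs = []
        for words in product(*(self.slots[n] for n in names)):
            fill = dict(zip(names, words))
            pairs.append((self.text_ja.format(**fill), self.text_en.format(**fill)))
        return pairs

@dataclass
class Subtask:
    name: str            # e.g. "order a dish"; one template per subtask
    template: Template

@dataclass
class Task:
    name: str            # e.g. "ordering at a restaurant"
    subtasks: List[Subtask]

@dataclass
class Scene:
    name: str            # e.g. "restaurant"
    tasks: List[Task]

# Example: one subtask of the "ordering" task in the "restaurant" scene.
order = Template("{dish} o onegai shimasu.", "I'd like to have {dish}.",
                 slots={"dish": ["coffee", "a steak"]},
                 responses=[Template("Kashikomarimashita.", "Certainly.")])
restaurant = Scene("restaurant",
                   [Task("ordering at a restaurant", [Subtask("order a dish", order)])])
print(order.expand())    # two (Japanese, English) sentence pairs, one per slot word

Under such a representation, keeping one template per subtask means the slot word lists alone determine how many concrete sentences each subtask contributes, which mirrors how the 2590 templates expand to 7410 sentences in the prototype described in Section 3.2.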
</Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.3 Retrieval System </SectionTitle> <Paragraph position="0"> The retrieval system we use to search for a registered sentence is based on a combination of three conditions -- the natural language sentence, the scene, and the action.</Paragraph> <Paragraph position="1"> Registered sentence retrieval based on a natural language sentence is essential for seamless integration of free-style sentence translation and parallel text based translation. We used a simple keyword-based retrieval system for registered sentence retrieval. This system extracts keywords from an input natural language sentence, searches for sentences including the keywords, and presents the results ranked mainly by the number of keywords included in each sentence.</Paragraph> <Paragraph position="2"> The system retrieves all sentences including more than one keyword to reduce the chance of an appropriate sentence not being retrieved. We overcame the increased retrieval noise in the results by applying an additional retrieval system that searches for registered sentences in terms of the scene and action.</Paragraph> <Paragraph position="3"> Each registered sentence to be retrieved for translation corresponds to a set of a scene, a task, and a subtask, as described in the previous section. A scene represents a place or a situation where the user wishes to accomplish the task and the subtask. A task and a subtask represent the user's actions. This means that the user's utterance is related to the user's intention, that is, where (the scene) the user wants to do something (the action).</Paragraph> <Paragraph position="4"> We use the additional retrieval system in situations where the user has to search for sentences from the following two perspectives.</Paragraph> <Paragraph position="6"> 1) Search by scene The number of scenes in which travelers are likely to have a conversation is limited, and the scenes can be systematically classified by place, such as an airport, a hotel, or a restaurant.</Paragraph> <Paragraph position="7"> We provide a directory-type search system that can be used to search for sentences by scene. We built the travel-scene directory tree and assigned sentences to the leaf nodes of the tree. When the user selects a scene in the tree, the sentences belonging to that scene are presented to the user. The selected scene does not change until the user selects another scene in this search system, since the user generally will not move to a different scene while talking.</Paragraph> <Paragraph position="8"> 2) Search by action Since it is difficult to represent actions with keywords and a traveler's range of probable actions in overseas travel is limited, we also provide a directory-type search system to search for sentences by action. We constructed a directory tree of traveler actions, and the user can obtain the sentences used for an action by selecting the action from the tree.</Paragraph> <Paragraph position="9"> By inputting a natural language sentence and selecting a scene and an action, the user can obtain sentences that include the keywords extracted from the input sentence and that match the selected scene and action. When the user selects a different scene or action, the system searches the registered sentences again using the new scene or action condition along with the conditions that were not changed. This enables the user to dynamically adjust the search conditions.</Paragraph>
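The following is a minimal Python sketch of the three-condition retrieval just described: sentences sharing keywords with the input are ranked by the number of shared keywords and, when a scene or action has been selected, restricted to it. It is an illustration rather than the system's actual code; RegisteredSentence, extract_keywords, and retrieve are assumed names, and the whitespace tokenizer stands in for real (Japanese) morphological analysis.

# Illustrative sketch of keyword retrieval narrowed by scene and action.
# Not the system's actual code; names and the tokenizer are assumptions.
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

@dataclass
class RegisteredSentence:
    text: str            # the registered source-language sentence
    keywords: Set[str]   # content words indexed for retrieval
    scene: str           # leaf node of the scene directory, e.g. "hotel"
    action: str          # node of the action directory, e.g. "request"

def extract_keywords(sentence: str) -> Set[str]:
    # Stand-in for morphological analysis: split on whitespace, drop short words.
    return {w.lower() for w in sentence.split() if len(w) > 2}

def retrieve(query: str,
             db: List[RegisteredSentence],
             scene: Optional[str] = None,
             action: Optional[str] = None,
             min_keywords: int = 1) -> List[RegisteredSentence]:
    """Rank sentences by keyword overlap with the query, within the selected
    scene/action when one is given. The keyword threshold is kept low so that
    an appropriate sentence is unlikely to be missed; the scene and action
    conditions compensate for the extra retrieval noise."""
    query_keywords = extract_keywords(query)
    scored: List[Tuple[int, RegisteredSentence]] = []
    for s in db:
        if scene is not None and s.scene != scene:
            continue
        if action is not None and s.action != action:
            continue
        overlap = len(query_keywords & s.keywords)
        if overlap >= min_keywords:
            scored.append((overlap, s))
    scored.sort(key=lambda pair: pair[0], reverse=True)   # more shared keywords first
    return [s for _, s in scored]

# Changing one condition re-runs the search with the others kept, e.g.:
#   retrieve(input_sentence, db)                 # natural language sentence only
#   retrieve(input_sentence, db, scene="bank")   # narrowed down by scene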
</Section> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 Prototype System </SectionTitle> <Paragraph position="0"> We have integrated free-style sentence translation (Watanabe et al., 2000) and parallel text based translation based on the model described in the previous section and built a new prototype system. Here, we describe the system configuration, the contents of the registered sentences in the system, and the scene and action directories. We also explain how the user operates the system interface.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.1 System Configuration </SectionTitle> <Paragraph position="0"> The prototype system consists of six components -- speech recognition, machine translation, registered sentence retrieval, parallel text based translation, a registered sentence database, and speech synthesis (Figure 1). We have utilized the speech recognition, machine translation, and speech synthesis components described in (Watanabe et al., 2000). The registered sentence retrieval component searches the registered sentence database using the system input. The parallel text based translation component produces a translation of the registered sentence selected by the user from the search results provided by the registered sentence retrieval component. The system input can be used both as the source sentence for machine translation and as a search key for registered sentence retrieval, according to the user's selection.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.2 Registered Sentences </SectionTitle> <Paragraph position="0"> We first listed a traveler's typical tasks in eleven scenes where travelers often have to speak to people and then made a list of typical subtasks by analyzing the process necessary to accomplish each task. Next, we composed a sentence template for each subtask and a list of typical words that could be inserted into each slot of the templates. We have composed 2590 templates, which can be used to generate 7410 sentences with the slot word-lists, and have installed these in the system.</Paragraph> <Paragraph position="1"> We have also composed 1185 templates, which can be used to generate 1796 sentences through slot word expansion, as response candidates for the respondent. Sharing a set of response candidates among several sentences for the traveler decreases the total number of response templates needed. A set of response candidates is linked to every sentence for the traveler to which the respondent can respond.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.3 Scene and Action Directories </SectionTitle> <Paragraph position="0"> For each of the eleven scenes, we listed the relevant tasks in a two-layered tree with 70 leaf nodes to create a scene directory. Table 2 shows the top-layer nodes of the scene directory.</Paragraph> <Paragraph position="1"> We used only the six actions listed in Table 3 for the action directory and constructed a one-layered tree, since it is difficult for the user to select an action if the actions are classified in too much detail.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.4 User Interface </SectionTitle> <Paragraph position="0"> Figure 2 shows the display screen of the prototype system. In this example, the user inputs the Japanese sentence &quot;Kono hoteru kara kūkō ni iku basu wa arimasuka. (Is there a bus going to the airport from this hotel?)&quot; by speaking. The result of the speech recognition is displayed in the input window at the center of the screen.</Paragraph> <Paragraph position="1"> When the user clicks the &quot;kensaku jikkō (search)&quot; button on the screen, the system searches among the registered sentences using the input sentence as a key and displays the search result under the input window (Figure 3). The sky-blue background in the window indicates the sentence currently selected as the target for translation. The user can select a different sentence, including the input sentence itself, by clicking on it.</Paragraph>
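As the next paragraphs describe, the &quot;honyaku (translate)&quot; button behaves differently depending on whether the user has selected a registered sentence or the input sentence itself. A minimal Python sketch of that dispatch is given below; translate_selection, parallel_db, and machine_translate are hypothetical names, not the prototype's actual interfaces.

# Illustrative sketch of the translate-button dispatch; names are assumptions.
from typing import Callable, Dict, List, Tuple

# parallel_db maps a registered Japanese sentence to its stored English
# translation and the list of response candidates linked to it.
ParallelEntry = Tuple[str, List[str]]

def translate_selection(selected: str,
                        parallel_db: Dict[str, ParallelEntry],
                        machine_translate: Callable[[str], str]) -> ParallelEntry:
    """Translate whichever sentence the user selected in the result window."""
    if selected in parallel_db:
        # Parallel text based translation: look up the registered translation
        # together with its response candidates for the respondent.
        return parallel_db[selected]
    # Free-style sentence translation: fall back to machine translation;
    # no response candidates are linked to a free-style input.
    return machine_translate(selected), []

# translation, responses = translate_selection(chosen_text, parallel_db, mt_engine)
# The translation would then be displayed and read out by speech synthesis.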
<Paragraph position="2"> When the user clicks the &quot;honyaku (translate)&quot; button after selecting the first registered sentence in Figure 3, the system retrieves the English translation registered with the Japanese sentence, displays it (Figure 4), and reads it aloud through speech synthesis.</Paragraph> <Paragraph position="3"> If the user cannot find an appropriate sentence in the search results, the user can resort to free-style sentence translation. When the user clicks the &quot;honyaku (translate)&quot; button after selecting the input sentence, the system translates it into English through machine translation, displays it (Figure 5), and reads it aloud through speech synthesis.</Paragraph> <Paragraph position="4"> In this way, the user can use free-style sentence translation and parallel text based translation seamlessly for the same input sentence.</Paragraph> <Paragraph position="5"> Next, we explain how a user can narrow down the search result by using the directories.</Paragraph> <Paragraph position="6"> Figure 6 shows the system display when the user inputs the Japanese sentence &quot;Kozeni o irete kudasai. (I'd like some small change.)&quot; by speaking and searches for a matching registered sentence. The search result is displayed in the lower central window. In this case, no appropriate sentence appears among the higher-ranking sentences.</Paragraph> <Paragraph position="7"> In Figure 6, the scene directory is displayed in the left part of the window. When the user selects the scene &quot;Denwa / Yūbin / Ginkō (Telephone / Mail / Bank)&quot;, the search result is narrowed down to the sentences associated with the &quot;Telephone / Mail / Bank&quot; scene (Figure 7). The registered sentence &quot;Kozeni o mazete itadakemasuka. (I'd like some small change.)&quot; is then displayed second in the results, and the user can use this sentence for translation.</Paragraph> <Paragraph position="8"> The user can similarly narrow down the result with the action directory. If necessary, the user can also use a combination of the scene and action directories.</Paragraph> <Paragraph position="9"> We next explain how a respondent can respond by selecting from among the response candidates registered in the system.</Paragraph> <Paragraph position="10"> Figure 8 shows the screen of the system when the user selects the registered Japanese sentence &quot;Kore wa chūmon to chigaimasu. (This is not what I ordered.)&quot; and translates it. The English translation is displayed in the upper central window of the screen, and the response candidates are listed in the lower central window.</Paragraph> <Paragraph position="11"> When the respondent selects the first response from these and clicks the &quot;Trans&quot; button, the system displays the Japanese translation of the response (Figure 9) and reads it aloud through speech synthesis.</Paragraph> <Paragraph position="12"> In this way, the respondent can easily respond to the traveler by selecting a response from among the provided candidates when the traveler uses a registered sentence. The traveler can thus fully understand the response.</Paragraph> </Section> </Section> </Paper>