<?xml version="1.0" standalone="yes"?> <Paper uid="C04-1015"> <Title>Example-based Machine Translation Based on Syntactic Transfer with Statistical Models</Title> <Section position="3" start_page="0" end_page="11" type="intro"> <SectionTitle> 2 Example-based Syntactic Transfer </SectionTitle> <Paragraph position="0"> The example-based syntactic transfer used in this paper is a revised version of the Hierarchical Phrase Alignment-based Translator (HPAT; Imamura, 2002). This section gives an overview, using Japanese-to-English machine translation as an example.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.1 Transfer Rules </SectionTitle> <Paragraph position="0"> Transfer rules are automatically acquired from bilingual corpora by using hierarchical phrase alignment (HPA; Imamura, 2001). HPA parses bilingual sentence pairs and acquires the corresponding syntactic nodes of the source and target sentences.</Paragraph> <Paragraph position="1"> The transfer rules are created from these node correspondences. Figure 3 shows an example of the transfer rules. Variables, such as X and Y in Figure 3, denote non-terminal symbols that correspond between the source and target grammars. The set of transfer rules is regarded as a synchronized context-free grammar.</Paragraph> <Paragraph position="2"> The difference between this approach and a conventional synchronized context-free grammar is that source examples are added to each transfer rule. A source example is an instance (i.e., a headword) of the variables that appeared in the training corpora. 
For example, the source example of Rule 1 in Figure 3 is obtained from a phrase pair consisting of the Japanese verb phrase &quot;furaito (flight) wo yoyaku-suru (reserve)&quot; and the English verb phrase &quot;make a reservation for the flight.&quot;</Paragraph> </Section> <Section position="2" start_page="0" end_page="11" type="sub_section"> <SectionTitle> 2.2 Syntactic Transfer Process </SectionTitle> <Paragraph position="0"> When an input sentence is given, the target tree structure is constructed in the following three steps.</Paragraph> <Paragraph position="1"> 1. The input sentence is parsed by using the source grammar of the transfer rules. 2. The nodes in the source tree are mapped to the target nodes by using the transfer rules. 3. If non-terminal symbols remain in the leaves of the target tree, candidate translations are inserted by referring to the translation dictionary. An example of the syntactic transfer process is shown in Figure 4 for the input sentence &quot;basu wa 11 ji ni de masu (The bus will leave at 11 o'clock).&quot; Two points in this figure are worth noting. First, nodes in which the word order is inverted are generated after transfer (cf. the VP node represented by a bold frame); that is, word re-ordering is achieved by syntactic transfer. (Figure 3 columns: No., Source Grammar, Target Grammar, Source Example. In Figure 4, bold frames are syntactic nodes mentioned in the text.)</Paragraph> <Paragraph position="3"/> <Paragraph position="5"/> <Paragraph position="7"> Second, words that do not correspond between the source and target sentences (e.g., the determiner 'a' or 'the') are automatically inserted or eliminated by the target grammar (cf. the NP node represented by a bold frame). 
In other words, transfer rules work in a manner similar to distortion, fertility, and NULL in the IBM models.</Paragraph> </Section> <Section position="3" start_page="11" end_page="11" type="sub_section"> <SectionTitle> 2.3 Usage of Source Examples </SectionTitle> <Paragraph position="0"> Example-based transfer uses the source examples to disambiguate mapping and parsing.</Paragraph> <Paragraph position="1"> Specifically, the semantic distance (Sumita and Iida, 1991) is calculated between the source examples and the headwords of the input sentence, and the transfer rules that contain the nearest example are used to construct the target tree structure. The semantic distance between words is defined as the distance from the leaf node to the most specific common abstraction (MSCA) in a thesaurus (Ohno and Hamanishi, 1984).</Paragraph> <Paragraph position="2"> For example, if the input phrase &quot;ie (home) ni kaeru (return)&quot; is given, Rules 1 to 3 in Figure 3 all apply to the syntactic transfer, and without disambiguation three target nodes would be generated.</Paragraph> <Paragraph position="3"> However, when the source examples are compared with the headwords of the variables, X (ie) and Y (kaeru), only Rule 2 is used for the transfer because its example (soko (there), yuku (go)) is the nearest in semantic distance. In the current implementation, all rules whose examples are at the same distance are used.</Paragraph> <Paragraph position="4"> Consequently, example-based transfer achieves translation that takes case relations and idiomatic expressions into account, based on the semantic distance from the source examples.</Paragraph> </Section> </Section> </Paper>
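<!--
Editorial sketch for Section 2.3 (Usage of Source Examples), kept in an XML comment so the file stays well-formed. It is a minimal Python illustration of choosing transfer rules by semantic distance, not the implementation used in the paper. Only the input headwords (ie, kaeru) and the Rule 2 example (soko, yuku) come from the text. The thesaurus paths, the rule names R1 to R3, the examples attached to R1 and R3, the normalization by path depth, and the summing of distances over X and Y are all assumptions made for illustration.

```python
from fractions import Fraction

# Hypothetical toy thesaurus: each word maps to its path of categories,
# from the root down to the leaf. All paths are invented for illustration.
THESAURUS = {
    "ie": ("place", "location", "home"),
    "soko": ("place", "location", "there"),
    "kaeru": ("action", "motion", "return"),
    "yuku": ("action", "motion", "go"),
    "furaito": ("thing", "transport", "flight"),
    "yoyaku-suru": ("action", "arrangement", "reserve"),
    "kare": ("agent", "person", "he"),
    "au": ("action", "contact", "meet"),
}

# Source examples (headword of X, headword of Y) per rule.
# Only the R2 example is taken from the text; R1 and R3 are placeholders.
RULES = {
    "R1": ("furaito", "yoyaku-suru"),
    "R2": ("soko", "yuku"),
    "R3": ("kare", "au"),
}


def semantic_distance(word_a, word_b):
    """Levels climbed from the leaf of word_a up to the most specific
    common abstraction (MSCA) shared with word_b, divided by the depth
    of word_a (assumed normalization): 0 = same word, 1 = nothing shared."""
    path_a, path_b = THESAURUS[word_a], THESAURUS[word_b]
    shared = 0
    for node_a, node_b in zip(path_a, path_b):
        if node_a != node_b:
            break
        shared += 1
    depth = len(path_a)
    return Fraction(depth - shared, depth)


def rule_distance(example, headwords):
    """Distance of one rule: sum over its variables (assumed combination)."""
    return sum(semantic_distance(e, h) for e, h in zip(example, headwords))


def select_rules(rules, headwords):
    """Return every rule whose example is at the smallest distance,
    so that ties keep all equally near rules."""
    scored = {name: rule_distance(ex, headwords) for name, ex in rules.items()}
    best = min(scored.values())
    return sorted(name for name, dist in scored.items() if dist == best)


if __name__ == "__main__":
    print(select_rules(RULES, ("ie", "kaeru")))
```

With this toy data, the input headwords (ie, kaeru) are closest to the example (soko, yuku), so only R2 is selected. Exact fractions are used so that ties between rules are compared without floating-point rounding.
-->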