<?xml version="1.0" standalone="yes"?>
<Paper uid="A94-1005">
  <Title>Machine Translation of Sentences with Fixed Expressions</Title>
  <Section position="3" start_page="0" end_page="28" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> We are developing an English-to-Japanese machine translation (MT) system to produce real-time rough translations of Associated Press (AP) wire service news stories. For some news topics, trouble with fixed expressions lowers the translation accuracy of the MT system. Economic news stories in particular are difficult to translate by conventional rule-based methods, because they contain many fixed expressions sharing two major characteristics: c1) The fixed expressions produce economics-specific syntactic structures.</Paragraph>
    <Paragraph position="1"> c2) Equivalents of the fixed expressions require Japanese economic jargon.</Paragraph>
    <Paragraph position="2"> These characteristics respectively cause two major bottlenecks for a conventional rule-based MT system: b1) General-purpose grammatical rules are not sufficient to yield a correct analysis of economic news stories. (Simply adding grammatical rules increases syntactic ambiguity.) b2) It is difficult to select the appropriate Japanese words for the translation.</Paragraph>
    <Paragraph position="3"> Actually, these problems reduce the translation accuracy of our rule-based MT system to only 20%, which is too low for practical use.</Paragraph>
    <Paragraph position="4"> This paper presents a new English-to-Japanese MT system for economic news stories, called ENTS (Economic News stories machine Translation System), which processes fixed expressions effectively. ENTS consists of three sequential processes (as shown in Fig. 1), based on the three basic types of economic news sentences. Process 1 is a kind of example-based approach, while Processes 2 and 3 are rule-based ones that differ in their grammatical rules. This paper focuses mainly on Process 1, which is composed of fixed sentence translation, compound word translation, fixed sentence translation data production, and fixed sentence extraction. Fixed sentence translation data (STRA data), which is a kind of bilingual template, plays a key role in fixed sentence translation. The STRA data is built automatically from fixed English sentences extracted from a large corpus and their corresponding Japanese translations.</Paragraph>
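    <Paragraph><![CDATA[ The idea of a bilingual template followed by rule-based fallback can be illustrated with a minimal sketch. It is not the ENTS implementation: the names Template, compile_english, translate_fixed and translate are hypothetical, the slot syntax and the restriction of slots to numbers are assumptions made for brevity, and the Japanese string is an illustrative rendering rather than actual STRA data.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Template:
    """Hypothetical stand-in for one bilingual template entry."""
    english: str   # English side, with {slot} placeholders
    japanese: str  # Japanese side, reusing the same slot names


def compile_english(pattern: str) -> "re.Pattern[str]":
    """Build a regex with one named group per {slot} in the English side."""
    # re.split with one capture group alternates literal text and slot names.
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)          # fixed wording must match exactly
        else:
            # Assumption: slots hold only numbers such as "17.76" or "5".
            regex += "(?P<" + part + ">[0-9][0-9.,]*)"
    return re.compile(regex + r"\.?")         # tolerate a final period


def translate_fixed(sentence: str, templates: Sequence[Template]) -> Optional[str]:
    """Return a template-based translation, or None if no template matches."""
    for template in templates:
        match = compile_english(template.english).fullmatch(sentence.strip())
        if match:
            return template.japanese.format(**match.groupdict())
    return None


def translate(sentence: str,
              processes: Sequence[Callable[[str], Optional[str]]]) -> Optional[str]:
    """Try each process in order and return the first result that is not None."""
    for process in processes:
        result = process(sentence)
        if result is not None:
            return result
    return None


# Illustrative entry only; the Japanese side is an assumed rendering.
DEMO_TEMPLATES = [
    Template(
        english="Malaysian tin closed at {price} dollars per kilo",
        japanese="マレーシアのすずは1キロ当たり{price}ドルで引けた",
    ),
]
```

In this sketch a sentence that matches no template yields None, so a caller can pass it on to later rule-based processes, mirroring the sequential arrangement of Processes 1 to 3 described above. ]]></Paragraph>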
    <Paragraph position="5"> Recently, several example-based MT systems were proposed for processing fixed expressions [Nagao84] [Sumita91]. Furuse proposed a cooperative method using a tightly woven combination of example- and rule-based approaches [Furuse92]. In contrast to their approach, we use the two methods independently. Therefore, the translation accuracy of our example-based method is guaranteed to be 100%.</Paragraph>
    <Paragraph position="6"> Creating an example-based MT system requires bilingual translation data. Kaji proposed acquiring the bilingual translation data from bilingual texts [Kaji92]. However, that would require a complete syntactic analysis of the bilingual texts. Our method is more robust, because it requires only a partial analysis.</Paragraph>
    <Paragraph position="7"> Section 2 describes some relevant features of economic news stories. In Section 3, we present an overview of ENTS. The following sections describe the fixed sentence translation method in ENTS, and the results of experiments using ENTS for AP economic news stories.</Paragraph>
    <Paragraph position="8"> 2 Features of economic news sentences
The AP delivers about 350 wire-service news stories a day, of which about 50 are concerned with economics.</Paragraph>
    <Paragraph position="9"> Each news story has its own title related to the contents. Because the titles on economic news stories are fixed, such stories can be selected easily. Most sentences in these economic news stories have fixed expressions comprised of compound words and/or collocations.</Paragraph>
    <Paragraph position="10"> Examples of fixed expressions:
e1) compound words: &quot;5 cents&quot;, &quot;17.76 dollars per kilo&quot;, and &quot;The U.S. dollar&quot;
e2) collocations: &quot;Malaysian tin closed at&quot;, &quot;The U.S. dollar opened&quot;, and &quot;as share prices rose&quot;
Based on the fixed expressions, the sentences in economic news stories, called economic sentences, are classified into three types:
2-1) &quot;The U.S. dollar opened slightly higher against the Japanese yen Tuesday morning in Tokyo, while share prices inched up.&quot;
2-2) &quot;The U.S. dollar drifted lower against the Japanese yen Wednesday morning, while share prices on the Tokyo Stock Exchange rose sharply.&quot;
2-3) &quot;The U.S. dollar opened higher against the Japanese yen in Tokyo Thursday, as share prices rose in early trading.&quot;
Type III: General sentences
Example 3
3-1) &quot;Kagawa added, however, that the market still anticipates a rising dollar.&quot;
3-2) &quot;Shigeru Sato, an analyst with Sanyo Securities, said the index fell some 65 points at one point in the afternoon, but last-minute arbitrage buying pulled it back up.&quot;
3-3) &quot;But Tobo said the market's basic sentiment remained bearish because of a lack of incentives to focus on.&quot;
(1) Type I
The sentences in Type I contain fixed expressions in which the words change a little from day to day. The parts of speech of the translation equivalents of these fixed expressions are nouns in Japanese. For example, the translation equivalents of compound words like &quot;17.76 dollars per kilo&quot; are nouns, as are those of &quot;up&quot; and &quot;down&quot; in examples 1-1, 1-2, and 1-3. The verb in a Type I sentence, such as &quot;close&quot; in examples 1-1, 1-2, and 1-3, is fixed.</Paragraph>
    <Paragraph position="11"> (2) Type II
Although each sentence in Type II has a unique style with fixed expressions, there is a greater variety of fixed expressions than in Type I sentences, for example &quot;opened&quot; and &quot;drifted&quot;, or &quot;slightly higher&quot;, &quot;lower&quot;, and &quot;higher&quot;. The parts of speech of the translation equivalents of these fixed expressions are verbs or adjectives in Japanese. Therefore, these translation equivalents require a method for producing their inflections in the Japanese generation process of MT.</Paragraph>
  </Section>
</Paper>