<?xml version="1.0" standalone="yes"?>
<Paper uid="C96-1094">
  <Title>Translating into Free Word Order Languages</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Languages such as Catalan, Czech, Finnish, German, Hindi, Hungarian, Japanese, Polish, Russian, and Turkish have much freer word order than English. For example, all six permutations of a transitive sentence are grammatical in Turkish (although SOV is the most common). When we translate an English text into a &quot;free&quot; word order language, we are faced with a choice between many different word orders that are all syntactically grammatical but are not all felicitous or contextually appropriate. In this paper, I discuss machine translation (MT) of English text into Turkish and concentrate on how to generate the appropriate word order in the target language based on contextual information.</Paragraph>
    <Paragraph position="1"> The most comprehensive project of this type is presented in (Stys/Zemke, 1995) for MT into Polish. They use the referential form and repeated mention of items in the English text to predict the salience of discourse entities, and they order the Polish sentence according to this salience ranking. They also rely on statistical data, choosing the most frequently used word orders. I argue for a more generative approach: a particular information structure (IS) can be determined from the contextual information and then used to generate the felicitous word order. This paper concentrates on how to determine the IS from contextual information using centering, old vs. new information, and contrastiveness. (Hajičová/etal, 1993; Steinberger, 1994) present approaches that determine the IS by using cues such as word order, definiteness, and complement semantic types (e.g. temporal adjuncts vs. arguments) in the source language, English. I believe that we cannot rely upon cues in the source language in order to determine the IS of the translated text. Instead, I use contextual information in the target language to determine the IS of sentences in the target language.</Paragraph>
    <Paragraph position="2"> In section 2, I discuss the Information Structure, and specifically the topic and the focus, in naturally occurring Turkish data. Then, in section 3, I present algorithms for determining the topic and the focus, and show that we can generate contextually appropriate word orders in Turkish using these algorithms in a simple MT implementation.</Paragraph>
  </Section>
</Paper>