<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-2162">
  <Title>Improving Statistical Natural Language Translation with Categories and Rules</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> In SNLT the transfer itself is realized as a maximization process of the form</Paragraph>
    <Paragraph position="2"> Here d is a given source language (SL) sentence which has to be translated into a target language (TL) sentence e. In order to model the distributions P(e\[d) all approaches in SNLT use a &amp;quot;divide and conquer&amp;quot; strategy of approximating P(e\[d) by a combination of simpler models. The problem is to reduce parameters in a sufficient way but end up with a model still able to describe the linguistic facts of natural language translation.</Paragraph>
    <Paragraph position="3"> The work presented here uses two approximations for P(e\[d). One approximation is used for to gain the relevant parameters in training while a modified formula is subject of decoding translations. In detail, we impose the following modifications with respect to approaches published in the last decade: 1. A refined distance weight for the STL probabilities is used which allows for a good modeling of the effects caused by syntactic phrases. 2. In order to account for collocations a WA technique is used, where oneto-n and n-to-one WAs are allowed. 3. For the translation WCs are used which are constructed using clustering techniques, where the STL forms a part of the optimization criterion.</Paragraph>
    <Paragraph position="4"> 4. A set of TRs is learned mapping sequences of SL WCs to sequences of TL WCs.</Paragraph>
    <Paragraph position="5"> Throughout the paper the four topics above are described in more detail. Finally we report on experimental results produced on the VERBMOBIL corpus.</Paragraph>
  </Section>
class="xml-element"></Paper>