<?xml version="1.0" standalone="yes"?>
<Paper uid="P97-1063">
  <Title>A Word-to-Word Model of Translational Equivalence</Title>
  <Section position="7" start_page="495" end_page="495" type="concl">
    <SectionTitle>
6 Conclusion
</SectionTitle>
    <Paragraph position="0"> Many multilingual NLP applications need to translate words between different languages, but cannot afford the computational expense of modeling the full range of translation phenomena. For these applications, we have designed a fast algorithm for estimating word-to-word models of translational equivalence. The estimation method uses a pair of hidden parameters to measure the model's uncertainty, and avoids making decisions that it's not likely to make correctly. The hidden parameters can be conditioned on information extrinsic to the model, providing an easy way to integrate pre-existing knowledge. So far we have only implemented a two-class model, to exploit the differences in translation consistency between content words and function words.</Paragraph>
    <Paragraph position="1"> This relatively simple two-class model linked word tokens in parallel texts as accurately as other translation models in the literature, despite being trained on only one fifth as much data. Unlike other translation models, the word-to-word model can automatically produce dictionary-sized translation lexicons, and it can do so with over 99% accuracy.</Paragraph>
    <Paragraph position="2"> Even better accuracy can be achieved with a more fine-grained link class structure. Promising features for classification include part of speech, frequency of co-occurrence, relative word position, and translational entropy (Melamed, 1997). Another interesting extension is to broaden the definition of a &quot;word&quot; to include multi-word lexical units (Smadja, 1992). If such units can be identified a priori, their translations can be estimated without modifying the word-to-word model. In this manner, the model can account for a wider range of translation phenomena.</Paragraph>
  </Section>
</Paper>