<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-1122">
  <Title>Modelling lexical redundancy for machine translation</Title>
  <Section position="8" start_page="974" end_page="975" type="relat">
    <SectionTitle>
6 Related Work
</SectionTitle>
    <Paragraph position="0"> Previous work on automatic bilingual word clustering has been motivated somewhat differently and has not used cluster-based models to assign translation probabilities directly (Wang et al., 1996; Och, 1998). There is, however, a large body of work that uses morphological analysis to define cluster-based translation models similar to ours, but in a supervised manner (Zens and Ney, 2004; Niessen and Ney, 2004). These approaches have used morphological annotation (e.g. lemmas and part-of-speech tags) to provide explicit supervision. They have also involved manually specifying which morphological distinctions are redundant (Goldwater and McClosky, 2005). In contrast, we attempt to learn both equivalence classes and redundant relations automatically. Our experiments with orthographic features suggest that some morphological redundancies can be acquired in an unsupervised fashion. The marginal likelihood hard-clustering algorithm that we propose here for translation model selection can be viewed as a Bayesian k-means algorithm and is an application of Bayesian model selection techniques (e.g., Wolpert, 1995). The Markov random field prior over model structure extends the fixed uniform prior over clusters implicit in k-means clustering and is common in computer vision (Geman and Geman, 1984). Recently, Basu et al. (2004) used an MRF to embody hard constraints within semi-supervised clustering. In contrast, we use an iterative EM algorithm to learn soft constraints within the 'prior' monolingual space based on the results of clustering with bilingual statistics.</Paragraph>
  </Section>
</Paper>