<?xml version="1.0" standalone="yes"?>
<Paper uid="H05-1065">
  <Title>Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP), pages 515-522, Vancouver, October 2005. ©2005 Association for Computational Linguistics. Disambiguation of Morphological Structure using a PCFG</Title>
  <Section position="9" start_page="520" end_page="521" type="relat">
    <SectionTitle>
8 Related Work
</SectionTitle>
    <Paragraph position="0"> New methods are often first developed for English and later adapted to other languages. This might explain why morphological disambiguation has so rarely been addressed in the past: English morphology is seldom ambiguous except for noun compounds. We are not aware of any work on the disambiguation of morphological analyses that is directly comparable to ours. Mark Lauer (1995) considered only English noun compounds and applied a different disambiguation strategy based on word association scores.</Paragraph>
    <Paragraph position="1"> Koehn and Knight (2003) proposed a splitting method for German compounds and showed that it improves statistical machine translation. Compounds are split into smaller pieces (which have to be words themselves) if the geometric mean of the word frequencies of the pieces is higher than the frequency of the compound. Information from a bilingual corpus is used to improve the splitting accuracy. Andreas Eisele (unpublished work) implemented a statistical disambiguator for German based on weighted finite-state transducers as described in the introduction. However, his system fails to represent and disambiguate the ambiguities observed in compounds with three or more elements and similar constructions with structural ambiguities.</Paragraph>
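    <Paragraph position="2"> The frequency criterion attributed to Koehn and Knight (2003) above can be illustrated with a minimal sketch. It is not their implementation: it ignores linking elements and the bilingual-corpus information, the function names (geometric_mean, segmentations, best_split) and the min_len parameter are hypothetical, and the frequency counts are invented for illustration.

```python
from math import prod


def geometric_mean(values):
    # Geometric mean of a non-empty sequence of non-negative numbers.
    return prod(values) ** (1.0 / len(values))


def segmentations(word, vocab, min_len=3):
    # Yield every way to cut `word` into pieces that are all known words.
    # The unsplit word itself is yielded when it is in the vocabulary.
    if word in vocab:
        yield [word]
    for i in range(min_len, len(word) - min_len + 1):
        head = word[:i]
        if head in vocab:
            for rest in segmentations(word[i:], vocab, min_len):
                yield [head] + rest


def best_split(word, freq, min_len=3):
    # Keep the word whole unless some segmentation has a geometric mean
    # of piece frequencies strictly higher than the word's own frequency.
    best, best_score = [word], freq.get(word, 0)
    for pieces in segmentations(word, freq, min_len):
        score = geometric_mean([freq[p] for p in pieces])
        if score > best_score:
            best, best_score = pieces, score
    return best


# Invented counts, for illustration only.
freq = {"haus": 1000, "boot": 100, "hausboot": 10,
        "frei": 500, "tag": 800, "freitag": 2000}

print(best_split("hausboot", freq))
print(best_split("freitag", freq))
```

    With these invented counts, the rare compound is split because its parts are much more frequent than the whole, while the frequent word is kept intact even though it could be cut into two known words.</Paragraph>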
  </Section>
</Paper>