<?xml version="1.0" standalone="yes"?> <Paper uid="H05-1065"> <Title>Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP), pages 515-522, Vancouver, October 2005. ©2005 Association for Computational Linguistics. Disambiguation of Morphological Structure using a PCFG</Title> <Section position="10" start_page="521" end_page="521" type="concl"> <SectionTitle> 9 Summary </SectionTitle> <Paragraph position="0"> We presented a disambiguation method for German morphological analyses based on a head-lexicalized probabilistic context-free grammar (PCFG). Words are split into morpheme lattices by a finite-state morphology and then parsed with the PCFG. The grammar was trained on unlabeled data using the Inside-Outside algorithm and evaluated on 807 manually disambiguated analyses of infrequent words. We compared different training strategies. A combination of one iteration of unlexicalized training and four iterations of lexicalized training gave the best results, with over 68% exact-match accuracy, compared to a baseline of 45% obtained by randomly choosing one of the minimally complex analyses. Without lexicalization, accuracy dropped by 15 percentage points, indicating that lexicalization is essential for morphological disambiguation.</Paragraph> </Section> </Paper>