<?xml version="1.0" standalone="yes"?>
<Paper uid="C04-1080">
  <Title>Part of Speech Tagging in Context</Title>
  <Section position="8" start_page="2" end_page="2" type="concl">
    <SectionTitle>
7 Conclusions
</SectionTitle>
    <Paragraph position="0"> We have presented a comprehensive evaluation of several methods for unsupervised part-of-speech tagging, comparing several variations of hidden Markov model taggers and unsupervised transformation-based learning using the same corpus and the same lexicons. We discovered that the quality of the lexicon made available to the unsupervised learner made the greatest difference to tagging accuracy. Filtering the possible part-of-speech assignments contained in a basic lexicon automatically constructed from the commonly used Penn Treebank improved results by as much as 22%. This finding highlights the importance of clean dictionaries, whether they are constructed by hand or automatically, when we seek to be fully unsupervised.</Paragraph>
    <Paragraph position="1"> In addition, we presented a variation on HMM training in which the tag sequence probabilities and the lexical probabilities are estimated in sequence rather than jointly. This helped stabilize training in cases where the estimation of lexical probabilities can be noisy.</Paragraph>
    <Paragraph position="2"> Finally, we experimented with using left and right context in the estimation of lexical probabilities, which we refer to as a contextualized HMM. Without supervision, this new HMM structure improved results slightly compared to a simple trigram tagger as described in Merialdo, which takes into account only the current tag in predicting the lexical item. With supervision, this model achieves state-of-the-art results without the lengthy training procedure involved in other high-performing models. In the future, we will consider increasing the context size, which helped Toutanova et al. (2003).</Paragraph>
  </Section>
</Paper>