<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-2251">
  <Title>Predicting Part-of-Speech Information about Unknown Words using Statistical Methods</Title>
  <Section position="7" start_page="1506" end_page="1506" type="concl">
    <SectionTitle>
6 Conclusions and Further Work
</SectionTitle>
    <Paragraph position="0"> The experiments documented in this paper suggest that a tagger can be trained to handle unknown words effectively. By using the probabilistic lexicon, we can predict tags for unknown words from probabilities estimated on training data rather than from hand-crafted rules. The modular approach to unknown word prediction also allows us to determine which sources of information are most important.</Paragraph>
    <Paragraph position="1"> Further work will attempt to improve the accuracy of the predictor by using new knowledge sources. We will explore the use of a confidence measure, as well as training the predictor only on infrequently occurring words from the lexicon, which would presumably offer a better approximation of the distribution of unknown words. We also plan to integrate the predictor into a full hidden Markov model (HMM) tagging system, where it can be tested in real-world applications, using the HMM to disambiguate problem words.</Paragraph>
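    <Paragraph position="2"> As a rough illustration only, and not the implementation used in this paper, the idea of training on infrequent lexicon entries and attaching a confidence value to each prediction might be sketched as follows. Everything in the sketch is an assumption made for the example: the three-character suffix feature, the max_count threshold, the toy lexicon, and the choice of the top tag's probability as the confidence value.

```python
from collections import Counter, defaultdict


def train_suffix_predictor(lexicon, max_count=2, suffix_len=3):
    """Estimate P(tag | suffix) from infrequent lexicon entries only.

    lexicon maps each word to a dict of tag counts. The suffix feature
    and the max_count threshold are hypothetical choices for this sketch.
    """
    suffix_tag_counts = defaultdict(Counter)
    for word, tag_counts in lexicon.items():
        if sum(tag_counts.values()) > max_count:
            continue  # frequent words are assumed to behave unlike unknown words
        suffix = word[-suffix_len:]
        for tag, count in tag_counts.items():
            suffix_tag_counts[suffix][tag] += count

    # Normalize the counts into one probability distribution per suffix.
    model = {}
    for suffix, counts in suffix_tag_counts.items():
        total = sum(counts.values())
        model[suffix] = {tag: c / total for tag, c in counts.items()}
    return model


def predict(model, word, suffix_len=3):
    """Return (best_tag, confidence); confidence is the best tag's probability."""
    dist = model.get(word[-suffix_len:])
    if not dist:
        return None, 0.0  # suffix never seen among the infrequent words
    best_tag = max(dist, key=dist.get)
    return best_tag, dist[best_tag]


# Toy lexicon (invented data, for illustration only).
lexicon = {
    "the": {"DT": 500},
    "running": {"VBG": 40, "NN": 3},
    "jogging": {"VBG": 1},
    "kayaking": {"VBG": 1, "NN": 1},
    "gazebo": {"NN": 2},
}

model = train_suffix_predictor(lexicon)
tag, confidence = predict(model, "snorkeling")
print(tag, round(confidence, 3))
```

In a full tagger, a low confidence value could be used to defer the decision to contextual disambiguation rather than committing to the predicted tag.</Paragraph>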
  </Section>
</Paper>