<?xml version="1.0" standalone="yes"?>
<Paper uid="H05-1046">
  <Title>Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP), pages 363-370, Vancouver, October 2005. ©2005 Association for Computational Linguistics Disambiguating Toponyms in News</Title>
  <Section position="9" start_page="369" end_page="369" type="concl">
    <SectionTitle>
7 Conclusion
</SectionTitle>
    <Paragraph position="0"> This research provides a measure of the degree of ambiguity with respect to a gazetteer for toponyms in news. It has developed a toponym disambiguator that, when trained entirely on machine-annotated corpora that draw on readily available Internet gazetteers, disambiguates toponyms in a human-annotated corpus with 78.5% accuracy.</Paragraph>
    <Paragraph position="1"> Our current work includes integrating our disambiguator with other gazetteers and with a geovisualization system. We will also study the effect of other window sizes and the combination of this unsupervised approach with minimally supervised approaches such as those of Brill (1995) and Smith and Mann (2003). To help mitigate data sparseness, we will cluster terms based on stemming and semantic similarity.</Paragraph>
    <Paragraph position="2"> The resources and tools developed here may be obtained freely by contacting the authors.</Paragraph>
  </Section>
</Paper>