File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/05/h05-1046_concl.xml
Size: 1,406 bytes
Last Modified: 2025-10-06 13:54:33
<?xml version="1.0" standalone="yes"?> <Paper uid="H05-1046"> <Title>Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP), pages 363-370, Vancouver, October 2005. ©2005 Association for Computational Linguistics Disambiguating Toponyms in News</Title> <Section position="9" start_page="369" end_page="369" type="concl"> <SectionTitle> 7 Conclusion </SectionTitle> <Paragraph position="0"> This research provides a measure of the degree of ambiguity, with respect to a gazetteer, of toponyms in news. It has also produced a toponym disambiguator that, when trained on entirely machine-annotated corpora that make use of readily available Internet gazetteers, disambiguates toponyms in a human-annotated corpus with 78.5% accuracy.</Paragraph> <Paragraph position="1"> Our current work includes integrating the disambiguator with other gazetteers and with a geovisualization system. We will also study the effect of other window sizes and the combination of this unsupervised approach with minimally supervised approaches such as (Brill 1995) and (Smith and Mann 2003). To help mitigate data sparseness, we will cluster terms based on stemming and semantic similarity.</Paragraph> <Paragraph position="2"> The resources and tools developed here may be obtained freely by contacting the authors.</Paragraph> </Section> </Paper>