<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-0106">
  <Title>InfoXtract location normalization: a hybrid approach to geographic references in information extraction</Title>
  <Section position="2" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> Location names are highly ambiguous. For example, there are 23 cities named 'Buffalo' in the U.S. Building on our previous work, this paper presents a refined hybrid approach to geographic references using our information extraction engine, InfoXtract.</Paragraph>
    <Paragraph position="1"> The InfoXtract location normalization module combines local pattern matching, discourse co-occurrence analysis, and default senses.</Paragraph>
    <Paragraph position="2"> Multiple knowledge sources are used in a number of ways: (i) pattern matching driven by local context, (ii) maximum spanning tree search for discourse analysis, and (iii) application of default sense heuristics and extraction of default senses from the web. The module achieves 96% accuracy on our test collections, which consist of both news articles and tourist guides. The performance contribution of each component of the module is also benchmarked and discussed.</Paragraph>
  </Section>
</Paper>