<?xml version="1.0" standalone="yes"?> <Paper uid="C04-1122"> <Title>Named Entity Discovery Using Comparable News Articles</Title> <Section position="6" start_page="0" end_page="0" type="concl"> <SectionTitle> 5 Conclusion and Future Work </SectionTitle> <Paragraph position="0"> In this paper we described a novel way to discover Named Entities by using the time series distribution of names. Since Named Entities in comparable documents tend to appear synchronously, one can find a Named Entity by looking for a word whose chronological distribution is similar among several comparable documents. We conducted an experiment with several newspapers because news articles are generally sorted chronologically and are an abundant source of comparable documents. We confirmed that there is some correlation between the similarity of a word's time series distribution and its likelihood of being a Named Entity.</Paragraph> <Paragraph position="1"> The number of Named Entities obtained in our experiment was still not sufficient. We therefore expect that better performance in actual Named Entity tagging can be achieved by combining this feature with other contextual or lexical knowledge of the kind mainly used in existing Named Entity taggers.</Paragraph> </Section> </Paper>