<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-1503">
<Title>Construction and Analysis of Japanese-English Broadcast News Corpus with Named Entity Tags</Title>
<Section position="5" start_page="8" end_page="8" type="concl">
<SectionTitle> 4 Conclusion </SectionTitle>
<Paragraph position="0"> With the aim of acquiring NE translation knowledge, this paper described the construction of a Japanese-English broadcast news corpus annotated with NE tags for NE translation-pair extraction. The tags represent NE characteristics and coreference information both within each language and across the two languages. Analysis of the 1,097 annotated article pairs showed that NE occurrence information for each language side, such as classes, numbers of occurrences, and occurrence order, may provide a good clue for determining NE correspondence across languages.</Paragraph>
<Paragraph position="1"> Our future plans are listed below.</Paragraph>
<Paragraph position="2"> * The problems described in Section 2.5 need to be reexamined in terms of what information bilingual corpora should contain to support NE translation-pair extraction research.</Paragraph>
<Paragraph position="3"> * The analysis in Section 3 indicated that identifying coreferences within a language is very important for NE translation-pair extraction. Our corpus should therefore be annotated with richer coreference information to support coreference identification studies. We plan to annotate coreference information for pronouns and some other non-NE expressions, with reference to the MUC-7 coreference task definition (Hirschman and Chinchor, 1997).</Paragraph>
<Paragraph position="4"> * Corpora with different characteristics, such as a bilingual newspaper corpus, will be annotated and analyzed.</Paragraph>
<Paragraph position="5"> Acknowledgments This research was supported in part by the Telecommunications Advancement Organization of Japan.</Paragraph>
</Section>
</Paper>