<?xml version="1.0" standalone="yes"?> <Paper uid="X93-1017"> <Title>Tokyo Marine &amp; Fire 17th N Prt NP</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> BACKGROUND </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> The TIPSTER Data Extraction and Fifth Message Understanding </SectionTitle> <Paragraph position="0"> Conference (MUC-5) tasks focused on the process of data extraction. This is a procedure in which pre-specified types of information are identified within free text, extracted, and inserted automatically into a template.</Paragraph> <Paragraph position="1"> Three TIPSTER contractors (BBN, GE/CMU, and NMSU/Brandeis) participated in the August '93 MUC-5 evaluation for both the English joint venture (EJV) and English microelectronics (EME) domains and their Japanese-language counterparts, the JJV and JME applications. Two other contractors (SRI and SRA) participated in the EJV and JJV domains alone. CMU's Textract system took part in the Japanese-language domains only. Of the five systems that were tested in both English and Japanese, all but one scored higher in the Japanese-language applications according to both the summary error-based scores and the recall/precision-based metrics.</Paragraph> <Paragraph position="2"> This overall result has led some participants and observers to suggest that Japanese is an &quot;easier&quot; language than English.</Paragraph> <Paragraph position="3"> Japanese-language usage in the total 1297-article JJV corpus exhibits the same degree of ellipsis-generated vagueness and ambiguity as in other domains and genres of Japanese writing. In matters of information presentation, however, JJV articles are very formulaic. 
This paper argues that the stereotypical structure of the topic sentence in the JJV corpus, together with the &quot;default&quot; pattern of certain template fills, gives the Japanese systems a ready basis for extracting information and inserting it into a template. The result is better overall system performance in JJV than in EJV, as indicated by the scoring metrics.</Paragraph> </Section> </Section> </Paper>