<?xml version="1.0" standalone="yes"?>
<Paper uid="X96-1050">
  <Title>MULTILINGUAL ENTITY TASK (MET): JAPANESE RESULTS</Title>
  <Section position="6" start_page="450" end_page="451" type="intro">
    <SectionTitle>
ORGANIZATION
</SectionTitle>
    <Paragraph position="0"> Identifying and categorizing complex noun phrases in strings that have neither capitalization nor whitespace makes this type of expression the most difficult to process (group F-Measure average 73%). Normally, the corporate designator &quot;~&quot; (Co., Corp.) would assist in identifying an ORGANIZATION. However, the MET domain focused on political rather than commercial entities, so there were very few instances of this designator. And although bureaucratic descriptors like &quot;~'&quot; indicate Japanese ministries, well-known ministries such as &quot;~l~_J~'&quot; (MITI) are often referred to by an alias (~l~_J~) without any mention of the canonical form.</Paragraph>
    <Paragraph position="1"> In addition, the most prevalent entities properly identified as ORGANIZATION in these texts included groups, offices, labs, etc. -- that is, noun phrases which could be proper nouns depending upon context. For example, &quot;~ ~SEJ~&quot; could be the name of a particular factory, Miyata Factory, or a generic factory located in Miyata; similarly, &quot;J~/i:~/i~&quot; could be the New Hyogo Bank, the new (e.g., rebuilt) Hyogo Bank, or a new Hyogo Bank (i.e., one bank in the Hyogo Bank chain).</Paragraph>
    <Paragraph position="2"> To complicate matters further, once a complex NP like &quot;~.~_j~.~.~/Jx~&quot; (MITI Telecommunications Subcommittee) was determined to be a proper noun, the systems were then required to tag as ORGANIZATION each constituent part of the hierarchical relationship expressed within the phrase. In this case, there were two: MITI (parent) and Telecommunications Subcommittee (child).</Paragraph>
    <Paragraph position="3"> Summary: The Japanese systems showed excellent overall results despite a very compressed development cycle.</Paragraph>
    <Paragraph position="4"> They handled the comparatively easy types of expressions with a high degree of accuracy (&gt;90%), and the hard expressions with surprising proficiency, thereby promising marked improvement in the near term and the capability to work in conjunction with other language processing technologies such as Machine Translation (MT) and text summarization [5].</Paragraph>
  </Section>
</Paper>