File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/06/w06-2403_abstr.xml

Size: 1,468 bytes

Last Modified: 2025-10-06 13:45:33

<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-2403">
  <Title>Automatic Extraction of Chinese Multiword Expressions with a Statistical Tool</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> In this paper, we report on our experiment to extract Chinese multiword expressions from corpus resources as part of a larger research effort to improve a machine translation (MT) system. For existing MT systems, the issue of multiword expression (MWE) identification and accurate interpretation from source to target language remains an unsolved problem. Our initial test on the Chinese-to-English translation functions of Systran and CCID's Huan-Yu-Tong MT systems reveals that, where MWEs are involved, MT tools suffer in terms of both comprehensibility and adequacy of the translated texts. For MT systems to become more practically useful, they need to be enhanced with MWE processing capability. As part of our study towards this goal, we test and evaluate a statistical tool, which was developed for English, for identifying and extracting Chinese MWEs. In our evaluation, the tool achieved precisions ranging from 61.16% to 93.96% for different types of MWEs.</Paragraph>
    <Paragraph position="1"> Such results demonstrate that it is feasible to automatically identify many Chinese MWEs using our tool, although it needs further improvement.</Paragraph>
  </Section>
</Paper>