<?xml version="1.0" standalone="yes"?>
<Paper uid="W05-0903">
  <Title>Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, pages 17-24, Ann Arbor, June 2005. ©2005 Association for Computational Linguistics. Preprocessing and Normalization for Automatic Evaluation of Machine Translation</Title>
  <Section position="3" start_page="0" end_page="17" type="intro">
    <SectionTitle>
2 Automatic evaluation measures
</SectionTitle>
    <Paragraph position="0"> The majority of MT evaluation approaches are based on the distance or similarity of MT candidate output to a set of reference translations, i.e. to sentences which are known to be correct. The lower this distance, or the higher the similarity, the better the candidate translations are considered to be, and thus the better the MT system.</Paragraph>
    <Section position="1" start_page="17" end_page="17" type="sub_section">
      <SectionTitle>
2.1 Evaluation measures studied
</SectionTitle>
      <Paragraph position="0"> Of the large number of measures available, we focus on four that are widely used in research and in evaluation campaigns: WER, PER, BLEU, and NIST.</Paragraph>
      <Paragraph position="1"> Let a test set consist of $k = 1,\ldots,K$ candidate sentences $E_k$ generated by an MT system. For each candidate sentence $E_k$, we have a set of $r = 1,\ldots,R_k$ reference sentences $\widetilde{E}_{r,k}$. Let $I_k$ denote the length, and $I^*_k$ the reference length, for each sentence $E_k$. We will explain in section 3.3 how the reference length is calculated.</Paragraph>
      <Paragraph position="2"> With this, we write the total candidate length over the corpus as $\bar{I} := \sum_k I_k$, and the total reference length as $\bar{I}^* := \sum_k I^*_k$.</Paragraph>
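      <Paragraph>The two corpus totals defined above can be illustrated with a minimal sketch. It assumes whitespace tokenization and takes the per-sentence reference lengths as given inputs, since their calculation is deferred to section 3.3. The function name corpus_lengths and its parameters are hypothetical and not part of the original work.

```python
from typing import List, Sequence, Tuple


def corpus_lengths(candidates: Sequence[str],
                   reference_lengths: Sequence[int]) -> Tuple[int, int]:
    """Return (total candidate length, total reference length) over a corpus."""
    if len(candidates) != len(reference_lengths):
        raise ValueError("need exactly one reference length per candidate")
    # Candidate length of sentence k: its number of whitespace-separated
    # tokens (an assumption of this sketch, not a definition from the paper).
    candidate_lengths: List[int] = [len(c.split()) for c in candidates]
    # Both totals are plain sums over the sentence index k.
    return sum(candidate_lengths), sum(reference_lengths)
```

For example, two candidates of 3 and 2 tokens with given reference lengths 4 and 2 yield the totals 5 and 6.</Paragraph>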
      <Paragraph position="3"> Let $n_{e_1^m,k}$ denote the count of the m-gram $e_1^m$ within the candidate sentence $E_k$; similarly, let $\tilde{n}_{e_1^m,r,k}$ denote the same count within the reference sentence $\widetilde{E}_{r,k}$. The total m-gram count over the corpus is then $\bar{n}_m := \sum_{k} \sum \ldots$</Paragraph>
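      <Paragraph>The per-sentence m-gram counts described above can be sketched as follows. This is an illustration, not the authors' implementation: it assumes sentences are already tokenized into lists of words, and it takes the corpus total to be the sum of the candidate counts over all sentences and all m-grams, which is an assumption about the full definition. The names mgram_counts and total_mgram_count are hypothetical.

```python
from collections import Counter
from typing import Sequence


def mgram_counts(tokens: Sequence[str], m: int) -> Counter:
    """Count each m-gram (tuple of m consecutive tokens) in one sentence."""
    if m not in range(1, len(tokens) + 1):
        return Counter()  # no m-grams of this order exist in the sentence
    return Counter(tuple(tokens[i:i + m]) for i in range(len(tokens) - m + 1))


def total_mgram_count(candidates: Sequence[Sequence[str]], m: int) -> int:
    """Sum the m-gram counts over all candidate sentences.

    Assumption of this sketch: the corpus total adds up the candidate-side
    counts over every sentence and every m-gram.
    """
    return sum(sum(mgram_counts(sent, m).values()) for sent in candidates)
```

The same mgram_counts function can be applied to a reference sentence to obtain the reference-side counts; only the input sentence differs.</Paragraph>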
    </Section>
  </Section>
</Paper>