<?xml version="1.0" standalone="yes"?>
<Paper uid="N06-3003">
  <Title>Can the Internet help improve Machine Translation?</Title>
  <Section position="3" start_page="0" end_page="219" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Achieving high translation quality remains the biggest challenge that Machine Translation (MT) systems currently face. Researchers have explored a variety of methods to include user feedback in the MT loop. Similar to our approach, Phaholphinyo and colleagues (2005) proposed adding post-editing rules to their English-Thai MT system by means of a post-editing tool. However, they use context-sensitive pattern-matching rules, which make it impossible to fix errors involving missing words.</Paragraph>
    <Paragraph position="1"> Unlike in our approach, the rules in their system are created by experienced linguists, and their approach requires a large corpus. They mention an experiment with 6,000 bilingual sentences but report no results due to data sparseness.</Paragraph>
    <Paragraph position="2"> In general, most MT systems have failed to incorporate post-editing efforts beyond the addition of corrected translations to the parallel training data for SMT and EBMT or to a translation memory database.</Paragraph>
    <Paragraph position="3"> Therefore, a largely automated method that uses online post-editing information to automatically improve translation rules constitutes a great advance in the field.</Paragraph>
    <Paragraph position="4"> If an MT-produced translation is incorrect, a bilingual speaker can reliably diagnose the presence of an error using the online Translation Correction Tool (Font Llitjos and Carbonell, 2004). An example of an English-Spanish sentence pair generated by our MT system is &quot;Gaudi was a great artist - Gaudi era un artista grande&quot;. Using the online tool, bilingual speakers modified the incorrect translation to obtain a correct one: &quot;Gaudi era un gran artista&quot;.</Paragraph>
    <Paragraph position="5"> Bilingual speakers, however, cannot be expected to diagnose which complex translation rules produced the error, still less to determine how to improve those rules. One of the main goals of this research is to automate the Rule Refinement process based only on error-locus and possibly some error-type information from the bilingual speaker, relying on rule blame assignment and on regression testing to evaluate and measure the consequent improvement in MT accuracy. In the Gaudi example, our Automatic Rule Refinement system can add the missing sense to the lexicon (great → gran) as well as the special case rule for Spanish pre-nominal adjectives to the grammar. For a more detailed discussion, see Font Llitjos and colleagues (2005a).</Paragraph>
    <Paragraph position="6"> With this system in place, we envision a modified version of the Translation Correction Tool as a game with a purpose, available online through a major web portal. This would allow bilingual speakers to correct MT output, earn rewards for making good corrections, and compare their scores and speed with those of other users. For the MT community, this means having a free and easy way to obtain feedback on MT output and potentially to improve their systems based on such feedback. Furthermore, a fully interactive system would be a great opportunity to show users that their corrections have a visible impact on technology, since they would see the effects their corrections have on other sentences. Last but not least, this new method is also expected to be particularly useful in resource-poor scenarios, such as the ones the Avenue project is devoted to (Font Llitjos et al., 2005b), where statistical systems are not an option and where there might be no experts with knowledge of the resource-poor language (Figure 1).</Paragraph>
  </Section>
</Paper>