<?xml version="1.0" standalone="yes"?>
<Paper uid="C96-2129">
  <Title>Automatic Detection of Omissions in Translations</Title>
  <Section position="10" start_page="766" end_page="768" type="evalu">
    <SectionTitle>
8 Evaluation
</SectionTitle>
    <Paragraph position="0"> To accurately evaluate a system for detecting omissions in translations, it is necessary to use a bitext with many omissions, whose locations are known in advance. For perfect validity, the omissions should be those of a real translator, working on a real translation, detected by a perfect proof-reader. Unfortunately, first drafts of translations that had been subjected to careful revision were not readily available. Therefore, the evaluation proceeded by simulation. The advantage of a simulation was complete control over the lengths and relative positions of omissions. This is important because the noise in a bitext map is more likely to obscure a short omission than a long one.</Paragraph>
    <Paragraph position="1">  The simulated omissions' lengths were chosen to represent the lengths of typical sentences and paragraphs in real texts. A corpus of 61479 Le Monde paragraphs yielded a median French paragraph length of 553 characters. I had no corpus of French sentences, so I estimated the median French sentence length less directly. A corpus of 43747 Wall Street Journal sentences yielded a median English sentence length of 126 characters.</Paragraph>
    <Paragraph position="2"> This number was multiplied by 1.103, the ratio of text lengths in the &quot;easy&quot; bitext, to yield a median French sentence length of 139. Of course, the lengths of sentences and paragraphs in other text genres will vary. The median lengths of sentences and paragraphs in this paper are 114 and 730 characters, respectively. Longer omissions are easier to detect.</Paragraph>
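The sentence-length estimate above is a single multiplication. The short Python check below reproduces it from the two figures quoted in the text; the variable names are illustrative and not taken from the paper.

```python
# Figures quoted in the text; variable names are illustrative assumptions.
median_english_sentence_chars = 126  # Wall Street Journal corpus
bitext_length_ratio = 1.103          # ratio of text lengths in the "easy" bitext

# Scale the English median by the bitext length ratio and round to whole characters.
estimated_french_sentence_chars = round(
    median_english_sentence_chars * bitext_length_ratio
)
print(estimated_french_sentence_chars)
```

The unrounded product is 138.978, which rounds to the 139 characters reported.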
    <Paragraph position="3"> The placement of simulated omissions in the text was governed by the assumption that translators' errors of omission occur independently from one another. This assumption implied that it was reasonable to scatter the simulated omissions in the text using any memoryless distribution.</Paragraph>
    <Paragraph position="4"> Such a distribution simplified the experimental design, because performance on a fixed number of omissions in one text would be the same as performance on the same number of omissions scattered among multiple texts. As a result, the bitext mapping algorithm had to be run only once per parameter set, instead of separately for each of the 100 omissions in that parameter set.</Paragraph>
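One way such scattering could be simulated is sketched below in Python. It draws start offsets uniformly and rejects any draw that falls too close to an earlier one, following the uniform distribution and 1000-character spacing given in step 1 of the procedure in this section. The function name, the 500,000-character text length, and the reading of "spaced apart" as the gap between the end of one omission and the start of the next are all assumptions made for illustration; the paper does not describe its sampling code.

```python
import random

def place_omissions(text_len, n_omissions, omission_len, min_gap, seed=0):
    """Return sorted start offsets for simulated omissions (hypothetical helper).

    Offsets are drawn uniformly; a draw is rejected when it would leave fewer
    than `min_gap` characters between two omissions. The loop assumes the text
    is long enough for all omissions to fit comfortably.
    """
    rng = random.Random(seed)
    starts = []
    while len(starts) != n_omissions:
        candidate = rng.randrange(0, text_len - omission_len + 1)
        # Two omissions of equal length are far enough apart when their
        # start offsets differ by at least omission_len + min_gap.
        if all(abs(candidate - s) >= omission_len + min_gap for s in starts):
            starts.append(candidate)
    return sorted(starts)

# Illustrative call: 100 sentence-size omissions in an assumed 500,000-character text.
starts = place_omissions(500_000, 100, 139, 1000, seed=1)
print(len(starts), starts[:3])
```

Rejection sampling keeps each accepted position uniform over the space still available, at the cost of some wasted draws when the text is crowded.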
    <Paragraph position="5"> A useful evaluation of any omission detection algorithm must take the human factor into account. A translator is unlikely to slog through a long series of false omissions to make sure that there are no more true omissions in the translation. Several consecutive false omissions will deter the translator from searching any further. On average, the more consecutive false omissions it takes for a translator to give up, the more true omissions they will find. Thus, recall is highly correlated with the amount of patience that a translator has. Translator patience is one of the independent variables in this experiment, quantified in terms of the number of consecutive false omissions that the translator will tolerate.</Paragraph>
    <Paragraph position="6"> Separate evaluations were carried out for the Basic Method and for ADOMIT, and each method was evaluated separately on the two different omission lengths. The 2x2 design necessitated four repetitions of the following steps: 1. 100 segments of the given length were deleted from the French half of the bitext. The position of each simulated omission was randomly generated from a uniform distribution, except that, to simplify subsequent evaluation, the omissions were spaced at least 1000 characters apart.</Paragraph>
    <Paragraph position="7"> 2. A hand-constructed bitext map was used to find the segments in the English half of the bitext that corresponded to the deleted French segments. For the purposes of the simulation, these English segments served as the &quot;true&quot; omitted segments.</Paragraph>
    <Paragraph position="8"> 3. The SIMR bitext mapping algorithm (Melamed, 1996) was used to find a map between the original English text and the French text containing the simulated omissions. Note that SIMR can be used with or without a translation lexicon. Use of a translation lexicon results in more accurate bitext maps, which make omission detection easier. However, wide-coverage translation lexicons are rarely available. To make the evaluation more representative, SIMR was run without this resource.</Paragraph>
    <Paragraph position="9"> 4. The bitext map resulting from Step 3 was fed into the Basic Method for detecting omissions. The omitted segments flagged by the Basic Method were sorted in order of decreasing length.</Paragraph>
    <Paragraph position="10"> 5. Each omitted segment in the output from Step 4 was compared to the list of true omitted segments from Step 2. If any of the true omitted segments overlapped the flagged omitted segment, the &quot;true omissions&quot; counter was incremented. Otherwise, the &quot;false omissions&quot; counter was incremented. An example of the resulting pattern of increments is shown in Figure 4.</Paragraph>
    <Paragraph position="11"> 6. The pattern of increments was further analyzed to find the first point at which the &quot;false omissions&quot; counter was incremented 3 times in a row. The value of the &quot;true omissions&quot; counter at that point represented the recall achieved by translators who give up after 3 consecutive false omissions. To measure the recall that would be achieved by more patient translators, the &quot;true omissions&quot; counter was also recorded at the first occurrence of 4 and 5 consecutive false omissions.</Paragraph>
    <Paragraph position="12"> 7. Steps 1 to 6 were repeated 10 times, in order to measure 95% confidence intervals.</Paragraph>
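The counting in Steps 4 to 6 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used for the paper: segments are represented as half-open (start, end) character spans, and the function names and the toy data are hypothetical.

```python
def overlaps(a, b):
    """True when two half-open (start, end) spans share at least one character."""
    return a[1] > b[0] and b[1] > a[0]

def label_flagged(flagged, true_segments):
    """Steps 4 and 5: sort flagged segments by decreasing length, then mark
    each one True if it overlaps any true omitted segment."""
    ordered = sorted(flagged, key=lambda seg: seg[1] - seg[0], reverse=True)
    return [any(overlaps(seg, t) for t in true_segments) for seg in ordered]

def true_count_at_patience(labels, patience):
    """Step 6: value of the "true omissions" counter at the first run of
    `patience` consecutive false omissions (or at the end of the list)."""
    true_count = 0
    false_run = 0
    for is_true in labels:
        if is_true:
            true_count += 1
            false_run = 0
        else:
            false_run += 1
            if false_run == patience:
                break
    return true_count

# Toy data, invented for illustration only.
true_segments = [(100, 240), (5000, 5140)]
flagged = [(3000, 3020), (90, 250), (3100, 3115), (5010, 5100), (3200, 3210)]
labels = label_flagged(flagged, true_segments)
print(labels, true_count_at_patience(labels, 3))
```

Dividing the returned count by the number of simulated omissions (100 in this experiment) would give the recall for a translator with the given patience.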
    <Paragraph position="13"> The low slope angle thresholds used in Section 4 are suboptimal in the presence of map noise, because much of the noise results in segments of very low slope. The optimum value t = 37° was determined using a separate development bitext. With t frozen at the optimum value, recall was measured on the corrected &quot;easy&quot; bitext.</Paragraph>
    <Paragraph position="14"> Figures 5 and 6 plot the mean recall scores for translators with different degrees of patience. ADOMIT outperformed the Basic Method by up to 48 percentage points. ADOMIT is also more robust, as indicated by its shorter confidence intervals. Figure 6 shows that ADOMIT can help translators catch more than 90% of all paragraph-size omissions, and more than one half of all sentence-size omissions.</Paragraph>
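The text does not say how the 95% confidence intervals were computed. One common choice for 10 repetitions is a Student t interval around the mean, sketched below in Python. The choice of interval, the function name, and the recall values are all assumptions for illustration and are not results from the paper.

```python
import statistics

# Two-sided 95% critical value of Student's t with 9 degrees of freedom.
T_CRIT_95_DF9 = 2.262

def mean_with_ci(recalls):
    """Return (mean, half_width) of a t-based 95% interval for 10 runs."""
    n = len(recalls)
    if n != 10:
        raise ValueError("critical value above is only valid for 10 runs")
    mean = statistics.mean(recalls)
    half_width = T_CRIT_95_DF9 * statistics.stdev(recalls) / n ** 0.5
    return mean, half_width

# Hypothetical recall scores from 10 repetitions (invented, not from the paper).
runs = [0.50, 0.52, 0.48, 0.55, 0.47, 0.51, 0.49, 0.53, 0.50, 0.45]
mean, half_width = mean_with_ci(runs)
print(f"{mean:.3f} +/- {half_width:.3f}")
```

Under this reading, a method whose recall varies less from run to run yields a smaller half-width, which is the sense in which shorter intervals indicate robustness.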
    <Paragraph position="15">  [...] to consecutive &quot;false&quot; omissions. In this example, the first run of more than 3 &quot;false&quot; omissions occurs only after 87 &quot;true&quot; omissions.</Paragraph>
    <Paragraph position="16">  Figure 5: Mean Basic Method recall scores with 95% confidence intervals for simulated translators with varying degrees of patience. Horizontal axis: consecutive false omissions tolerated by translator.</Paragraph>
    <Paragraph position="17"> ADOMIT is only limited by the quality of the input bitext map. The severity of this limitation is yet to be determined. This paper evaluated ADOMIT on a pair of languages for which SIMR can reliably produce good bitext maps (Melamed, 1996). SIMR will soon be tested on other language pairs. ADOMIT will become even more useful as better bitext mapping technology becomes available.</Paragraph>
  </Section>
  <Section position="11" start_page="768" end_page="768" type="evalu">
    <SectionTitle>
9 Conclusion
</SectionTitle>
    <Paragraph position="0"> ADOMIT is the first published automatic method for detecting omissions in translations.</Paragraph>
    <Paragraph position="1"> ADOMIT's performance is limited only by the accuracy of the input bitext map. Given an accurate bitext map, ADOMIT can reliably detect even the smallest errors of omission. Even with today's poor bitext mapping technology, ADOMIT finds a large enough proportion of typical omissions to be of great practical benefit.</Paragraph>
    <Paragraph position="2"> The technique is easy to implement and easy to integrate into a translator's routine. ADOMIT is a valuable quality control tool for translators and translation bureaus.</Paragraph>
    <Paragraph position="3"> Figure 6: Mean ADOMIT recall scores with 95% confidence intervals for simulated translators with varying degrees of patience. Horizontal axis: consecutive false omissions tolerated by translator.</Paragraph>
  </Section>
</Paper>