<?xml version="1.0" standalone="yes"?>
<Paper uid="P95-1050">
  <Title>Identifying Word Translations in Non-Parallel Texts</Title>
  <Section position="5" start_page="320" end_page="321" type="evalu">
    <SectionTitle>
4 Results
</SectionTitle>
    <Paragraph position="0"> The simulation was conducted by randomly permuting the word order of the German matrix and then computing the similarity s to the English matrix.</Paragraph>
    <Paragraph position="1"> For each permutation it was determined how many words c had been shifted to positions different from those in the original German matrix. The simulation was continued until a set of 1000 similarity values was available for each value of c.³ Figure 1 shows, for the three formulas, how the average similarity s between the English and the German matrix depends on the number of non-corresponding word positions c. Each of the curves increases monotonically, with formula 1 having the steepest, i.e. best discriminating, characteristic. The dotted curves in Figure 1 are the minimum and maximum values in each set of 1000 similarity values for formula 1.</Paragraph>
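The permutation procedure described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the three similarity formulas are not given in this section, so `similarity` is a placeholder (sum of absolute differences); the helper names, the toy matrix, and the idea of drawing permutations with exactly c displaced words (by rejection sampling) are all hypothetical choices made for the example.

```python
import random

def permute_matrix(matrix, perm):
    """Reorder rows and columns so that position i holds the word formerly at perm[i]."""
    return [[matrix[pi][pj] for pj in perm] for pi in perm]

def count_shifted(perm):
    """Number of words c whose position differs from the original order."""
    return sum(1 for i, p in enumerate(perm) if p != i)

def random_perm_with_c_shifted(n, c, rng):
    """Permutation of range(n) in which exactly c positions are displaced."""
    if c == 1 or c not in range(n + 1):
        raise ValueError("c must lie in 0..n and cannot be 1")
    perm = list(range(n))
    chosen = rng.sample(range(n), c)
    shuffled = chosen[:]
    # Rejection sampling: reshuffle until no chosen position keeps its own word.
    while any(a == b for a, b in zip(chosen, shuffled)):
        rng.shuffle(shuffled)
    for pos, word in zip(chosen, shuffled):
        perm[pos] = word
    return perm

def similarity(a, b):
    """Placeholder measure (an assumption): sum of absolute differences."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def simulate(english, german, samples_per_c, seed=0):
    """Collect (minimum, mean, maximum) similarity for every feasible value of c."""
    rng = random.Random(seed)
    n = len(german)
    results = {}
    for c in range(n + 1):
        if c == 1:
            continue  # a single word cannot be displaced on its own
        values = []
        for _ in range(samples_per_c):
            perm = random_perm_with_c_shifted(n, c, rng)
            values.append(similarity(english, permute_matrix(german, perm)))
        results[c] = (min(values), sum(values) / len(values), max(values))
    return results

# Hypothetical toy co-occurrence matrix, used for both languages.
toy = [[0, 5, 1, 2],
       [5, 0, 3, 1],
       [1, 3, 0, 4],
       [2, 1, 4, 0]]

results = simulate(toy, toy, samples_per_c=10)
for c, (lowest, mean, highest) in sorted(results.items()):
    print(c, lowest, round(mean, 2), highest)
```

With `samples_per_c=1000` the three values per c would correspond to the average curve and the dotted minimum and maximum curves described for Figure 1, though only for the placeholder measure used here.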
    <Paragraph position="2"> 1 The logarithm has been removed from the mutual information measure since it is not defined for zero co-occurrences. 2 Normalization was conducted in such a way that the sum of all matrix entries equals the number of fields in the matrix.</Paragraph>
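The normalization mentioned in the note above can be illustrated with a short sketch. The function name and the example values are hypothetical; the only property taken from the text is that, after scaling, the matrix entries sum to the number of fields, so that the mean entry is 1.

```python
def normalize_to_field_count(matrix):
    """Scale all entries so that their sum equals the number of fields."""
    fields = sum(len(row) for row in matrix)
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("cannot normalize an all-zero matrix")
    factor = fields / total
    return [[value * factor for value in row] for row in matrix]

# Hypothetical 2 x 2 example: four fields, entries summing to 8.
normalized = normalize_to_field_count([[1, 3], [0, 4]])
print(normalized)
```

Because the scaling factor depends only on the total and the number of fields, zero entries stay zero and the ratios between entries are unchanged.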
    <Paragraph position="3"> 3 c = 1 is not possible and was not taken into account. Figure 1: ... of the English and the German matrix and the number of non-corresponding word positions c for 3 formulas. The dotted lines are the minimum and maximum values of each sample of 1000 for formula 1.</Paragraph>
  </Section>
</Paper>