<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-2088">
  <Title>Simultaneous English-Japanese Spoken Language Translation Based on Incremental Dependency Parsing and Transfer</Title>
  <Section position="7" start_page="686" end_page="687" type="evalu">
    <SectionTitle>
5 Experiment
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="686" end_page="686" type="sub_section">
      <SectionTitle>
5.1 Outline of Experiment
</SectionTitle>
      <Paragraph position="0"> To evaluate our method, we conducted a translation experiment as follows. We implemented the system in Java on a 1.0-GHz Pentium M PC with 512 MB of RAM running Windows XP. The experiment used all 578 sentences in the ATIS corpus that have a parse tree in the Penn Treebank (Marcus et al., 1993). In addition, we used 533 syntax rules extracted from the corpus's parse trees. The position of the head child in each grammatical rule was defined according to Collins' method (Collins, 1999).</Paragraph>
    </Section>
    <Section position="2" start_page="686" end_page="687" type="sub_section">
      <SectionTitle>
5.2 Evaluation Metric
</SectionTitle>
      <Paragraph position="0"> Since an incremental translation system for spoken dialogues is required to realize a quick and informative response to support smooth communication, we evaluated the translation results of our system in terms of both simultaneity and quality.</Paragraph>
      <Paragraph position="1"> To evaluate translation quality, a human translator assigned each translation result of our system one of four ranks:
        A (Perfect): no problems in either information or grammar
        B (Fair): easy to understand, but some important information is missing or the output is grammatically flawed
        C (Acceptable): broken but understandable with effort
        D (Nonsense): important information has been translated incorrectly
        To evaluate the simultaneity of our system, we calculated the average delay time for translating chunks using the following expression:
        $$\text{Average delay time} = \frac{\sum_{k=1}^{n} d_k}{n}$$</Paragraph>
      <Paragraph position="3"> where $d_k$ is the virtual elapsed time from inputting the kth chunk until outputting its translated chunk.</Paragraph>
      <Paragraph position="4"> (When a repetition is used, $d_k$ is the elapsed time from inputting the kth chunk until its translated chunk is restated.) The virtual elapsed time increases by one unit of time whenever a chunk is input, and n is the total number of chunks in all of the test sentences. The average delay time is effective for evaluating the simultaneity of translation. However, with this measure alone it is difficult to evaluate whether our system actually improves the efficiency of a conversation. To do so, we measured "the speaker's and the interpreter's utterance time," which runs from the start time of a speaker's utterance to the end time of its translation.</Paragraph>
      <Paragraph position="6"> [Figure 8: Relation between the speaker's utterance time and the time from the end time of the speaker's utterance to the end time of the translation.]
        We cannot measure the actual "speaker's and the interpreter's utterance time" because our system does not include speech recognition and synthesis. Thus, the processing time of speech recognition and transfer text-to-speech synthesis is zero, and the speaker's utterance time and the interpreter's utterance time are calculated virtually by assuming that the speaker's and interpreter's utterance speed is 125 ms per mora.</Paragraph>
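As a minimal sketch, and not the system's own Java implementation, the two virtual quantities described above could be computed as follows in Python. The function names and the sample delay values are hypothetical; only the formula (the mean of d_k over all n chunks) and the rate of 125 ms per mora come from the text.

```python
from typing import Sequence

# Utterance speed assumed in the text for both speaker and interpreter.
MS_PER_MORA = 125


def average_delay(delays: Sequence[int]) -> float:
    """Mean of d_k over all n chunks, in virtual time units."""
    if not delays:
        raise ValueError("at least one chunk is required")
    return sum(delays) / len(delays)


def utterance_ms(mora_count: int) -> int:
    """Virtual utterance duration in milliseconds at 125 ms per mora."""
    return mora_count * MS_PER_MORA


# Hypothetical delays d_k for five chunks (illustrative values only):
# 0 means the chunk was translated as soon as it was input.
delays = [0, 1, 0, 2, 2]
print(average_delay(delays))  # 1.0
print(utterance_ms(8))        # 1000
```

Because the virtual clock advances by one unit per input chunk, an average delay of 1.0 in this toy data means that a translated chunk is output, on average, one chunk after its source chunk was input.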
    </Section>
    <Section position="3" start_page="687" end_page="687" type="sub_section">
      <SectionTitle>
5.3 Experiment Results
</SectionTitle>
      <Paragraph position="0"> To evaluate the translation quality and simultaneity of our system, we compared the translation results of our method (Y) with two other methods.</Paragraph>
      <Paragraph position="1"> One method (X) translates the input chunks with no delay time. The other method (Z) translates the input chunks after waiting for the whole sentence to be input, as in consecutive translation. We could not evaluate the translation quality of method Z because we have not implemented it.</Paragraph>
      <Paragraph position="2"> Instead, we computed its delay time and utterance time virtually. Table 1 shows the estimation results of methods X, Y and Z. Note, however, that we virtually calculated the average delay time and the speaker's and interpreter's utterance times in method Z without translating the input sentence.</Paragraph>
      <Paragraph position="3"> Table 1 indicates that our method Y achieved a 55.6% improvement over method X in terms of translation quality and a 1.0 improvement over method Z for the average delay time.</Paragraph>
      <Paragraph position="4"> Figure 8 shows the relation between the speaker's utterance time and the time from the end time of the speaker's utterance to the end time of the translation. According to Fig. 8, the longer a speaker speaks, the more the system reduces the time from the end time of the speaker's utterance to the end time of the translation.</Paragraph>
      <Paragraph position="5"> In Section 3, we explained the constant R. Table 2 shows the results as R increases from 0 to 4: the estimated quality, the average delay time, the number of inverted sentences and the number of sentences with restatement. When we set the constant to R = 2, the average delay time improved by 0.08 over that of method Y, and the translation quality did not decrease remarkably. Note, however, that method Y did not utilize any predicate inversions.</Paragraph>
      <Paragraph position="6"> To identify the problems with our method, we investigated the 165 sentences whose translations were assigned rank D when the system translated them by utilizing dependency constraints.</Paragraph>
      <Paragraph position="7"> According to the investigation, the system generated grammatically incorrect sentences in the following cases:
        * There is an interrogative word (e.g. "what", "which") in the English sentence (64 sentences).
        * There are two or more predicates in the English sentence (25 sentences).</Paragraph>
      <Paragraph position="8"> * There is a coordinating conjunction (e.g. "and", "or") in the English sentence (21 sentences).</Paragraph>
      <Paragraph position="9"> Other cases of decreased translation quality occurred when an English sentence was ill-formed or when the system failed to parse it.</Paragraph>
    </Section>
  </Section>
</Paper>