<?xml version="1.0" standalone="yes"?>
<Paper uid="I05-4006">
  <Title>Construction of Structurally Annotated Spoken Dialogue Corpus</Title>
  <Section position="6" start_page="44" end_page="45" type="evalu">
    <SectionTitle>
5 Evaluation of Structurally Annotated Dialogue Corpus
</SectionTitle>
      <Paragraph position="0"> To evaluate the scalability of the corpus for creating dialogue-structural rules, a dialogue parsing experiment was conducted. In the experiment, all 789 dialogues were divided into two data sets.</Paragraph>
      <Paragraph position="1"> One of them is the test data consists of 100 dialogues and the other is the training data consists of 689 dialogues. Furthermore, the training data were divided into 10 training sets.</Paragraph>
      <Paragraph position="2"> By increasing the training data sets, we extracted the probabilistic structural-rules from each data. We then parsed the test data using the rules and ranked their results by probability.</Paragraph>
      <Paragraph position="3"> In the evaluation, the coverage rate, the correct rate, and the N-best correct rate were used.</Paragraph>
      <Paragraph position="4">  The results of the evaluation of the coverage rate and the correct rate are shown in Figure 5.</Paragraph>
      <Paragraph position="5"> The correct rates for each of the training sets, ranked from 1-best to 10-best, are shown in Figure 6.</Paragraph>
      <Paragraph position="6"> In Figure 5, both the coverage rate and the correct rate improved as the training data was increased. The coverage rate of the training set consisting of 689 dialogues was 92%. This means  that the rules that were from the training set enabled the parsing of a wide variety dialogues. The fact the correct rate was 86% shows that, using the rules, the correct structures can be built for a large number of dialogues.</Paragraph>
      <Paragraph position="7"> Three in eight failure dialogues had continued after a guidance for a restaurant. Therefore, we assume that offering guidance to a restaurant is a termination of the dialogue, in which case they couldn't be analyzed. Another three dialogues couldn't be analyzed because they included some LIT which rarely appeared in the training data.</Paragraph>
      <Paragraph position="8"> The cause of failure in the other two dialogues is that an utterance that should be combined with its adjoining utterance is abbreviated.</Paragraph>
      <Paragraph position="9"> Figure 6 shows that the 10-best correct rate for  the training set consisting of 689 dialogues was 80%. Therefore the correct rate is 86%, and approximately 93% (80/86) of the dialogues that can be correctly analyzed include the correct tree in their top-10. According to Figure 5, the number of average parse trees increased with the growth of the training data. However, most of the dialogues that can be analyzed correctly are supposed to include the correct tree in their top-10. Therefore, it is enough to refer to the top-10 in a situation where the correct one should be chosen from the set of candidates, such as in the speech prediction and the dialogue control. As a result, the high-speed processing is achieved.</Paragraph>
  </Section>
class="xml-element"></Paper>