<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-3001">
  <Title>Sydney, July 2006. ©2006 Association for Computational Linguistics. A Flexible Approach to Natural Language Generation for Disabled Children</Title>
  <Section position="7" start_page="4" end_page="5" type="evalu">
    <SectionTitle>
4 Evaluation
</SectionTitle>
    <Paragraph position="0"> The main goal of our system is to provide a communication aid for disabled children, so its performance metrics concentrate on measuring the communication rate, which has little importance from an NLG point of view. To evaluate our system from an NLG point of view, we emphasize the expressiveness and ease of use of the system.</Paragraph>
    <Paragraph position="1"> Expressiveness is measured by the percentage of sentences intended by the user that were successfully generated by our system. Ease of use is measured by the average number of inputs needed to generate each sentence.</Paragraph>
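    <Paragraph position="2"> The two measures can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the Trial record, the function names and the sample data are hypothetical and not part of the system described here, and a sentence is counted as successfully generated only when the output matches the intended sentence exactly.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One attempt to produce a sentence (hypothetical record layout)."""
    intended: str     # sentence the user wanted to say
    generated: str    # sentence the system actually produced
    num_inputs: int   # words plus tense, mood and sense choices entered


def expressiveness(trials):
    """Percentage of intended sentences that were generated exactly."""
    hits = sum(1 for t in trials if t.generated == t.intended)
    return 100.0 * hits / len(trials)


def ease_of_use(trials):
    """Average number of inputs needed per sentence."""
    return sum(t.num_inputs for t in trials) / len(trials)


# Made-up illustrative data, not taken from the evaluation corpus.
trials = [
    Trial("I want water", "I want water", 4),
    Trial("She is reading a book", "She is reading a book", 5),
    Trial("It is summer", "It is the summer", 3),
    Trial("He went home", "He went home", 4),
]
print(expressiveness(trials))
print(ease_of_use(trials))
```

Exact string matching is the strictest possible success criterion; a looser comparison (for example, ignoring case or punctuation) would be an equally reasonable assumption.</Paragraph>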
    <Section position="1" start_page="4" end_page="5" type="sub_section">
      <SectionTitle>
4.1 Measuring Expressiveness
</SectionTitle>
      <Paragraph position="0"> To learn the types of sentences our intended users use during conversation, we first analyzed the communication boards used by disabled children. We then took part in actual conversations with spastic children at a Cerebral Palsy institute. Finally, we interviewed their teachers and communication partners.</Paragraph>
      <Paragraph position="1"> Based on our research, we developed a list of around 1000 sentences that covers all types of sentences used during conversation. This list is used as a corpus in both the development and evaluation stages of our system. During development, the corpus is used to derive the necessary templates and to classify them (see Sec. 2.1). After development, we tested the scope of our system by generating sentences that did not appear verbatim in our corpus but occurred in sample conversations of the intended users.</Paragraph>
      <Paragraph position="2"> In 96% of cases, the system successfully generated the intended sentence. After analyzing the remaining 4% of sentences, we identified the following problems at the current implementation stage.</Paragraph>
      <Paragraph position="3"> • The system cannot handle gerunds as the object of a preposition (e.g. He ruins his eyes by reading small letters).</Paragraph>
      <Paragraph position="4"> • The system cannot yet generate a correct sentence with an introductory 'It' (e.g. It is summer). In such cases the sentence is generated correctly when 'It' is given as an agent, which is not the intended usage.</Paragraph>
    </Section>
    <Section position="2" start_page="5" end_page="5" type="sub_section">
      <SectionTitle>
4.2 Measuring Ease of Use
</SectionTitle>
      <Paragraph position="0"> To assess the ease of use of the system, we measured the number of inputs given by the user to generate each sentence. The inputs consist of the words, tense choice, mood option and sense choice given by the user. We then plotted the number of inputs against the number of words for each sentence.</Paragraph>
      <Paragraph position="1"> Fig. 3 shows the plot. It can be observed from the plot that as the number of words increases (i.e. for longer sentences), the ratio of the number of inputs to the number of words decreases. So the effort required from the user does not grow remarkably with sentence length. The overall communication rate is found to be 5.52 words/min (27.44 characters/min), which is better than the rate reported in (Stephanidis, 2003). We also observed that the communication rate increases with longer conversations.</Paragraph>
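      <Paragraph position="3"> The relation between sentence length and input effort can be sketched as follows. This is a minimal illustration, not the procedure used to produce Fig. 3: the function name and the sample measurements are hypothetical, and the samples assume a fixed overhead of two option choices per sentence plus one input per word. The last lines only divide the two reported communication rates to obtain the average word length they imply.

```python
from collections import defaultdict


def inputs_per_word_by_length(sentences):
    """Map sentence length (in words) to the mean inputs-per-word ratio.

    `sentences` is a list of (num_words, num_inputs) pairs.
    """
    ratios = defaultdict(list)
    for num_words, num_inputs in sentences:
        ratios[num_words].append(num_inputs / num_words)
    return {n: sum(r) / len(r) for n, r in sorted(ratios.items())}


# Made-up measurements (hypothetical): two option choices per sentence
# plus one input per word, so num_inputs = num_words + 2.
samples = [(2, 4), (2, 4), (4, 6), (8, 10)]
by_length = inputs_per_word_by_length(samples)
print(by_length)

# Average word length implied by the reported rates of
# 27.44 characters/min and 5.52 words/min.
chars_per_word = 27.44 / 5.52
print(round(chars_per_word, 2))
```

Under the fixed-overhead assumption the ratio falls as sentences get longer, which is the trend described for the plot; real measurements need not follow this exact form.</Paragraph>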
    </Section>
  </Section>
</Paper>