<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-1119">
  <Title>A Sentence Reduction Using Syntax Control</Title>
  <Section position="4" start_page="0" end_page="0" type="evalu">
    <SectionTitle>
3 Experiments and Discussion
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Experiment Data
</SectionTitle>
      <Paragraph position="0"> We used the same corpus as K. Knight and D. Marcu (2002), which contains 1,067 pairs, each consisting of a sentence and its reduction.</Paragraph>
      <Paragraph position="1"> We manually changed the word order of some reduced sentences in that corpus while keeping their meaning.</Paragraph>
      <Paragraph position="2"> We manually built a set of syntax controls for that corpus, for use by our reduction algorithm based on syntax control. The set of semantic symbols includes, for example, HUMAN, ANIMAL, and THINGS. We also created 100 sentence pairs in which the word order of the reduced sentence differs from that of its original sentence. These pairs were then combined with the corpus above in order to confirm that our method can deal with the changeable word order problem.</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Experiment Method
</SectionTitle>
      <Paragraph position="0"> To evaluate our reduction algorithms, we randomly selected 32 sentence pairs from our parallel corpus, which we will refer to as the Test corpus.</Paragraph>
      <Paragraph position="1"> We used the remaining 1,035 sentence pairs to train the decision-tree-based reduction algorithm. We used the Test corpus to confirm that our methods, which use semantic information, outperform the decision tree method without semantic information (K. Knight and D. Marcu, 2002). We presented each original sentence in the Test corpus to three judges, who are Vietnamese and specialize in English, together with three reductions of it: the human-generated reduction, the output of the sentence reduction based on syntax control, and the output of the baseline algorithm. The judges were told that all outputs were generated automatically. The order of the outputs was scrambled randomly across test cases. The judges participated in two experiments. In the first experiment, they were asked to rate on a scale from 1 to 10 how well the systems did with respect to selecting the most important words in the original sentence. In the second experiment, they were asked to rate on a scale from 1 to 10 how grammatical the outputs were. The outputs of our methods include reduced sentences in both English and Vietnamese. In the third experiment, we tested on 32 sentences randomly selected from the 100 sentences whose word order differs between input and output.</Paragraph>
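      <Paragraph position="2"> The random hold-out and the scrambled presentation order described above can be sketched as follows. This is only an illustrative sketch, not the procedure actually used: the function names split_corpus and scramble_outputs, the fixed seed, and the placeholder corpus are all assumptions introduced here, and only the sizes (1,067 pairs, 32 held out, 1,035 for training) come from the text.

```python
import random


def split_corpus(pairs, test_size=32, seed=0):
    """Randomly hold out test_size pairs; all other pairs are used for training.

    The seed is an assumption, added only to make the sketch reproducible.
    """
    rng = random.Random(seed)
    test_idx = set(rng.sample(range(len(pairs)), test_size))
    train = [p for i, p in enumerate(pairs) if i not in test_idx]
    test = [p for i, p in enumerate(pairs) if i in test_idx]
    return train, test


def scramble_outputs(outputs, rng):
    """Return the (system, reduction) items in a random order.

    A judge would be shown only the reductions, in this order; the system
    labels are kept so that scores can be mapped back afterwards.
    """
    items = list(outputs.items())
    rng.shuffle(items)
    return items


# Placeholder corpus with the size reported in Section 3.1 (hypothetical content).
corpus = [("original %d" % i, "reduction %d" % i) for i in range(1067)]
train, test = split_corpus(corpus)
```

Calling scramble_outputs once per test sentence with a shared random.Random instance gives a different presentation order for each test case, so a judge cannot identify a system by its position.</Paragraph>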
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.3 Experiment Results
</SectionTitle>
      <Paragraph position="0"> The first and second experiments produced the following two tables of results.</Paragraph>
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.4 Discussion
</SectionTitle>
      <Paragraph position="0"> Table 1 compares the compression of the three reduction methods with that of the human reductions for English. The grammaticality of the semantic control method was high because we used syntax control built by human experts. The decision-based sentence reduction yielded the lowest result. We suspect that the word order requirement may affect grammaticality. Table 1 and Table 3 also indicate that our new method outperforms the baseline algorithm on word importance, owing to semantic information: our method uses semantic information to avoid deleting important words. In our view, the baseline method should integrate semantic information from the original sentence to improve its accuracy.</Paragraph>
      <Paragraph position="1"> Table 2 shows the outputs of our method in Vietnamese; the baseline method cannot generate output in Vietnamese. The syntax control method achieved reasonably good results in both grammaticality and importance.</Paragraph>
      <Paragraph position="2"> The comparison row in Table 1 and Table 2 also shows that the baseline yields shorter output than the syntax control method.</Paragraph>
      <Paragraph position="3"> Table 3 shows that, when we randomly selected 32 sentence pairs from the 100 pairs whose word order differs between input and output, the results of the syntax control method changed only slightly, while the baseline method achieved a low result. This is because the syntax control method is based on rule knowledge, while the baseline was not able to learn from that corpus.</Paragraph>
    </Section>
  </Section>
</Paper>