<?xml version="1.0" standalone="yes"?>
<Paper uid="A00-1043">
  <Title>Sentence Reduction for Automatic Text Summarization</Title>
  <Section position="4" start_page="311" end_page="313" type="metho">
    <SectionTitle>
3 Evaluation
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="311" end_page="313" type="sub_section">
      <SectionTitle>
3.1 The evaluation scheme
</SectionTitle>
      <Paragraph position="0"> We define a measure called success rate to evaluate the performance of our sentence reduction program.</Paragraph>
      <Paragraph position="1">  Original sentence : When it arrives sometime next year in new TV sets, the V-chip will give parents a new and potentially revolutionary device to block out programs they don't want their children to see.</Paragraph>
      <Paragraph position="2"> Reduction program: The V-chip will give parents a new and potentially revolutionary device to block out programs they don't want their children to see.</Paragraph>
      <Paragraph position="3">  The success rate computes the percentage of the system's reduction decisions that agree with those of humans.</Paragraph>
      <Paragraph position="4"> We compute the success rate in the following way.</Paragraph>
      <Paragraph position="5"> The reduction process can be considered as a series of decision-making steps along the edges of a sentence parse tree. At each node of the parse tree, both the human and the program make a decision whether to remove the node or to keep it. If a node is removed, the subtree with that node as the root is removed as a whole, and thus no decisions are needed for the descendants of the removed node. If the node is kept, we consider that node as the root and repeat this process.</Paragraph>
      <Paragraph position="6">  Suppose we have an input sentence (ABCDEFGH), which has the parse tree shown in Figure 2. Suppose a human reduces the sentence to (ABDGH), which can be translated to a series of decisions made along edges in the sentence parse tree, as shown in Figure 3: a "y" along an edge means the node it points to will be kept, and "n" means the node will be removed. Suppose the program reduces the sentence to (BCD), which can be translated similarly to the annotated tree shown in Figure 4. We can see that along five edges (D→B, D→E, D→G, B→A, and B→C), both the human and the program made decisions. Two out of the five decisions agree (D→B and D→E), so the success rate is 2/5 (40%). The success rate is defined as:

success rate = (# of edges along which the human and the program have made the same decision) / (total # of edges along which both the human and the program have made decisions)

Note that edges along which only the human or only the program has made a decision (e.g., G→F and G→H in Figure 3) are not considered in the computation of the success rate, since there is no agreement issue in such cases.</Paragraph>
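The edge-by-edge comparison above can be sketched in a few lines of Python. The toy tree (root D, with the edges listed in the example) is taken from Figures 2-4; the function and variable names are our own illustration, not the paper's implementation:

```python
def edge_decisions(children, root, kept):
    """Walk the parse tree top-down; record 'y' on an edge whose child
    is kept and 'n' on one whose child is removed.  A removed subtree
    is pruned, so no decisions are recorded for its descendants."""
    decisions = {}
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            if child in kept:
                decisions[(node, child)] = 'y'
                stack.append(child)          # keep descending
            else:
                decisions[(node, child)] = 'n'  # subtree pruned here
    return decisions

def success_rate(human, program):
    """Fraction of jointly decided edges where both agree."""
    shared = set(human) & set(program)
    agree = sum(1 for edge in shared if human[edge] == program[edge])
    return agree / len(shared)

# Toy parse tree from Figure 2 (root D)
children = {'D': ['B', 'E', 'G'], 'B': ['A', 'C'], 'G': ['F', 'H']}
human = edge_decisions(children, 'D', {'A', 'B', 'D', 'G', 'H'})  # Figure 3
prog  = edge_decisions(children, 'D', {'B', 'C', 'D'})            # Figure 4
print(success_rate(human, prog))  # → 0.4
```

Running this on the example reproduces the 2/5 (40%) agreement from the text: the program never descends into G's subtree, so the edges G→F and G→H are decided only by the human and are excluded from the shared set.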
    </Section>
    <Section position="2" start_page="313" end_page="313" type="sub_section">
      <SectionTitle>
3.2 Evaluation result
</SectionTitle>
      <Paragraph position="0"> In the evaluation, we used 400 sentences in the corpus to compute the probabilities that a phrase is removed, reduced, or unchanged. We tested the program on the remaining 100 sentences.</Paragraph>
      <Paragraph position="1"> Using five-fold validation (i.e., choosing a different 100 sentences for testing each time and repeating the experiment five times), the program achieved an average success rate of 81.3%. If we consider the baseline as removing all the prepositional phrases, clauses, to-infinitives, and gerunds, the baseline performance is 43.2%.</Paragraph>
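The five-fold protocol described above can be made concrete with a short sketch; the sentence placeholders are stand-ins, since the corpus itself is not reproduced here:

```python
# Stand-ins for the 500 corpus sentences; the split follows the
# paper's setup of 400 training / 100 test sentences per fold.
sentences = [f"s{i}" for i in range(500)]
folds = [sentences[i * 100:(i + 1) * 100] for i in range(5)]

for held_out in range(5):
    test_set = folds[held_out]
    train_set = [s for j in range(5) if j != held_out for s in folds[j]]
    # each run trains the probabilities on 400 sentences and
    # evaluates the success rate on the held-out 100
    assert len(train_set) == 400 and len(test_set) == 100
```

The reported 81.3% is then the average success rate over the five held-out test sets.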
      <Paragraph position="2"> We also computed the success rate of the program's decisions on particular types of phrases. For decisions on removing or keeping a clause, the system has a success rate of 78.1%; for decisions on removing or keeping a to-infinitive, 85.2%. We found that the system has a low success rate on removing adjectives of noun phrases or adverbs of a sentence or verb phrase. One reason for this is that our probability model can hardly capture the dependencies between a particular adjective and the head noun, since the training corpus is not large enough, while the other sources of information, including grammar and context information, provide little evidence on whether an adjective or an adverb should be removed. Given that removing an adjective or an adverb does not significantly affect the conciseness of the sentence, and that the system lacks reliability in making such decisions, we decided not to remove adjectives and adverbs.</Paragraph>
      <Paragraph position="3"> On average, the system reduced the length of the 500 sentences by 32.7% (based on the number of words), while humans reduced it by 41.8%.</Paragraph>
      <Paragraph position="4"> The probabilities we computed from the training corpus covered 58% of instances in the test corpus.</Paragraph>
      <Paragraph position="5"> When the corpus probability is absent for a case, the system makes decisions based on the other two sources of knowledge.</Paragraph>
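The backoff behavior just described can be sketched as follows; the weights, the 0.5 threshold, and the function name are illustrative assumptions, since the paper does not publish its exact combination rule:

```python
def decide(corpus_prob, grammar_score, context_score):
    """Combine the three knowledge sources into a keep/remove decision.
    When the corpus probability is absent for a case (as for the 42%
    of test instances not covered by training), back off to the
    grammar and context evidence alone."""
    if corpus_prob is not None:
        score = 0.5 * corpus_prob + 0.25 * grammar_score + 0.25 * context_score
    else:
        score = 0.5 * grammar_score + 0.5 * context_score
    return 'keep' if score >= 0.5 else 'remove'
```

The point of the sketch is only the fallback structure: absence of corpus evidence redistributes the decision weight onto the remaining knowledge sources rather than blocking a decision.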
      <Paragraph position="6"> Some of the errors made by the system result from the errors by the syntactic parser. We randomly checked 50 sentences, and found that 8% of the errors made by the system are due to parsing errors.</Paragraph>
      <Paragraph position="7"> Two main reasons account for this relatively low percentage of errors resulting from parsing mistakes. One reason is that we have taken special measures to avoid errors introduced by mistakes in parsing. For example, PP attachment is a difficult problem in parsing, and it is not rare that a PP is wrongly attached. Therefore, we take this into account when marking the obligatory components using subcategorization knowledge from the lexicon (step 2): we look not only at the PPs that are attached to a verb phrase, but also at PPs that are next to the verb phrase but not attached, in case they are part of the verb phrase. We also wrote a pre-processor to deal with particular structures that the parser often has problems with, such as appositions.</Paragraph>
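The PP-gathering step described above can be sketched with a simplified constituent representation; the data layout (label plus word-index span) and the function name are our own illustration, not the paper's data structures:

```python
def candidate_pps(constituents, vp_span):
    """Return PPs to check against the verb's subcategorization frame:
    those the parser attached inside the verb phrase, plus any PP
    starting right at the VP's end (adjacent but attached elsewhere),
    to guard against PP-attachment parser errors.
    constituents: list of (label, start, end) word-index triples."""
    attached, adjacent = [], []
    vp_start, vp_end = vp_span
    for label, start, end in constituents:
        if label != 'PP':
            continue
        if vp_start <= start and end <= vp_end:
            attached.append((start, end))   # parser put it in the VP
        elif start == vp_end:
            adjacent.append((start, end))   # right next to the VP
    return attached + adjacent
```

Considering the adjacent PP as well means a wrongly high attachment cannot cause an obligatory verb argument to be dropped.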
      <Paragraph position="8"> The other reason is that parsing errors do not always result in reduction errors. For example, given the sentence "The spokesperson of the University said that ...", the that-clause may have a complicated structure that the parser gets wrong, yet the reduction system is not necessarily affected, since it may decide in this case to keep the that-clause as it is, as humans often do; the parsing errors then do not matter.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="313" end_page="314" type="metho">
    <SectionTitle>
4 Discussion and related work
</SectionTitle>
    <Paragraph position="0"> The reduction algorithm we present assumes generic summarization; that is, we want to generate a summary that includes the most important information in an article. We can tailor the reduction system to query-based summarization. In that case, the task of the reduction is not to remove phrases that are extraneous to the main topic of an article, but phrases that are not very relevant to users' queries. We extended our sentence reduction program to query-based summarization by adding another step in the algorithm to measure the relevance of users' queries to phrases in the sentence. In the last step of reduction, when the system makes the final decision, the relevance of a phrase to the query is taken into account, together with syntactic, context, and corpus information.</Paragraph>
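The added relevance step could take many forms; the paper does not specify its measure, so the word-overlap ratio below is only an illustrative stand-in with hypothetical names:

```python
def query_relevance(phrase_words, query_words):
    """Illustrative phrase-to-query relevance: the fraction of distinct
    query words that also appear in the phrase.  This is a stand-in
    for the unspecified measure used in the extended system."""
    query = set(query_words)
    if not query:
        return 0.0
    return len(set(phrase_words) & query) / len(query)
```

A score like this could then be combined with the syntactic, context, and corpus evidence in the final keep/remove decision, biasing the system toward keeping phrases that mention query terms.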
    <Paragraph position="1"> Ideally, the sentence reduction module should interact with other modules in a summarization system. It should be able to send feedback to the extraction module if it finds that a sentence selected by the extraction module may be inappropriate (for example, having a very low context importance score).</Paragraph>
    <Paragraph position="2"> It should also be able to interact with the modules that run after it, such as the sentence combination module, so that it can revise reduction decisions according to the feedback from these modules.</Paragraph>
    <Paragraph position="3"> Some researchers suggested removing phrases or clauses from sentences for certain applications.</Paragraph>
    <Paragraph position="4"> (Grefenstette, 1998) proposed to remove phrases in sentences to produce a telegraphic text that can be used to provide audio scanning service for the blind. (Corston-Oliver and Dolan, 1999) proposed to remove clauses in sentences before indexing documents for Information Retrieval. Both studies removed phrases based only on their syntactic categories, while the focus of our system is on deciding when it is appropriate to remove a phrase.</Paragraph>
    <Paragraph position="5"> Other researchers worked on the text simplification problem, which usually involves simplifying text rather than removing phrases. For example, (Carroll et al., 1998) discussed simplifying newspaper text by replacing uncommon words with common words, or replacing complicated syntactic structures with simpler structures, to assist people with reading disabilities. (Chandrasekar et al., 1996) discussed text simplification in general. The difference between these studies and our system is that a text simplification system usually does not remove anything from the original sentence, although it may change its structure or words, whereas our system removes extraneous phrases from the extracted sentences.</Paragraph>
  </Section>
</Paper>