<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-1119">
  <Title>A Sentence Reduction Using Syntax Control</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Most research in automatic summarization has focused on extraction, that is, on identifying the important clauses, sentences, and paragraphs in texts (Inderject Mani and Mark Maybury, 1999). However, when humans produce summaries of documents, they tend to create new sentences that are grammatical, that cohere with one another, and that capture the most salient information in the original document. Sentence reduction is the problem of removing redundant words or phrases from an original sentence to create a new sentence in which the gist of the original is unchanged.</Paragraph>
    <Paragraph position="1"> Methods of sentence reduction have been used in many applications. Grefenstette (G. Grefenstette, 1998) proposed removing phrases from sentences to produce a telegraphic text that can be used to provide audio scanning services for the blind. Olivers and Dolan (S.H. Olivers and W.B. Dolan, 1999) proposed removing clauses from sentences before indexing documents for information retrieval. Those methods remove phrases based on their syntactic categories but do not rely on the context of the surrounding words, phrases, and sentences. Ignoring that information can reduce the accuracy of sentence reduction. Mani and Maybury also present a process of writing a reduced sentence by revising the original sentence with a set of revision rules to improve the performance of summarization (Inderject Mani and Mark Maybury, 1999).</Paragraph>
    <Paragraph position="2"> Jing and McKeown (H. Jing, 2000) studied a new method that removes extraneous phrases from sentences by using multiple sources of knowledge to decide which phrases in a sentence can be removed. The sources include syntactic knowledge, context information, and statistics computed from a corpus of examples written by human professionals. Their method avoids removing phrases that are related to the surrounding context and produces grammatical sentences.</Paragraph>
    <Paragraph position="3"> Recently, Knight and Marcu (K. Knight and D. Marcu, 2002) demonstrated two methods for the sentence compression problem, which is similar to sentence reduction. They devised both a noisy-channel approach and a decision-tree approach to the problem. The noisy-channel framework has been used in many applications, including speech recognition, machine translation, and information retrieval. The decision-tree approach has been used to parse sentences (D. Magerman, 1995) (Ulf Hermijakob and J. Mooney, 1997) and to determine the rhetorical structure of text documents (Daniel Marcu, 1999).</Paragraph>
    <Paragraph position="4"> Most previous methods produce only a shorter sentence that keeps the word order of the original sentence and is written in the same language, e.g., English.</Paragraph>
    <Paragraph position="5"> When non-native speakers reduce a long sentence in a foreign language, they usually try to map the meanings of the words in the original sentence onto meanings in their own language. In addition, in some cases the word order of the reduced sentence differs from that of the original sentence. A non-native speaker therefore produces two reduced sentences: one in the foreign language and another in their own language.</Paragraph>
    <Paragraph position="6"> Following the behavior of non-native speakers, two new requirements arise for the sentence reduction problem: 1) The word order of the reduced sentence may differ from that of the original sentence.</Paragraph>
    <Paragraph position="7"> 2) Two reduced sentences in two different languages can be generated.</Paragraph>
    <Paragraph position="8"> With these two new perspectives, sentence reduction is useful for many applications, such as information retrieval, query text summarization, and especially cross-language information retrieval.</Paragraph>
    <Paragraph position="9"> To satisfy these new requirements, we propose a new algorithm that uses semantic information to simulate the behavior of non-native speakers. The semantic information obtained from the original sentence is integrated into the syntax tree through syntax control. The remainder of this paper is organized as follows: Section 2 describes a method that uses syntax control to reduce sentences. Section 3 presents the implementation and experiments. Section 4 gives conclusions and the remaining problems to be solved in the future.</Paragraph>
  </Section>
</Paper>