<?xml version="1.0" standalone="yes"?> <Paper uid="A97-1042"> <Title>Identifying Topics by Position</Title> <Section position="3" start_page="0" end_page="283" type="relat"> <SectionTitle> 2 Related Work </SectionTitle> <Paragraph position="0"> Edmundson's work (Edmundson, 1969) laid the groundwork for the Position Method. He introduced four clues for identifying significant words (topics) in a text. Among them, Title and Location are related to the Position Method. Edmundson assigned positive weights to sentences according to their ordinal position in the text, giving the most weight to the first sentence in the first paragraph and the last sentence in the last paragraph. He conducted seventeen experiments to verify the significance of these methods. According to his results, the Title and Location methods scored around 40% and 53% accuracy, respectively, where accuracy was measured as the coselection rate between sentences selected by Edmundson's program and sentences selected by a human.</Paragraph> <Paragraph position="1"> Although Edmundson's work is fundamental, his experiments used only 200 documents for training and another 200 documents for testing. Furthermore, he did not try out other possible combinations, such as the second and third paragraphs or the second-last paragraph. To determine where the important words are most likely to be found, Baxendale (Baxendale, 1958) conducted an investigation of a sample of 200 paragraphs. The investigation found that in 85% of paragraphs the topic sentence was the first sentence and in 7% it was the final one. Donlan (Dolan, 1980) stated that a study of topic sentences in expository prose showed that only 13% of paragraphs by contemporary professional writers began with topic sentences (Braddock, 1974). 
Singer and Donlan (Singer and Dolan, 1980) maintain that a paragraph's main idea can appear anywhere in the paragraph, or not be stated at all.</Paragraph> <Paragraph position="2"> Paijmans (Paijmans, 1994) conducted experiments on the relation between a word's position in a paragraph and its significance and arrived at a negative conclusion, finding that &quot;words with a high information content according to the tf.idf-based weighting schemes do not cluster in the first and the last sentences of paragraphs or in paragraphs that consist of a single sentence, at least not to such an extent that such a feature could be used in the preparation of indices for Information Retrieval purposes.&quot; In contrast, psychological studies by Kieras (Kieras, 1985) confirmed the importance of the position of a mention within a text.</Paragraph> </Section> </Paper>