<?xml version="1.0" standalone="yes"?>
<Paper uid="P93-1041">
  <Title>TEXT SEGMENTATION BASED ON SIMILARITY BETWEEN WORDS</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
INTRODUCTION
</SectionTitle>
    <Paragraph position="0"> A text is not just a sequence of words; it has a coherent structure. The meaning of each word cannot be determined until it is placed in the structure of the text. Recognizing the structure of a text is an essential task in text understanding, especially in resolving anaphora and ellipsis.</Paragraph>
    <Paragraph position="1"> One of the constituents of text structure is the text segment. A text segment, whether or not it is explicitly marked (as sentences and paragraphs are), is defined as a sequence of clauses or sentences that display local coherence. It resembles a scene in a movie, which describes the same objects in the same situation.</Paragraph>
    <Paragraph position="2"> This paper proposes an indicator, called the lexical cohesion profile (LCP), which locates segment boundaries in a narrative text. LCP is a record of the lexical cohesiveness of words in a sequence of text. Lexical cohesiveness is defined as word similarity (Kozima and Furugori, 1993) computed by spreading activation on a semantic network. The hills and valleys of LCP correlate closely with changes of segment.</Paragraph>
  </Section>
</Paper>