<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-1004">
  <Title>Minimum Cut Model for Spoken Lecture Segmentation</Title>
  <Section position="4" start_page="25" end_page="25" type="intro">
    <SectionTitle>
2 Previous Work
</SectionTitle>
    <Paragraph position="0"> Most unsupervised algorithms assume that fragments of text with homogeneous lexical distribution correspond to topically coherent segments.</Paragraph>
    <Paragraph position="1"> Previous research has analyzed various facets of lexical distribution, including lexical weighting, similarity computation, and smoothing (Hearst, 1994; Utiyama and Isahara, 2001; Choi, 2000; Reynar, 1998; Kehagias et al., 2003; Ji and Zha, 2003).</Paragraph>
    <Paragraph position="2"> The focus of our work, however, is on an orthogonal yet fundamental aspect of this analysis: the impact of long-range cohesion dependencies on segmentation performance. In contrast to previous approaches, the homogeneity of a segment is determined not only by the similarity of its words, but also by their relation to words in other segments of the text. We show that optimizing our global objective enables us to detect subtle topical changes.</Paragraph>
    <Paragraph position="3"> Graph-Theoretic Approaches in Vision Segmentation
Our work is inspired by minimum-cut-based segmentation algorithms developed for image analysis. Shi and Malik (2000) introduced the normalized-cut criterion and demonstrated its practical benefits for segmenting static images.</Paragraph>
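For reference, the normalized-cut criterion of Shi and Malik (2000) scores a two-way partition of a graph's node set V into A and B as:

$$\mathrm{Ncut}(A,B) = \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(A,V)} + \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(B,V)}$$

where cut(A, B) is the total weight of edges crossing the partition and assoc(A, V) is the total weight of all edges incident to nodes in A. Normalizing by each side's volume penalizes cuts that isolate small, weakly connected fragments.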
    <Paragraph position="4"> Our method, however, is not a simple application of the existing approach to a new task. First, in order to make it work in the new linguistic framework, we had to redefine the underlying representation and introduce a variety of smoothing and lexical weighting techniques. Second, the computational techniques for finding the optimal partitioning are also quite different. Since the minimization of the normalized cut is NP-complete in the general case, researchers in vision have to approximate this computation. Fortunately, we can find an exact solution due to the linearity constraint on text segmentation.</Paragraph>
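To illustrate why the linearity constraint permits an exact solution, here is a minimal dynamic-programming sketch (not the authors' implementation): when segments must be contiguous spans, the per-segment normalized-cut terms add independently, so the best segmentation into k spans can be found by searching over boundary positions. The similarity matrix `W` and all function names are illustrative assumptions.

```python
import numpy as np

def ncut_segment(W, k):
    """Exact minimum of the segment-wise normalized cut over contiguous
    segmentations (illustrative sketch, not the paper's implementation).

    W: symmetric (n x n) sentence-similarity matrix.
    k: number of segments.
    Returns the k-1 internal boundary indices minimizing
    sum_i cut(A_i, V \\ A_i) / vol(A_i) over contiguous spans A_i.
    """
    n = W.shape[0]

    # 2-D prefix sums: prefix[b, d] = sum of W[:b, :d].
    prefix = np.zeros((n + 1, n + 1))
    prefix[1:, 1:] = W.cumsum(0).cumsum(1)

    def block(a, b, c, d):  # sum of W[a:b, c:d]
        return prefix[b, d] - prefix[a, d] - prefix[b, c] + prefix[a, c]

    def cost(a, b):  # cut(span, rest) / vol(span) for span [a, b)
        vol = block(a, b, 0, n)            # edges from span to all nodes
        cut = vol - block(a, b, a, b)      # minus edges inside the span
        return cut / vol if vol > 0 else 0.0

    # dp[j][m]: best score for segmenting the first j sentences into m spans.
    INF = float("inf")
    dp = [[INF] * (k + 1) for _ in range(n + 1)]
    back = [[0] * (k + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                score = dp[i][m - 1] + cost(i, j)
                if score < dp[j][m]:
                    dp[j][m], back[j][m] = score, i

    # Recover internal boundaries by walking the back-pointers.
    bounds, j = [], n
    for m in range(k, 0, -1):
        bounds.append(j)
        j = back[j][m]
    return sorted(bounds)[:-1]
```

The search is polynomial (O(n^2 k) boundary choices) precisely because candidate segments are intervals, whereas unconstrained graph partitions are exponential in number, which is where the NP-completeness of general normalized-cut minimization arises.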
  </Section>
</Paper>