File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/00/a00-2004_concl.xml

Size: 1,590 bytes

Last Modified: 2025-10-06 13:52:39

<?xml version="1.0" standalone="yes"?>
<Paper uid="A00-2004">
  <Title>Advances in domain independent linear text segmentation</Title>
  <Section position="6" start_page="29" end_page="31" type="concl">
    <SectionTitle>
5 Conclusions and future work
</SectionTitle>
    <Paragraph position="0"> A segmentation algorithm has two key elements: a clustering strategy and a similarity measure. Our results show that divisive clustering (R98) is more precise than sliding window (H94) and lexical chains (K98) for locating topic boundaries.</Paragraph>
    <Paragraph position="1"> Four similarity measures were examined. The cosine coefficient (R98(s,cos)) and dot density measure (R98(m,dot)) yield similar results. Our spread activation based semantic measure (R98( ..... ,)) improved accuracy. This confirms that although Kozima's approach (Kozima, 1993) is computationally expensive, it does produce more precise segmentation. The most significant improvement was due to our ranking scheme, which linearises the cosine coefficient. Our experiments demonstrate that, given insufficient data, the qualitative behaviour of the cosine measure is indeed more reliable than the actual values. Although our evaluation scheme is sufficient for this comparative study, further research requires a large scale, task independent benchmark. It would be interesting to compare C99 with the multi-source method described in (Beeferman et al., 1999) using the TDT corpus. We would also like to develop a linear time and multi-source version of the algorithm.</Paragraph>
  </Section>
</Paper>