<?xml version="1.0" standalone="yes"?> <Paper uid="P98-1065"> <Title>Thematic segmentation of texts: two methods for two kinds of texts</Title> <Section position="5" start_page="395" end_page="395" type="concl"> <SectionTitle> 9. Conclusion </SectionTitle> <Paragraph position="0"> Starting from a first method that takes paragraphs as basic units and computes a similarity measure between adjacent paragraphs to build larger thematic units, we have developed a second method on the same principles that uses a lexical collocation network to augment the vector representation of the paragraphs. We have shown that this second method, although well suited to texts such as newspaper articles, gives poorer results on scientific texts, because the addition of related words makes the characteristic terms emerge less clearly than in the first method. Therefore, in order to build a text segmentation system that is independent of the kind of text processed, we have proposed performing a shallow analysis of the text's characteristics to select the suitable method.</Paragraph> </Section> </Paper>