<?xml version="1.0" standalone="yes"?> <Paper uid="W98-0204"> <Title>Texplore exploring expository texts via hierarchical representation</Title> <Section position="6" start_page="30" end_page="30" type="concl"> <SectionTitle> 5 Conclusions and future plans </SectionTitle> <Paragraph position="0"> We propose a new approach for dynamic presentation of the content of expository text based on uncovering and visualizing its hierarchical structure. Using this &quot;electronic&quot; table-of-contents, the user has the advantage of exploring the text while staying within the full context of the exploration path. The user also has full control over the granularity of the displayed information. These characteristics are beneficial both for navigating the text and for communicating its content, while overcoming drawbacks of existing summarization methods.</Paragraph> <Paragraph position="1"> The weakest point in Texplore is the generation of headings. The current approach is too simplistic, both in the criteria for selecting NPs and in the way they are composed into headings.</Paragraph> <Paragraph position="2"> We have analyzed the way headings are formed by human authors (Yaari et al., ) and used the results to build a machine-learning system that identifies the NPs of a given section using multiple sources of information. The system constructs headings for the text hierarchy using a fixed set of syntactic formats (found to be common in heading syntax). We are in the process of integrating this system into Texplore.</Paragraph> <Paragraph position="3"> The hierarchical structure segmentation is also too simplistic, based solely on the proximity of term vectors. Again, we are working on a machine learning system that uses a set of structured articles to learn segmentation rules.</Paragraph> <Paragraph position="4"> The basic approach is to divide the task into two steps: determining the boundaries and forming the hierarchy. 
We are using various cohesion cues, associated with each paragraph, as the learning attributes: lexical similarity, cue tags, cue words, number of starting and continuing lexical chains, etc.</Paragraph> <Paragraph position="5"> Using machine learning has the advantage of a built-in evaluation against the segmentation done by human subjects. We also plan to evaluate the usefulness of the hierarchical presentation in terms of reading effectiveness.</Paragraph> </Section> </Paper>