<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-0702">
  <Title>[6] Defense Advanced Research Projects Agency Fourth Message Understanding Conference (MUC-4), McLean, Virginia, 1992 Software and Intelligent Systems</Title>
  <Section position="8" start_page="3" end_page="8" type="concl">
    <SectionTitle>
5 Related and future work
</SectionTitle>
    <Paragraph position="0"> Our framework clearly attempts to balance the conflicting requirements of the two primary approaches to the document summarisation task. By design, we target any text type, document genre, and domain of discourse, and thus compromise by forgoing in-depth analysis of the full meaning of the document. On the other hand, our content characterisation procedure remains closer to the core meaning than the approximations offered by traditional passage extraction algorithms, with certain sentence- or paragraph-sized passages deemed indicative of content by means of similarity scoring metrics. By choosing a phrasal granularity of representation, rather than a sentence- or paragraph-based one, we can obtain a more refined view into highly relevant fragments of the source; this also offers finer-grained control for adjusting the level of detail in capsule overviews. Exploiting a notion of discourse contiguity and coherence for the purposes of full source coverage and continuous context maintenance ensures that the entire text of the document is uniformly represented in the overview. Finally, by utilising a strong linguistic notion of salience, the procedure can build a richer representation of the discourse objects, and exploit this for informed decisions about their prominence, importance, and ultimately topicality; salience thus becomes central to deriving a strong sense of a document's &quot;aboutness&quot;. At present, salience calculations are driven from contextual analysis and syntactic considerations focusing on discourse objects and their behaviour in the text. Given the power of our phrasal grammars, however, it is conceivable to extend the framework to identify, explicitly represent, and similarly rank, higher order expressions (e.g. events, or properties of objects). This may not ultimately change the appearance of a capsule overview; however, it will allow for even more informed judgements about relevance of discourse entities. More importantly, it is a necessary step towards developing more sophisticated discourse processing techniques (such as those discussed in Sparck Jones [28]), which are ultimately essential for the automatic construction of true summaries. Currently, we analyse individual documents; unlike McKeown and Radev [21], there is no notion of calculating salience across the boundaries of more than one document, even if we were to know in advance that they are somehow related. However, we are experimenting with using topic stamps as representation and navigation &quot;labels&quot; in a multi-document space; we thus plan to fold in awareness of document boundaries (as an extension to tracking the effects of discourse segment boundaries within a single document). Even though the approach presented here can be construed, in some sense, as a type of passage extraction, it is considerably less exposed to problems like pronouns out of context, or discontinuous sentences presented as contiguous passages (cf. Paice [22]). This is a direct consequence of the fact that we employ anaphora resolution to construct a discourse model with explicit representation of objects, and use syntactic criteria to extract coherent phrasal units. For the same reason, topic stamps are quantifiably adequate content abstractions: see Kennedy and Boguraev [13] for evaluation of the anaphora resolution algorithm. We are also in the process of designing a user study to determine the utility, from a usability point of view, of capsule overviews as defined here. Recent work in summarisation has begun to focus closer on the utility of document fragments with granularity below that of a sentence. Thus McKeown and Radev [21] pro-actively seek, and use to great leverage, certain cue phrases which denote specific rhetorical and/or inter-document relationships. Mahesh [18] uses phrases as &quot;sentence surrogates&quot;, in a process called sentence simplification; his rationale is that with hypertext, a phrase can be used as a place-holder for the complete sentence, and/or is more conveniently manipulated, compared to a sentence. Even in passage extraction work, notions of multi-word expressions have found use as one of several features driving a statistical classifier scoring sentences for inclusion in a sentence-based summary (Kupiec et al. [15]). In all of these examples, the use of a phrase is somewhat peripheral to the fundamental assumptions of the particular approach; more to the point, it is a different kind of object that the summary is composed from (a template, in the case of [21]), or that the underlying machinery is seeking to identify (sentences, in the case of [18] and [15]). In contrast, our adoption of phrasal expressions as the atomic building blocks for capsule overviews is central to the design; it drives the entire analysis process, and is the underpinning for our discourse representation.</Paragraph>
  </Section>
</Paper>