<?xml version="1.0" standalone="yes"?> <Paper uid="W00-0410"> <Title>Using Summarization for Automatic Briefing Generation</Title> <Section position="4" start_page="101" end_page="101" type="metho"> <SectionTitle> 3 Intelligent Multimedia Presentation Generation </SectionTitle> <Paragraph position="0"> The author of a briefing may choose to flesh out as little of the tree as desired, with the caveat that the temporal ordering relations for non-narrative nodes need to be provided by her.</Paragraph> <Paragraph position="1"> When a media object is generated at a node by a create goal, the running text and captions are generated by the system. The motivation for this is obvious: when a summarization filter (which is a program under our control) is generating a media object, we can often provide sufficient meta-information about that object to generate a short caption and some running text. By default, all segues and spatial layout relations are also specified by the system, so the author does not have to know about these unless she wants to.</Paragraph> <Paragraph position="2"> Finally, the decision as to when to produce audio, when not specified by the author, is left to the system.</Paragraph> <Paragraph position="3"> When summarization filters are used (for create goals), the media type of the output is specified as a parameter to the filter. This media type may be converted to some other type by the system, e.g., text-to-speech conversion using Festival (Taylor et al. 98). By default, all narrative nodes attempt to realize their goals as a speech media type, using rules based on text length and truncatability to less than 250 bytes to decide when to use text-to-speech. The truncation algorithm is based on dropping syntactic constituents, using a method similar to (Mani et al. 99). Captions are always realized, in addition, as text (i.e., they have a text realization and a possible audio realization).</Paragraph> <Paragraph position="4"> Spatial layout is decided in the Presentation Generator, after all the individual media objects are created along with their temporal constraints by the Content Executor. The layout algorithm walks through the temporal ordering in sequence, allocating a segment to each set of objects that is designated to occur simultaneously (grouped by par in the temporal constraints). Each segment can have up to 4 frames, in each of which a media object is displayed (thus, no more than 4 media objects can be displayed at the same time). Since media objects declared to be simultaneous (using par) in the temporal constraints will go together in a separate segment, the temporal constraints determine which elements are grouped together in a segment. The layout within a segment handles two special cases: captions are placed directly underneath their associated media object, and running text, when realized as text, is placed beside the media object being described, so that the two are paired visually. Thus, the coherence of a segment is influenced mainly by the temporal constraints (which have been fleshed out by the Content Creator to include narrative nodes), with further handling of special cases.</Paragraph>
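<Paragraph> In outline, the segment-allocation step can be sketched as follows (a minimal Java illustration; the class and method names, such as MediaObject, Segment, and layout, are assumptions for exposition, and the par groups are taken as already extracted from the temporal constraints):

    import java.util.ArrayList;
    import java.util.List;

    class MediaObject {
        final String id;
        MediaObject(String id) { this.id = id; }
    }

    class Segment {
        static final int MAX_FRAMES = 4;  // at most 4 media objects shown at once
        final List<MediaObject> frames = new ArrayList<>();
    }

    class LayoutSketch {
        // temporalOrder: each inner list is one "par" group, i.e., media objects
        // declared simultaneous in the temporal constraints; groups are visited
        // in temporal sequence and each group gets its own segment.
        static List<Segment> layout(List<List<MediaObject>> temporalOrder) {
            List<Segment> segments = new ArrayList<>();
            for (List<MediaObject> parGroup : temporalOrder) {
                Segment seg = new Segment();
                for (MediaObject obj : parGroup) {
                    if (seg.frames.size() < Segment.MAX_FRAMES) {
                        seg.frames.add(obj);  // one frame per media object
                    }
                }
                segments.add(seg);
            }
            return segments;
        }
    }

The placement of captions and running text within a segment is then handled as the special cases noted above. </Paragraph>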
<Paragraph position="5"> Of course, an individual summarization filter may choose to coordinate component multimedia objects in particular ways in the course of generating a composite multimedia object.</Paragraph> <Paragraph position="6"> Details such as the duration and onset of particular frames are specified in the translation to SMIL. Duration is determined by the number of frames present in a segment, unless there is an audio media object in the segment (this media object may have a spatial representation, e.g., as an audio icon, or it may not). If an audio media object occurs in a frame, the duration of all media objects in that segment is equal to the total length of the audio files in the segment. If there is no audio present in a segment, the duration is a fixed interval in seconds (5 by default) times the number of frames created.</Paragraph> </Section> <Section position="5" start_page="101" end_page="104" type="metho"> <SectionTitle> 4 Summarization Filters </SectionTitle> <Paragraph position="0"> As mentioned above, create goals are satisfied by summarization filters, which create new media objects summarizing information sources. These programs are called summarization filters because, in the course of condensing information, they take input information and turn it into a more abstract and useful representation, filtering out unimportant information. Such filters provide a novel way of carrying out content selection and creation for automated presentation generation.</Paragraph> <Paragraph position="1"> Our approach relies on component-based software composition, i.e., the assembly of software units that have contractually specified interfaces and that can be independently deployed and reused. The idea of assembling complex language processing programs out of simpler ones is hardly new; however, by employing current industry standards to specify the interaction between the components, we simultaneously increase the robustness of the system, ensure the reusability of individual components, and create a more fully plug-and-play capability. Among the core technology standards that support this plug-and-play component assembly capability are (a) Java interfaces, used to specify functions that all summarization components must implement in order to be used in the system, (b) the JavaBeans standard, which allows the parameters and methods of individual components to be inspected by the system and revealed to the users, and (c) the XML markup standard, which we have adopted as an inter-component communication language. Using these technologies, legacy or third-party summarizers are incorporated into the system by &quot;wrapping&quot; them so as to meet the interface specification of the system. These technologies also make possible a graphical environment for assembling and configuring complex summarization filters from individual summarization components.</Paragraph> <Paragraph position="2"> Among the most important wins over the traditional &quot;piping&quot; approach to filter assembly is the ability to impose build-time restrictions on the component assembly, disallowing &quot;illegal&quot; compositions, e.g., component X cannot provide input to component Y unless X's output type corresponds to Y's input type (see the sketch below). Build-time restrictions such as these play a clear role in increasing the overall robustness of the run-time summarization system.</Paragraph>
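<Paragraph> By way of illustration, such a build-time legality check can be as simple as comparing declared input/output types before two components are wired together (a minimal sketch; the SummarizationComponent interface and method names are assumptions for exposition, not the system's actual API):

    interface SummarizationComponent {
        Class<?> inputType();    // declared type of the component's input
        Class<?> outputType();   // declared type of the component's output
    }

    class CompositionChecker {
        // Reject "illegal" compositions at build time: X may feed Y only if
        // X's output type corresponds to Y's input type.
        static void connect(SummarizationComponent x, SummarizationComponent y) {
            if (!y.inputType().isAssignableFrom(x.outputType())) {
                throw new IllegalArgumentException(
                    "Illegal composition: " + x.outputType().getName()
                    + " cannot feed " + y.inputType().getName());
            }
            // ...otherwise, wire x's output to y's input...
        }
    }

</Paragraph>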
<Paragraph position="3"> Another build-time win lies in the ability of JavaBeans to be serialized, i.e., written to disk in such a way as to preserve the state of their parameter settings, ensuring that every component in the system can be configured and run at different times, independently of whether the component provides a parameter file facility.</Paragraph> <Paragraph position="4"> Establishing the standard functions required of a summarization filter is challenging on several fronts. One class of functions required by the interface is necessary to handle the technicalities of exchanging information between otherwise discrete components. This set includes functions for discovering a component's input and output types, for handling messages, exceptions and events passed between components, and for interpreting XML based on one or more system-wide document type definitions (DTDs). The other, more interesting set of functions gets to the core of summarization functionality. Selecting these functions involves identifying parameters likely to be broadly applicable across most or all summarizers and finding ways to group them and/or to generalize them. This is desirable in order to reduce the burden on the end user of understanding the subtle differences between the various settings in the summarizers available to her.</Paragraph> <Paragraph position="5"> An example of the difficulty inherent in this endeavor is provided by the compression (summary length divided by source length) vs. reduction (the 1's complement of compression) vs. target length paradigm. Different summarizers will implement one or more of these. The wrapper maps from the high-level interface function, where the application/user can specify either compression or target length, but not both, to the individual summarizer's representation. Thus, a user does not need to know which representation(s) a particular summarizer uses for reduction/compression.</Paragraph> <Paragraph position="6"> A vanilla summarization Bean includes the following functionality, which every summarizer must provide methods for:
- source: the documents to be summarized (this can be a single document, or a collection)
- reduction-rate: either summary size/source size, or target length
- audience: user-focused or generic (user-focused requires the specification of a bag of terms, which can be of different types)
- output-type: specific data formats (specified by DTDs)
The above are parameters which we expect all summarizers to support (a sketch of the corresponding interface appears below). More specialized summarizer Beans can be constructed to reflect groupings of summarizers. Among other parameters are output-fluency, which specifies whether a textual summary is to be made up of passages (sentences, paragraphs, blocks), named entities, lists of words, phrases, or topics, etc. Given that definitions of summarization in more theoretical terms have not been entirely satisfactory (Mani 2000), it is worth noting that the above vanilla Bean provides an operational definition of what a summarizer is.</Paragraph>
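<Paragraph> A minimal sketch of such a vanilla Bean contract, assuming hypothetical method names (only the four parameters themselves come from the list above):

    import java.io.File;
    import java.util.List;

    interface SummarizerBean {
        // source: a single document or a collection
        void setSource(List<File> documents);

        // reduction-rate: specify either compression or target length, not both;
        // the wrapper maps whichever is given onto the summarizer's own representation
        void setCompression(double ratio);   // summary size / source size
        void setTargetLength(int length);

        // audience: generic, or user-focused with a bag of terms
        void setAudience(boolean userFocused, List<String> terms);

        // output-type: a data format specified by a system-wide DTD
        void setOutputType(String dtdName);

        // run the summarizer, producing a new media object
        File summarize();
    }

</Paragraph>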
<Paragraph position="7"> Composition. In addition to its practical utility in the ability to assimilate, combine and reuse components in different combinations, and to do so within a GUI, this approach is interesting because it allows powerful summarization functions to be created by composing together simpler tools. (Note that this is different from automatically finding the best combination, which our system does not address.) For example, Figure 2 illustrates a complex filter created by using a GUI to compose together a named entity extractor, a date extractor, a component which discovers significant associations between the two and writes the result to a table, and a visualizer which plots the results as a graph. The resulting summarizer takes in a large collection of documents and produces as a summary a graph (a JPEG) of salient named entity mentions over time. Each of its components can be easily reused within the filter composition system to build other filters.</Paragraph> <Paragraph position="8"> As mentioned above, the system can construct a narrative to accompany the briefing. Narrative nodes are generated to cover captions, running text, and segues. The captions and running text, when not provided by the filters, are provided by the script input. In the case of retrieve goals, the objects may not have any meta-information, in which case a default caption and running text is generated. Clearly, a system's explanatory narrative will be enhanced by the availability of rich meta-information.</Paragraph> <Paragraph position="9"> The segues are provided by the system. For example, an item with the label &quot;A biography of bin Laden&quot; could result in a generated segue &quot;Here is a biography of bin Laden&quot;. The Content Creator, when providing content for narrative nodes, uses a variety of different canned text patterns. For the above example, the pattern would be &quot;Here is @6.label&quot;, where 6 is the number of a non-narrative node, with label being its label.</Paragraph> <Paragraph position="10"> All segue nodes are by default generated automatically by the system, based on node labels. We always introduce a segue node at the beginning of the presentation (called a preamble node), which provides a segue covering the &quot;crown&quot; of the tree, i.e., all nodes up to a particular depth d from the root (d=2) are marked with segue nodes. A segue node is also produced at the end (called a coda). (Both preamble and coda can of course be specified by the author if desired.)</Paragraph> <Paragraph position="11"> For introducing intervening segue nodes, we use the following algorithm, based on the distance between nodes and their height in the tree. We traverse the non-narrative leaves of the tree in their temporal order, evaluating each pair of adjacent nodes A and B where A precedes B temporally. A segue is introduced between nodes A and B if either (a) the maximum of the two distances from A and B to their least common ancestor is greater than 3 nodes, or (b) the sum of the two distances from A and B to the least common ancestor is greater than 4 nodes. This is less intrusive than introducing segues at random or between every pair of successive nodes, and appears to perform better than introducing a segue at each depth of the tree.</Paragraph>
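<Paragraph> This test can be sketched as follows (a minimal Java illustration; the Node representation and method names are assumptions for exposition, while the thresholds are those given above):

    import java.util.HashSet;
    import java.util.Set;

    class Node {
        Node parent;  // null at the root

        int distanceTo(Node ancestor) {  // edges from this node up to ancestor
            int d = 0;
            for (Node n = this; n != ancestor; n = n.parent) d++;
            return d;
        }

        static Node leastCommonAncestor(Node a, Node b) {
            Set<Node> ancestors = new HashSet<>();
            for (Node n = a; n != null; n = n.parent) ancestors.add(n);
            for (Node n = b; n != null; n = n.parent) {
                if (ancestors.contains(n)) return n;
            }
            throw new IllegalArgumentException("nodes are in different trees");
        }

        // A segue is introduced between temporally adjacent leaves A and B if
        // either distance to their least common ancestor exceeds 3 nodes, or
        // the sum of the two distances exceeds 4 nodes.
        static boolean needsSegue(Node a, Node b) {
            Node lca = leastCommonAncestor(a, b);
            int da = a.distanceTo(lca), db = b.distanceTo(lca);
            return Math.max(da, db) > 3 || da + db > 4;
        }
    }

</Paragraph>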
</Section> <Section position="6" start_page="104" end_page="105" type="metho"> <SectionTitle> 6 An Example </SectionTitle> <Paragraph position="0"> We currently have a working version of the system with a variety of different single-document and multi-document summarization filters. Figure 3 shows an input script created by an author (the scripts in Figures 3 and 4 are schematic representations of the scripts, rather than the raw XML). The script includes two create goals, one with a single-document generic summarization filter, the other with a multi-document user-focused summarization filter. Figure 4 shows the ground script which was created automatically by the Content Creator component. Note the addition of media type specifications, the introduction of narrative nodes, and the extension of the temporal constraints. The final presentation generated is shown in Figure 5. Here we show screen dumps of the six SMIL segments produced, with the audio, if any, for each segment indicated next to an audio icon.</Paragraph> </Section> <Section position="7" start_page="105" end_page="105" type="metho"> <SectionTitle> 7 Status </SectionTitle> <Paragraph position="0"> The summarization filters have incorporated several summarizers, including some that have been evaluated in the DARPA SUMMAC conference (Mani et al. 99-1). These carry out both single-document and multi-document summarization, and include a preliminary biographical summarizer we have developed. The running text for the biography table in the second-to-last segment of Figure 5 is produced from meta-information in the table XML generated by the biographical summarizer. The production method for running text uses canned text which should work for any input table conforming to that DTD.</Paragraph> <Paragraph position="1"> The summarization filters are being tested as part of a DARPA situated test with end-users. The briefing generator itself has been used internally to generate numerous briefings, and has been demonstrated as part of the DARPA system. We also expect to carry out an evaluation to assess the extent to which the automation described here provides efficiency gains in briefing production.</Paragraph> </Section> </Paper>