<?xml version="1.0" standalone="yes"?> <Paper uid="W02-0206"> <Title>An Experiment to Evaluate the Effectiveness of Cross-Media Cues in</Title> <Section position="3" start_page="0" end_page="0" type="relat"> <SectionTitle> 2 Related Work </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.1 Computational linguistics </SectionTitle> <Paragraph position="0"> Cross-media cues are similar in some respects to discourse cue phrases. First, some functions of cross-media cues can be classified using discourse coherence relations such as Preparation, Restatement, Summary, Evaluation, and Elaboration (Green, 2001).</Paragraph> <Paragraph position="1"> Second, there is not a one-to-one correspondence between form and function.</Paragraph> <Paragraph position="2"> For example, the same CMC can be used to indicate different coherence relations between a span of text and the named figure, e.g., Restatement and Evaluation. On the other hand, a relation of Summary can be indicated, for example, by 'From Fig. 9.5, you can see that' or '(see Figure 4)'. Another similarity is that CMCs are not always provided to mark explicitly the relationship obtaining between text and graphic. Research on discourse cue placement has framed our thinking on asking when and where to generate CMCs (DiEugenio, Moore and Paolucci, 1997).</Paragraph> <Paragraph position="3"> A multimedia presentation may include multimodal referring expressions, references to things in the world made through a combination of text and graphics (McKeown et al., 1992; Andre and Rist, 1994). Such cross-references are similar to cross-media cues in that they direct the user's attention to a related graphic. However, their function is different, namely, to enable the user to perform reference resolution. Another form of cross-reference, discourse deixis is the use of an expression that refers to part of the document containing it, e.g., 'the next chapter' (Paraboni and van Philadelphia, July 2002, pp. 42-45. Association for Computational Linguistics. Proceedings of the Third SIGdial Workshop on Discourse and Dialogue, Deemter, 1999). Although a user's interpretation of a cross-media cue may depend on discourse deixis to determine the graphic in question, the problem of selecting an appropriate description to refer to a graphic (e.g. 'Figure 4' versus 'the Figure below') is not a concern of our work at present.</Paragraph> <Paragraph position="4"> In our previous corpus study of multimedia arguments, we classified text in a document as either argument-bearing or commentarybearing, where the latter is text about a graphic included in the document (Green 2001). The topics of commentary-bearing text include the graphic's role in the argument (e.g. 'From Fig.</Paragraph> <Paragraph position="5"> 9.5, you can see that'), the interpretation of graphical elements in terms of the underlying domain and data, and salient visual features of the graphic. 
Furthermore, we noted that commentary-bearing and argument-bearing text may be interleaved, and that the ratio of commentary sentences to their related CMC may be many to one.</Paragraph> <Paragraph position="5"> Previous work in caption generation is relevant to the question of what kinds of things to say about accompanying graphics (Mittal et al., 1998; Fasciano and Lapalme, 1999).</Paragraph> <Paragraph position="6"> However, neither of those systems faces the problem of integrating commentary-bearing text with text generated to achieve other presentation goals.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.2 Human-Computer Interaction </SectionTitle> <Paragraph position="0"> HCI research has focused on interaction techniques and features of layout that influence effectiveness. The use of contact points, control buttons embedded in the text of a web page that enable readers to control related animations (Faraday and Sutcliffe, 1999), is an interaction technique that, like CMCs, explicitly marks the relationship between information presented in two media. That paper provides experimental evidence that contact points improve comprehension of integrated text and animation.</Paragraph> <Paragraph position="1"> According to Moreno and Mayer's (2000) Spatial Contiguity Principle, learning in multimedia presentations is improved when related text and graphics are spatially contiguous rather than separated. However, this does not imply that a generator can rely on layout alone instead of providing CMCs, for the following reasons. First, a generator may have responsibility for producing text but not have control over layout, e.g. when a document is displayed by a web browser.</Paragraph> <Paragraph position="2"> Second, a graphic may be relevant to multiple non-contiguous spans of text in a document.</Paragraph> </Section> </Section> </Paper>