<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-1414">
  <Title>A Model for Multimodal Reference Resolution</Title>
  <Section position="10" start_page="0" end_page="0" type="concl">
    <SectionTitle>
5 Conclusions
</SectionTitle>
    <Paragraph position="0"> In this paper, a theory of representation and interpretation for multimodal messages and a model for multimodal reference resolution have been presented. First, it was discussed how this problem can be stated in terms of so-called linguistic anaphors with pictorial antecedents, or pictorial anaphors with linguistic antecedents. It was argued that, along more traditional lines, the problem can alternatively be thought of as the resolution of spatial indexical referents. It was also argued that with a representational theory of modality, one in which the notion of modality is captured in terms of a formal language and its interpreter, a third interpretation of the problem of multimodal reference resolution can be given. In this last view, resolving multimodal references can be thought of as inducing a translation function between the basic constants of the modalities involved. The representation and interpretation machinery for carrying out this third view was formally developed along the lines of Montague's semiotic programme and its associated general theory of translation. An algorithm was also illustrated for finding such a translation relation when text and graphics are introduced through independent input channels and the translation between constants must be induced dynamically. Finally, it was suggested that Kamp's DRT be extended with multimodal discourse representation structures (MDRS) in order to model the referential aspects of the kind of multimodal discourse that is likely to occur in interactive multimodal conversation. This extension would make it possible to capture an aspect of spatial deixis that is currently beyond the scope of DRT.
If the notion of multimodal discourse representation structures is developed along the lines suggested in this paper, Kamp's demarcation between anaphoric and deictic uses of pronouns could be formally captured, as the sets of antecedents taken from the world would be incorporated as referents and conditions of an MDRS: while the antecedents for anaphoric pronouns taken from the preceding text are accessible to pronouns, deictic antecedents would be accessible via translation functions.</Paragraph>
  </Section>
</Paper>