<?xml version="1.0" standalone="yes"?>
<Paper uid="W98-0210">
  <Title>A Media-Independent Content Language for Integrated Text and Graphics Generation</Title>
  <Section position="3" start_page="70" end_page="72" type="metho">
    <SectionTitle>
2 Content Language
</SectionTitle>
    <Paragraph position="0"> In order to ensure that the language would be applicable to a variety of quantitative domains, we first performed a corpus analysis, the results of which are summarized in the next section.</Paragraph>
    <Paragraph position="1"> Then we describe the syntax we adopted to satisfy the requirements given in the introduction.</Paragraph>
    <Section position="1" start_page="70" end_page="70" type="sub_section">
      <SectionTitle>
2.1 Corpus Analysis
</SectionTitle>
      <Paragraph position="0"> We have collected samples of presentations with integrated natural language and graphics in order to describe and analyze the vocabulary and structure of such presentations. To ensure generality, the corpus includes presentations from different disciplines (Economics and Medicine) and intended for different audiences. It also includes samples from collections of presentations compiled by others, such as [Tufte1983, Tufte1990, Tufte1997, Kosslyn1994], and prescriptive examples found in books on how to design effective presentations [Zelazny1996, Kosslyn1994].</Paragraph>
      <Paragraph position="1"> The analysis of this corpus contributed directly to the development of a vocabulary for the content language. To describe the content of the presentations in the corpus, we distinguish three different sets of predicates with associated modifiers, as follows:</Paragraph>
      <Paragraph position="3"> Times. Comparison Predicates apply to any quantitative attribute of individuals or sets, e.g., "On this measure Central Europe's stockmarkets are still puny compared with those of fast-growing Asian countries."</Paragraph>
      <Paragraph position="4"> * Global Predicates: [widely|slightly] Vary [:from :to], Constant.</Paragraph>
      <Paragraph position="5"> Global Predicates apply to quantitative attributes of sets, e.g., "Sales representative performance is uneven."</Paragraph>
    </Section>
    <Section position="2" start_page="70" end_page="72" type="sub_section">
      <SectionTitle>
2.2 Syntax
</SectionTitle>
      <Paragraph position="0"> The first three requirements described in the Introduction (representing quantitative and temporal relations and aggregate properties, compositionality, and representing certain pragmatic distinctions) led us to make use of a first-order logic with restricted quantification (RQFOL), which has been used for representing the meaning of natural language queries involving complex referring expressions [Woods1983, Webber1983]. The features of RQFOL most useful for our purposes are (i) that it permits pragmatic distinctions to be made among expressions which are semantically equivalent, and (ii) that it supports the compositional specification of complex descriptions of discourse entities [Webber1983].</Paragraph>
      <Paragraph position="1"> A pragmatic distinction supported in RQFOL and our content language is the distinction between the main predication of an expression and information to be conveyed about the objects of the main predication. For example, although (1a) and (1b) are semantically equivalent to (1c), they are not interchangeable in their effectiveness for achieving different communicative intentions (as was demonstrated in the Introduction). In (1a) the main predication is about news coverage, whereas in (1b) it is about newspaper circulation.</Paragraph>
      <Paragraph position="2"> (1a) Three newspapers that are circulated in Pittsburgh carry only national news.</Paragraph>
      <Paragraph position="3"> (1b) Three newspapers that carry only national news are circulated in Pittsburgh.</Paragraph>
      <Paragraph position="4"> (1c) There is a set of three newspapers such that for every newspaper in the set, it is circulated in Pittsburgh and carries only national news.</Paragraph>
      <Paragraph position="5"> To represent this distinction in the content language, a communicative act has the form (Act Proposition Referents), where Act specifies the type of action (such as Assert), Proposition is a quantifier-free FOL formula describing the main predication, and Referents is a list describing the arguments of the main predication. (It is assumed that the agent performing a communicative action is the system, and that the audience is the user.) For example, (1a) and (1b) can be analyzed as realizing the assertions (2a) and (2b), respectively. In (2a), the main predication is (has-coverage ?d1 National-only); the variable ?d1 is further described as three newspapers that are circulated in Pittsburgh. In (2b), the main predication is (has-circulation ?d1 Pittsburgh); the variable ?d1 is further described as three newspapers whose coverage is national news only.</Paragraph>
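      <Paragraph> As an illustrative sketch only (not the authors' implementation), the (Act Proposition Referents) form can be modeled as a small Python data structure. The class names Referent and CommunicativeAct, the helper functions, and the printed surface syntax are assumptions introduced here; the two acts are assembled from the prose reading of (2a) and (2b), with each description laid out as (for quantifier variable class restriction), the form this section gives for entries of the Referents list.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

# An s-expression is an atom (string or number) or a tuple of s-expressions.
SExpr = Union[str, int, Tuple["SExpr", ...]]


@dataclass(frozen=True)
class Referent:
    term: str                            # a variable such as ?d1, or a database object id
    description: Optional[SExpr] = None  # attributive description, if provided


@dataclass(frozen=True)
class CommunicativeAct:
    act: str                             # type of action, e.g. "Assert"
    proposition: SExpr                   # quantifier-free main predication
    referents: Tuple[Referent, ...]      # arguments of the main predication


def render(x) -> str:
    """Print a nested tuple as a parenthesized s-expression."""
    if isinstance(x, tuple):
        return "(" + " ".join(render(e) for e in x) + ")"
    return str(x)


def render_act(a: CommunicativeAct) -> str:
    """Render an act as (Act Proposition Referents)."""
    refs = tuple(
        (r.term,) if r.description is None else (r.term, r.description)
        for r in a.referents
    )
    return render((a.act, a.proposition, refs))


# (2a): main predication about coverage; ?d1 described by circulation.
act_2a = CommunicativeAct(
    "Assert",
    ("has-coverage", "?d1", "National-only"),
    (Referent("?d1", ("for", 3, "?x", "newspaper",
                      ("has-circulation", "?x", "Pittsburgh"))),),
)

# (2b): main predication about circulation; ?d1 described by coverage.
act_2b = CommunicativeAct(
    "Assert",
    ("has-circulation", "?d1", "Pittsburgh"),
    (Referent("?d1", ("for", 3, "?x", "newspaper",
                      ("has-coverage", "?x", "National-only"))),),
)

print(render_act(act_2a))
print(render_act(act_2b))
```

      Both acts mention the same two predicates over the same three newspapers; they differ only in which predicate occupies the proposition slot and which is pushed into the description of ?d1, and that slot assignment is where the pragmatic distinction is recorded.</Paragraph>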
      <Paragraph position="6"> In general, each element of the Referents list has the form (term description), where term is a variable or a database object identifier, and term denotes a discourse entity. If provided, description specifies information about term that is required to achieve the goal(s) of the communicative act, as opposed to information whose only function is to enable the audience to identify the entity. Only descriptions with an attributive function are specified in the presentation plan. Referential descriptions, whose function is only to enable the audience to identify an entity, are constructed by the media-specific generators. (For information about the different roles of attributive and referential descriptions in our system, see [Green et al. 1998].) In general, description is of the form (for quantifier variable class restriction). (In (2a) and (2b), quantifier is the cardinal 3, the class is newspaper, and the restriction is (has-circulation ?x Pittsburgh) and (has-coverage ?x National-only), respectively.) Complex descriptions can easily be expressed in a compositional manner in the content language. For example, (3a) is a possible realization in text of the assertion given in (3b). (A graphic realizing (3b) is shown in (3c) of Figure 1.) In (3b), the main predication, (gt ?d1 ?d2), is that ?d1 is greater than ?d2. ?d1 is to be described as the unique integer ?x such that [...] that ?x is the total of ?d3; ?d3 is described as the unique set of integers ?y such that ?y is the number of readers of ?d4; and ?d4 is described as the elements of the set ($WSJ, $NYT, and $USA) (whose elements are database objects denoting the Wall Street Journal, the New York Times, and USA Today, respectively).</Paragraph>
      <Paragraph position="8"> (3a) The number of readers of the Post-Gazette is greater than the number of Pittsburgh readers of the New York Times, the Wall Street Journal, and USA Today combined.</Paragraph>
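      <Paragraph> The compositional reading of (3b) can be traced with a toy evaluation that resolves the nested descriptions from the inside out. This is a sketch under stated assumptions: the reader counts are invented (the paper reports no actual numbers), the identifier $PPG for the Post-Gazette and the function number_of_readers are hypothetical, and ?d1 is assumed to denote the number of Post-Gazette readers, as the text realization (3a) suggests.

```python
# Hypothetical Pittsburgh reader counts; invented for illustration only.
READERS = {"$PPG": 400, "$WSJ": 100, "$NYT": 100, "$USA": 100}


def number_of_readers(paper: str) -> int:
    """Look up the (invented) reader count of a database object."""
    return READERS[paper]


# ?d4: the elements of the set ($WSJ, $NYT, $USA)
d4 = ["$WSJ", "$NYT", "$USA"]

# ?d3: the integers ?y such that ?y is the number of readers of ?d4.
# A list is used rather than a Python set so that equal counts are not merged.
d3 = [number_of_readers(p) for p in d4]

# ?d2: the unique integer ?x such that ?x is the total of ?d3
d2 = sum(d3)

# ?d1: assumed here to be the number of readers of the Post-Gazette
d1 = number_of_readers("$PPG")

# Main predication: (gt ?d1 ?d2)
holds = d1 > d2

print(d1, d2, holds)  # 400 300 True
```

      Each variable is computed only from variables described more deeply in the nesting, which mirrors how each description in the Referents list refers to the next.</Paragraph>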
    </Section>
  </Section>
  <Section position="4" start_page="72" end_page="73" type="metho">
    <SectionTitle>
3 Examples
</SectionTitle>
    <Paragraph position="0"> In this section we illustrate how different communicative intentions about the same data can be represented in the content language, and how these intentions can be expressed in text and information graphics. One goal of this exercise is to illustrate what distinctions can be expressed graphically, not what information should be expressed in graphics. (The problem of deciding which media to use, media allocation, is beyond the scope of this paper.) Thus, the examples of graphics are minimal in the sense that they have been designed to convey the information to be asserted and as little other information as possible. However, in some cases it is impossible to avoid conveying more in graphics than was intended.</Paragraph>
    <Paragraph position="1"> For example, in (3c) in Figure 1, which realizes (3b), the graphic also conveys information about the relative numbers of readers of each of the newspapers, e.g., that the Post-Gazette has about one-third more than the sum of the others, and that the others have about the same number of readers each. Note that although it is not the communicative intention in (3b) to convey the particular numbers of readers of each newspaper (hence the x-axis does not show actual numbers), information about the actual numbers of readers of each newspaper is needed during graphics generation to design (3c). (If the presentation's intention were to convey the particular numbers of readers of the newspapers, then different assertions specifying the actual numbers would be planned.) Whereas in (3b) four newspapers are individuated, it is possible to make an assertion such as (4b) in which the members of the set ($NAT) of newspapers with only national coverage are not individuated. The assertion in (4b) could be expressed in text as (4a), or in graphics as in (4c) in Figure 1. However, this graphic still expresses more than (4b), e.g., that the number of PPG readers is about one-third more than the number of NAT readers (even though the x-axis does not show the actual numbers of readers).</Paragraph>
    <Paragraph position="2"> (4a) The number of readers of the Post-Gazette is greater than the total number of readers of the newspapers read in Pittsburgh with national coverage only.</Paragraph>
    <Paragraph position="3"> In contrast to (3b), (5b) differentiates the members of NAT, but does not identify or otherwise describe them. (5b) could be expressed in text as (5a), and in graphics as in (5c) in Figure 1. Once again, the graphic has side-effects. In this case, it conveys additional information about the relative numbers of readers among the newspapers with national coverage only, and the fact that there are three of those newspapers. Comparing (5c) to (3c), in (3c) the total number of readers of the three other newspapers is expressed by concatenating segments of bars representing the three newspapers into a single bar whose length represents the total number of readers of the three newspapers. Although this information can be computed from (5c), it is not directly realized in the graphic.</Paragraph>
    <Paragraph position="4"> (5a) The number of readers of the Post-Gazette is greater than the number of readers in Pittsburgh of any newspaper with national coverage only.</Paragraph>
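    <Paragraph> The difference between comparing against a total, as in (3a) and (4a), and comparing against each member of NAT, as (5a) is read here, can be checked with a short sketch. The function names gt_total and gt_each and the reader counts are invented for illustration, and treating (5a) as a claim about every member of NAT is an interpretive assumption.

```python
from typing import Sequence


def gt_total(target: int, others: Sequence[int]) -> bool:
    """Total reading: target exceeds the combined count of the set."""
    return target > sum(others)


def gt_each(target: int, others: Sequence[int]) -> bool:
    """Member-by-member reading: target exceeds every member's count."""
    return all(target > n for n in others)


# Invented counts for the Post-Gazette and the three members of NAT.
ppg = 250
nat = [100, 90, 80]

print(gt_each(ppg, nat))   # True: 250 exceeds each of 100, 90, 80
print(gt_total(ppg, nat))  # False: 250 does not exceed 270
```

    For non-negative counts the total reading entails the member-by-member reading but not conversely, which is one way to see why the two intentions call for different assertions and different graphics.</Paragraph>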
    <Paragraph position="5"> In contrast to the preceding examples, (6b) illustrates a communicative intention (about the same data as in the other examples) with a different main predication. In text, (6b) could be expressed as in (6a); the main predication is about the coverage of the Post-Gazette rather than about the number of readers. This difference in main predication results in a graphic such as (6c) in Figure 1 with a different structure from those of the preceding examples. (6a) Only 1 of the newspapers read in Pittsburgh, the Post-Gazette, has both national and local coverage.</Paragraph>
  </Section>
</Paper>