<?xml version="1.0" standalone="yes"?>
<Paper uid="W90-0109">
  <Title>Abstract Linguistic Resources for Text Planning</Title>
  <Section position="3" start_page="62" end_page="63" type="intro">
    <SectionTitle>
2. Linguistic Resources
</SectionTitle>
    <Paragraph position="0"> In this section, I address the question of what the linguistic resources are and what abstractions we can and should make over them. I begin by looking at the concrete resources, that is, those that actually appear in a stream of text. I then look at what various complexes of these resources express taken as a group. In Section 2.3, I look more generally at how work in linguistics can help develop a more complete vocabulary of abstractions.</Paragraph>
    <Section position="1" start_page="62" end_page="63" type="sub_section">
      <SectionTitle>
2.1 The concrete resources of language
</SectionTitle>
      <Paragraph position="0"> The concrete linguistic resources are all the syntactic structures, words, and grammatical features available to the speaker of the language. We can divide linguistic resources into two general classes: * The lexical resources: These are what are often called the open class words (the nouns, verbs, and adjectives), and they carry most of the content.</Paragraph>
      <Paragraph position="1"> * The grammatical resources: These include the closed class words, morphological markings, and phrase structure.</Paragraph>
      <Paragraph position="2"> In what follows we ground the notion of concrete resources by looking closely at one fairly simple sentence: Karen likes watching movies.</Paragraph>
      <Paragraph position="3"> This sentence has lexical resources, such as "Karen" and "watch", and morphological resources, such as "-ing", the gerund marker on the verb "watch", which emphasizes the process aspect of the action, and "-s", the plural marker on the noun "movie". The phrase structure is also a concrete resource, which expresses how the constituents group together and certain kinds of relations between them; in this example, the phrase structure tells us that "movies" are what is watched, that "watching movies" is what is liked, and that "Karen" is the one that likes watching movies.</Paragraph>
      <Paragraph position="4"> What is not there also expresses information. The fact that there is no determiner ("a" or "the") with "movies" indicates that it is not a particular set of movies being referred to (as in "the movies") but a general sample of movies. Note that it is not just the lack of the determiner that provides this information, but the features of the whole constituent the particular noun is in and the fact that it is plural: if the noun phrase were singular, then there would have to be a determiner before "movie" (*"Karen likes watching movie"). For other nouns in head position, the lack of a determiner can mean other things. For example, there is also no determiner in the first noun phrase in the sentence ("Karen"); however, in this case, since the head is a proper noun, it does refer to a unique individual. If a determiner is used with a proper noun, it has a more general meaning of "an entity with that name" (as in "All the Karens I had ever met had dark hair and then I met a Karen with red hair").</Paragraph>
      <Paragraph position="5"> We will term this kind of composition, where the same resource means different things in different contexts, "non-linear" composition; this is in contrast to "linear" composition, where each resource contributes an identifiable part of the whole and what it contributes is not context dependent. Identifying which grammatical resources non-linearly co-occur and grouping those sets into single abstract resources is a powerful method of constraining the text planner, restricting its choices to those that are expressible in language, as we shall see in the next section, where we develop abstract resources for the sets of concrete resources that appear in the example.</Paragraph>
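The contrast between linear and non-linear composition can be made concrete with a small sketch. The Python below is an illustration only, not part of the original system: the class NounPhrase, the function interpret, and the returned labels are hypothetical names, and the rules cover just the cases discussed in the paragraphs above (they ignore mass nouns, for instance). The point is that the meaning of the empty determiner slot cannot be read off in isolation; it depends on the head and on the number of the whole constituent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NounPhrase:
    """Hypothetical record of the concrete resources in one noun phrase."""
    head: str
    proper: bool                 # True if the head is a proper noun
    plural: bool
    determiner: Optional[str]    # None means no determiner is present


def interpret(np: NounPhrase) -> str:
    """Read the determiner slot in the context of the whole constituent."""
    if np.determiner is None:
        if np.proper:
            return "unique named individual"      # "Karen"
        if np.plural:
            return "general sample of the kind"   # "movies"
        return "not expressible"                  # *"Karen likes watching movie"
    if np.proper:
        return "an entity with that name"         # "a Karen with red hair"
    if np.determiner == "the" and np.plural:
        return "particular set"                   # "the movies"
    return "outside this sketch"                  # cases the text does not discuss


print(interpret(NounPhrase("Karen", proper=True, plural=False, determiner=None)))
print(interpret(NounPhrase("movie", proper=False, plural=True, determiner=None)))
```

A linear resource would correspond to a lookup table from the resource alone to its contribution; here the same value (no determiner) yields three different results depending on the other features.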
    </Section>
    <Section position="2" start_page="63" end_page="63" type="sub_section">
      <SectionTitle>
2.2 Abstractions over concrete resources
</SectionTitle>
      <Paragraph position="0"> Allowing a generation system to select concrete resources directly, as is done in virtually all other generation systems, makes available many more degrees of freedom than the language actually permits. As we saw in the previous section, some combinations of concrete resources occur in language, while others do not. Furthermore, we saw that the lexical resource in the head of a constituent and the grammatical resources of the constituent as a whole can combine non-linearly, so that the lexical and grammatical choices cannot be made independently of each other.</Paragraph>
      <Paragraph position="1"> In this section, we look at how we can abstract over combinations of concrete resources by treating a particular set as a whole and naming it, rather than treating the resources as a set of independent features that happen to have appeared together. The vocabulary of abstractions we derive then becomes the terms in which the text planner makes its decisions. The planner is then incapable of selecting a set of resources that is not expressible, because it is not allowed to choose them independently.</Paragraph>
      <Paragraph position="2"> For example, the two noun phrase constituents in our example ("Karen" and "movies") express two different perspectives on the entities they refer to. "Karen" is expressed with the perspective NAMED-INDIVIDUAL and "movies" is expressed as a SAMPLE-OF-A-KIND.[2] We can think of these perspectives as semantic categories; "semantic" because they represent something about the meaning of a constituent, not just its form, e.g. "Karen" is referring to a person as a unique individual with a name, in contrast to referring to her as an anonymous individual (e.g. "a woman"). A surface constituent can then be abstractly represented in the Text Structure for the purposes of the text planner as the combination of a lexical item and a semantic category.</Paragraph>
      <Paragraph position="3"> [2] "Sample" is intended to mean "indefinite set"; the choice of names for categories is meant to be evocative of what they mean, while staying away from terms that have special meanings in other theories. Within Text Structure, these terms only need to be consistent. The names themselves do no work, except to help the observer understand the system.</Paragraph>
      <Paragraph position="4"> Figure 1 shows the "abstract" resources for the two noun phrases of our example and the other constituents of our sentence: Karen likes watching movies.[3] The upward arrows begin at the surface constituent being abstracted over and point to the boxes showing abstract resources: the lexical item in italics and the semantic category following it. This tree of boxes is an example of the Text Structure intermediate level of representation. We will return to how to develop a complete set of semantic categories in the next section.</Paragraph>
      <Paragraph position="5"> In addition to abstracting over combinations of concrete resources by only representing the semantic type of a constituent, we can also represent the structural (syntactic) relations between the constituents. In Figure 1 the concrete relations of subject, direct object, etc. are represented abstractly as arguments and marked with a semantic relation.[4] In this example, we have identified three kinds of information that are essential to an abstract representation of the concrete resources language provides: * the constituency, * the semantic category of the constituent, and * the structural relations among the constituents.</Paragraph>
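The three kinds of information listed above can be sketched as a small tree data structure. This is a minimal illustration under stated assumptions, not a reconstruction of Figure 1: the class Node, the function show, and the relation labels (one-who-likes, what-is-liked, what-is-watched) are hypothetical, with the labels paraphrasing the description of the phrase structure in Section 2.1. Semantic categories are filled in only where the text names them; the categories for "like" and "watch" are left unspecified.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """One box of the tree: a lexical item plus a semantic category."""
    lexical_item: str
    category: Optional[str] = None   # None where the text gives no category
    # Each argument is (semantic relation label, child node).
    arguments: List[Tuple[str, "Node"]] = field(default_factory=list)


def show(node: Node, relation: str = "root", depth: int = 0) -> List[str]:
    """Return an indented listing of constituency, category, and relations."""
    label = node.category or "UNSPECIFIED"
    lines = ["  " * depth + f"{relation}: {node.lexical_item} / {label}"]
    for child_relation, child in node.arguments:
        lines.extend(show(child, child_relation, depth + 1))
    return lines


# "Karen likes watching movies", with hypothetical relation labels.
movies = Node("movie", "SAMPLE-OF-A-KIND")
karen = Node("Karen", "NAMED-INDIVIDUAL")
watching = Node("watch", arguments=[("what-is-watched", movies)])
likes = Node("like", arguments=[("one-who-likes", karen), ("what-is-liked", watching)])

print("\n".join(show(likes)))
```

Nesting encodes the constituency, the category field carries the semantic category, and the labeled argument pairs stand in for the structural relations.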
      <Paragraph position="6"> In the next section we look at some of the motivations for these abstractions, and in Section 3, show how they can be used for text planning.</Paragraph>
    </Section>
  </Section>
</Paper>