<?xml version="1.0" standalone="yes"?>
<Paper uid="W03-0901">
  <Title>A Knowledge-Driven Approach to Text Meaning Processing</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
2 The Knowledge Base
</SectionTitle>
    <Paragraph position="0"> We have recently been working with text describing various kinds of &quot;launch&quot; events (launching satellites, products, Web sites, ships, etc.). We describe our ongoing implementation of the above approach in the context of these texts.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 Architecture
</SectionTitle>
      <Paragraph position="0"> We envisage that, ultimately, the knowledge base (KB) will comprise a small number of abstract, core representations (e.g., movement, transportation, conversion, production, containment), along with a large number of detailed scenario representations. We anticipate that the former will have to be built by hand, while the latter can be acquired semi-automatically using a combination of text analysis and human filtering/assembling of fragments resulting from that analysis. At present, however, we are building both the core and detailed representations by hand, as a first step towards this goal.</Paragraph>
      <Paragraph position="1"> Each scenario representation contains a set of axioms describing the objects involved in the scenario, the events and subevents involved, and their relationships to each other. Before describing these in more detail, however, we first describe the KB's ontology (conceptual vocabulary).</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2 The Ontology: Concepts
</SectionTitle>
      <Paragraph position="0"> We are using WordNet (Miller et al., 1993) as the starting point for the KB's ontology. Although WordNet has limitations, it provides both an extensive taxonomy of concepts (synsets) and a rich mapping from those concepts to words/phrases that may be used to refer to them in text. This provides useful knowledge both for identifying coreferences between different representations that are known to relate (e.g., between a representation of &quot;launching&quot; and a representation of &quot;moving&quot;, where launching is defined as a type of moving), and also for matching scenario representations with text fragments when interpreting new text (Section 3.2). The use of WordNet may also make semi-automated construction of the scenario representations themselves easier, if the raw material for these representations is derived from text corpora. We are also adding new concepts where needed, in particular concepts that we wish to reify which are described by phrases rather than a single word (thus not in WordNet), e.g., &quot;launch a satellite&quot;, and correcting apparent errors or omissions that we find.</Paragraph>
      <Paragraph position="1"> As a naming convention, rather than identify a synset by its number, we name it by concatenating the synset word most commonly used to refer to it (as specified by WordNet's tag statistics), its part of speech, and the WordNet sense number of that word that corresponds to that synset.</Paragraph>
      <Paragraph position="2"> For example, bank n1 is our friendly name for synset 106948080 (bank, the financial institution), as &quot;bank&quot; is the synset word most commonly used to refer to this synset, this synset is labeled with a noun part of speech, and &quot;bank&quot; sense 1 is synset 106948080. This renaming is a simple one-to-one mapping, and is purely cosmetic.</Paragraph>
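      <Paragraph> The naming convention above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Synset record, the tag counts, and the extra synset words are invented toy data, not values read from WordNet, and the underscore separator is our own choice (the name is printed above as bank n1).

```python
from dataclasses import dataclass


@dataclass
class Synset:
    """Toy stand-in for a WordNet synset; every field here is illustrative."""
    number: str          # synset identifier, e.g. "106948080"
    pos: str             # part-of-speech letter, e.g. "n" or "v"
    tag_counts: dict     # word -> how often that word is tagged with this synset
    sense_numbers: dict  # word -> sense number of that word for this synset


def friendly_name(synset, sep="_"):
    """Concatenate the most frequently used word, the POS, and the sense number."""
    word = max(synset.tag_counts, key=synset.tag_counts.get)
    return f"{word}{sep}{synset.pos}{synset.sense_numbers[word]}"


def name_table(synsets):
    """Map friendly names to synset numbers, failing if a name is not unique."""
    table = {}
    for synset in synsets:
        name = friendly_name(synset)
        if name in table:
            raise ValueError(f"friendly name {name!r} is not unique")
        table[name] = synset.number
    return table


# Invented counts: only the synset number and the word "bank" come from the text.
bank = Synset(
    number="106948080",
    pos="n",
    tag_counts={"bank": 20, "banking company": 1},
    sense_numbers={"bank": 1, "banking company": 1},
)

print(friendly_name(bank))
print(name_table([bank]))
```

The uniqueness check in name_table reflects the claim that the renaming is one-to-one; a real implementation would need a tie-breaking rule when two words have equal tag counts, which this sketch leaves to the iteration order of the dictionary.</Paragraph>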
      <Paragraph position="3"> In WordNet, verbs and their nominalizations are always treated as (members of) separate concepts, although from an ontological standpoint, these often (we believe) refer to the same entity (of type event). Martin has made a similar observation (Martin, 2003). An example is a running event, which may be referred to in both &quot;I ran&quot; and &quot;the run&quot;. To remove this apparent duplication, we use just the verb-based concept (synset) for these cases.</Paragraph>
      <Paragraph position="4"> Note that this phenomenon does not hold for all verbs; for some verbs, the nominalization may refer to the instrument (e.g., &quot;hammer&quot;) used in the event, the object (e.g., &quot;drink&quot;), the result (e.g., &quot;plot&quot;), etc.</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.3 The Ontology: Relations
</SectionTitle>
      <Paragraph position="0"> For constructing scenario representations, we distinguish between active (action-like) verbs and stative (state-like) verbs (e.g., &quot;enter&quot; vs. &quot;contain&quot;), the former being reified as individuals in their own right (Davidsonian style) with semantic roles attached, while the latter are treated as relations. (This distinction is not so clean at the boundaries: whether something is an event or state is partly subjective, depending on the viewpoint adopted, e.g., the level of temporal granularity chosen. For example, &quot;flight&quot; can be considered an event or a state, depending on the time-scale of interest.)</Paragraph>
      <Paragraph position="1"> For events, we relate the (reified) events to the objects which participate in those events (the &quot;participants&quot;) via semantic role-like relations (agent, instrument, employer, vehicle, etc.). We are following a fairly liberal approach to this: rather than confining ourselves to a small, fixed set of primitive relations, we are simply finding the WordNet concept that best describes the relationship. This is partly in anticipation of the representations eventually being built semi-automatically from text, when a similarly diverse set of relations will be present (based on whatever relation the text author happened to use). In addition, it simply seems to be the case (we believe) that the set of possible relationships is large, making it hard to work with a small, fixed set without either overloading or excessively generalizing the meaning of relationships in that set.</Paragraph>
      <Paragraph position="2"> This eases the challenge that working with a constrained set of semantic roles poses, but at the expense of more work being required (by the reasoning engine) to determine coreference among representations. For example, if we use &quot;giver&quot; and &quot;donor&quot; (rather than &quot;agent&quot; and &quot;agent&quot;, say) as roles in &quot;give&quot; and &quot;donate&quot; representations respectively, and &quot;donate&quot; is a kind of &quot;give&quot;, it is then up to the inference engine to recognize that these probably refer to the same entity, which in turn requires additional world knowledge.</Paragraph>
      <Paragraph position="3"> We are currently using WordNet to provide this world knowledge. For example, in this case WordNet states that &quot;donor&quot; and &quot;giver&quot; are synonyms (in one synset), and hence the coreference can be recognized by the reasoning engine. In other cases one role concept may be a sub/supertype of the other.</Paragraph>
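      <Paragraph> The role-matching test described above (same synset, or one role concept a sub/supertype of the other) can be sketched as follows. This is a hedged illustration, not the reasoning engine itself: the synset identifiers and the rocket/vehicle hypernym link are invented toy data, and only the fact that &quot;donor&quot; and &quot;giver&quot; share a synset is taken from the text.

```python
# word -> synset identifier (hypothetical identifiers)
SYNSET_OF = {
    "giver": "s_giver",
    "donor": "s_giver",      # same synset as "giver", as stated in the text
    "vehicle": "s_vehicle",
    "rocket": "s_rocket",
    "location": "s_location",
}

# synset -> direct supertype synsets (invented for illustration)
HYPERNYMS = {"s_rocket": ["s_vehicle"]}


def ancestors(synset):
    """Return every supertype reachable from the given synset."""
    seen = set()
    stack = [synset]
    while stack:
        for parent in HYPERNYMS.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def roles_may_corefer(role_a, role_b):
    """True if the roles are synonyms or one is a sub/supertype of the other."""
    a, b = SYNSET_OF[role_a], SYNSET_OF[role_b]
    return a == b or a in ancestors(b) or b in ancestors(a)


print(roles_may_corefer("giver", "donor"))
print(roles_may_corefer("donor", "location"))
```

A positive answer only licenses the hypothesis that two role fillers are the same entity; as the text says, the match is probable rather than certain.</Paragraph>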
      <Paragraph position="4"> This decision also means that we are using some WordNet concepts both as classes (types) and as relations, thus, strictly speaking, overloading these concepts. We are currently considering extending the naming convention to distinguish these two uses.</Paragraph>
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.4 Scenario Representations
</SectionTitle>
      <Paragraph position="0"> The scenario representations themselves are constructed (currently by hand) by identifying the key &quot;participants&quot; (both objects and events) in the scenario, and then creating a graph of relationships that normally exist between those participants. In our example of &quot;launching&quot; scenarios, each type of launching (launching a satellite, launching a product, etc.) is represented as a different scenario in the knowledge base. These representations are encoded in the language KM (Clark and Porter, 1999), a frame-based knowledge representation language with well-defined first-order logic semantics, similar in style to KRL. For example, a (simplified) KM representation of &quot;launching a satellite&quot; is shown in Figure 1, and sketched in Figure 2. In the graphical depiction, the dark node denotes a universally quantified object, other nodes denote implied, existentially quantified objects, and arcs denote binary relations. The semantics of this structure are that, for every &quot;launching a satellite&quot; event, there exists a rocket, a launch site, a countdown event, etc., and the rocket is the vehicle of the launching, the launch site is the location of the launching, etc. The KB currently contains approximately 25 scenario representations similar to this.</Paragraph>
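      <Paragraph> The quantification pattern just described (one universally quantified event, with an existentially quantified participant implied by each arc) can be illustrated with a small Python sketch. It is not KM and does not reproduce Figure 1: the dictionary layout, the class names, the &quot;subevent&quot; relation for the countdown, and the instantiate helper are all assumptions made for illustration.

```python
import itertools

# One scenario: the root class is universally quantified; each arc implies
# that some individual of the filler class exists and stands in that relation.
SCENARIO = {
    "root": "Launching-a-Satellite",
    "arcs": [
        ("vehicle", "Rocket"),
        ("location", "Launch-Site"),
        ("subevent", "Countdown"),  # relation name is an assumption
    ],
}


def instantiate(scenario, event_id):
    """Given one event instance, create one fresh individual per arc."""
    counter = itertools.count(1)
    facts = [(event_id, "instance-of", scenario["root"])]
    for relation, filler_class in scenario["arcs"]:
        individual = f"_{filler_class}{next(counter)}"
        facts.append((individual, "instance-of", filler_class))
        facts.append((event_id, relation, individual))
    return facts


for fact in instantiate(SCENARIO, "launch1"):
    print(fact)
```

Each call plays the role of applying the axiom to one particular launching event: the fresh individuals stand for the rocket, launch site, and countdown whose existence the representation asserts.</Paragraph>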
      <Paragraph position="1"> These graphical representations are compositional in two important ways: First, through inheritance, a representation can be combined with representations of its generalizations (e.g., representations of &quot;launching a satellite&quot; and &quot;placing something in position&quot; can be combined). Second, different viewpoints/aspects of a concept such as launching a satellite are encoded as separate representational structures (e.g., the sequence of events; the temporal information; the spatial information; goal-oriented information). During text interpretation, only those representation(s) of aspects/views that the text itself refers to will be composed into the structure matched with the text.</Paragraph>
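      <Paragraph> Both kinds of composition can be sketched together: walk up the generalization chain, and at each level collect only the aspects that the text refers to. The sketch below is a simplified illustration with a single-parent hierarchy; the aspect names, the arcs, and the compose helper are hypothetical and do not reflect how KM performs inheritance.

```python
# concept -> its generalization (single parent, for simplicity)
ISA = {"Launching-a-Satellite": "Placing-Something-in-Position"}

# (concept, aspect) -> arcs of that separate representational structure
ASPECTS = {
    ("Launching-a-Satellite", "participants"): [("vehicle", "Rocket")],
    ("Launching-a-Satellite", "event-sequence"): [("subevent", "Countdown")],
    ("Placing-Something-in-Position", "participants"): [("destination", "Position")],
}


def compose(concept, wanted_aspects):
    """Collect the arcs of the wanted aspects from a concept and its generalizations."""
    arcs = []
    while concept is not None:
        for aspect in wanted_aspects:
            arcs.extend(ASPECTS.get((concept, aspect), []))
        concept = ISA.get(concept)
    return arcs


print(compose("Launching-a-Satellite", ["participants"]))
```

Requesting only the &quot;participants&quot; aspect leaves the event-sequence structure out of the composed result, mirroring the idea that unreferenced views are not brought into the match.</Paragraph>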
    </Section>
  </Section>
</Paper>