<?xml version="1.0" standalone="yes"?>
<Paper uid="P98-1079">
  <Title>A Text Understander that Learns</Title>
  <Section position="4" start_page="476" end_page="480" type="metho">
    <SectionTitle>
2 Methodological Framework
</SectionTitle>
    <Paragraph position="0"> In this section, we present the major methodological decisions underlying our approach.</Paragraph>
    <Section position="1" start_page="476" end_page="477" type="sub_section">
      <SectionTitle>
2.1 Terminological Logics
</SectionTitle>
      <Paragraph position="0"> We use a standard terminological, KL-ONE-style concept description language, here referred to as $\mathcal{CDL}$ (for a survey of this paradigm, cf.</Paragraph>
      <Paragraph position="1"> Woods and Schmolze (1992)). It has several constructors combining atomic concepts, roles and individuals to define the terminological theory of a domain. Concepts are unary predicates, roles are binary predicates over a domain $\Delta$, with individuals being the elements of $\Delta$. We assume a common set-theoretical semantics for $\mathcal{CDL}$: an interpretation $\mathcal{I}$ is a function that assigns to each concept symbol (the set $\mathbf{A}$) a subset of the domain $\Delta$, $\mathcal{I} : \mathbf{A} \to 2^{\Delta}$, to each role symbol (the set $\mathbf{P}$) a binary relation of $\Delta$, $\mathcal{I} : \mathbf{P} \to 2^{\Delta \times \Delta}$, and to each individual symbol</Paragraph>
      <Paragraph position="3"> (the set $\mathbf{I}$) an element of $\Delta$, $\mathcal{I} : \mathbf{I} \to \Delta$.</Paragraph>
      <Paragraph position="4"> Concept terms and role terms are defined inductively. Table 1 contains some constructors and their semantics, where C and D denote concept terms, while R and S denote roles. $R^{\mathcal{I}}(d)$ represents the set of role fillers of the individual d, i.e., the set of individuals e with $(d, e) \in R^{\mathcal{I}}$.</Paragraph>
      <Paragraph position="5"> By means of terminological axioms (for a subset, see Table 2) a symbolic name can be introduced for each concept, to which necessary and sufficient constraints are assigned using the definitional operator '$\doteq$'. A finite set of such axioms is called the terminology or TBox. Concepts and roles are associated with concrete individuals by assertional axioms (see Table 2; a, b denote individuals). A finite set of such axioms is called the world description or ABox. An interpretation $\mathcal{I}$ is a model of an ABox with regard to a TBox, iff $\mathcal{I}$ satisfies the assertional and terminological axioms.</Paragraph>
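      <Paragraph> The set-theoretical semantics above can be made concrete with a small sketch. The following Python fragment is only an illustration under stated assumptions: the tuple encoding of axioms, the function names role_fillers and satisfies, and the toy extensions are hypothetical and not part of the formalism; the individuals are borrowed from the running example.

```python
# Hypothetical toy interpretation: concept symbols map to subsets of the
# domain, role symbols map to sets of ordered pairs over the domain.
domain = {"switch.1", "Itoh-Ci-8"}
concepts = {"SWITCH": {"switch.1"}}
roles = {"HAS-SWITCH": {("Itoh-Ci-8", "switch.1")}}


def role_fillers(role, d):
    """Return the set of individuals e with (d, e) in the role's extension."""
    return {e for (x, e) in roles[role] if x == d}


def satisfies(abox):
    """Check every assertional axiom against the interpretation above.

    Assumed encoding: ("isa", a, C) stands for 'a : C' and
    ("rel", a, R, b) stands for 'a R b'.
    """
    for axiom in abox:
        if axiom[0] == "isa":
            _, a, c = axiom
            if a not in concepts.get(c, set()):
                return False
        else:
            _, a, r, b = axiom
            if (a, b) not in roles.get(r, set()):
                return False
    return True


abox = [
    ("isa", "switch.1", "SWITCH"),
    ("rel", "Itoh-Ci-8", "HAS-SWITCH", "switch.1"),
]
```

      In this sketch, satisfies covers assertional axioms only; terminological axioms are not checked.</Paragraph>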
      <Paragraph position="6"> Considering, e.g., a phrase such as 'The switch of the Itoh-Ci-8 ..', a straightforward translation into corresponding terminological concept descriptions is illustrated by:</Paragraph>
      <Paragraph position="8"> Assertion P1 indicates that the instance switch.1 belongs to the concept class SWITCH.</Paragraph>
      <Paragraph position="9"> P2 relates Itoh-Ci-8 and switch.1 via the relation HAS-SWITCH. The relation HAS-SWITCH is defined, finally, as the set of all HAS-PART relations which have their domain restricted to the disjunction of the concepts OUTPUTDEV, INPUTDEV, STORAGEDEV or COMPUTER and their range restricted to SWITCH.</Paragraph>
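      <Paragraph> The definition of HAS-SWITCH as a domain- and range-restricted HAS-PART relation can be illustrated extensionally. The sketch below assumes relations are plain sets of pairs; the function name restrict and all individuals other than the concept names from the text are hypothetical.

```python
def restrict(role_ext, domain_ext, range_ext):
    """Keep the pairs whose first element lies in domain_ext and whose
    second element lies in range_ext."""
    return {(d, e) for (d, e) in role_ext if d in domain_ext and e in range_ext}


# Hypothetical concept extensions.
output_dev = {"printer.1"}
input_dev = set()
storage_dev = set()
computer = {"pc.1"}
switch = {"switch.1", "switch.2"}

# Hypothetical HAS-PART extension; desk.1 is neither a device nor a computer.
has_part = {
    ("printer.1", "switch.1"),
    ("pc.1", "switch.2"),
    ("pc.1", "cpu.1"),
    ("desk.1", "switch.1"),
}

# The domain restriction is the disjunction (set union) of the four concepts.
admissible_domain = output_dev | input_dev | storage_dev | computer
has_switch = restrict(has_part, admissible_domain, switch)
```

      Only the pairs whose first element is a device or computer and whose second element is a switch survive the restriction.</Paragraph>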
      <Paragraph position="10"> In order to represent and reason about concept hypotheses we have to properly extend the formalism of $\mathcal{CDL}$. Terminological hypotheses, in our framework, are characterized by the following properties: for all stipulated hypotheses (1) the same domain $\Delta$ holds, (2) the same concept definitions are used, and (3) only different assertional axioms can be established. These conditions are sufficient, because each hypothesis is based on a unique discourse entity (cf. (1)), which can be directly mapped to associated instances (so concept definitions are stable (2)). Only relations (including the ISA-relation) among the instances may be different (3).</Paragraph>
      <Paragraph position="11"> Given these constraints, we may annotate each assertional axiom of the form 'a : C' and 'a R b' by a corresponding hypothesis label h so that $(a : C)_h$ and $(a\ R\ b)_h$ are valid terminological expressions. The extended terminological language (cf. Table 3) will be called $\mathcal{CDL}^{hypo}$. Its semantics is given by a special interpretation function $\mathcal{I}_h$ for each hypothesis h, which is applied to each concept and role symbol in the canonical way: $\mathcal{I}_h : \mathbf{A} \to 2^{\Delta}$; $\mathcal{I}_h : \mathbf{P} \to 2^{\Delta \times \Delta}$.</Paragraph>
      <Paragraph position="12"> Notice that the instances a, b are interpreted by the interpretation function $\mathcal{I}$, because there exists only one domain $\Delta$. Only the interpretation of the concept symbol C and the role symbol R may be different in each hypothesis h.</Paragraph>
      <Paragraph position="13"> Assume that we want to represent two of the four concept hypotheses that can be derived from (P3), viz. Itoh-Ci-8 considered as a storage device or an output device. The corresponding ABox expressions are then given by:</Paragraph>
      <Paragraph position="15"> The semantics associated with this ABox fragment has the following form:</Paragraph>
      <Paragraph position="17"/>
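      <Paragraph> The idea of hypothesis-indexed assertions over a single shared domain can be sketched as follows. This is a minimal illustration, not the system's data model: the class HypothesisSpaces, its method names, and the labels h1 and h2 are assumptions made for this example.

```python
from collections import defaultdict


class HypothesisSpaces:
    """Assertions 'a : C' and 'a R b', each indexed by a hypothesis label h.

    Individuals are shared across all spaces; only the extensions of
    concept and role symbols differ from one hypothesis to the next.
    """

    def __init__(self):
        self.concepts = defaultdict(lambda: defaultdict(set))  # h -> C -> {a}
        self.roles = defaultdict(lambda: defaultdict(set))     # h -> R -> {(a, b)}

    def tell_isa(self, h, a, concept):
        self.concepts[h][concept].add(a)

    def tell_rel(self, h, a, role, b):
        self.roles[h][role].add((a, b))

    def holds_isa(self, h, a, concept):
        return a in self.concepts[h][concept]

    def holds_rel(self, h, a, role, b):
        return (a, b) in self.roles[h][role]


spaces = HypothesisSpaces()
# Two of the four alternative readings of Itoh-Ci-8, one per hypothesis space.
spaces.tell_isa("h1", "Itoh-Ci-8", "STORAGEDEV")
spaces.tell_isa("h2", "Itoh-Ci-8", "OUTPUTDEV")
# The HAS-SWITCH assertion is shared by both readings.
for h in ("h1", "h2"):
    spaces.tell_rel(h, "Itoh-Ci-8", "HAS-SWITCH", "switch.1")
```

      The same individual Itoh-Ci-8 appears in both spaces, while its concept membership depends on the hypothesis label.</Paragraph>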
    </Section>
    <Section position="2" start_page="477" end_page="479" type="sub_section">
      <SectionTitle>
2.2 Hypothesis Generation Rules
</SectionTitle>
      <Paragraph position="0"> As mentioned above, text parsing and concept acquisition from texts are tightly coupled.</Paragraph>
      <Paragraph position="1"> Whenever, e.g., two nominals or a nominal and a verb are supposed to be syntactically related in the regular parsing mode, the semantic interpreter simultaneously evaluates the conceptual compatibility of the items involved. Since these reasoning processes are fully embedded in a terminological representation system, checks are made as to whether a concept denoted by one of these objects is allowed to fill a role of the other one. If one of the items involved is unknown, i.e., a lexical and conceptual gap is encountered, this interpretation mode generates initial concept hypotheses about the class membership of the unknown object, and, as a consequence of inheritance mechanisms holding for concept taxonomies, provides conceptual role information for the unknown item.</Paragraph>
      <Paragraph position="2"> Given the structural foundations of terminological theories, two dimensions of conceptual learning can be distinguished -- the taxonomic one by which new concepts are located in conceptual hierarchies, and the aggregational one by which concepts are supplied with clusters of conceptual relations (these will be used subsequently by the terminological classifier to determine the current position of the item to be learned in the taxonomy). In the following, let target.con be an unknown concept denoted by the corresponding lexical item target.lex, base.con be a given knowledge base concept denoted by the corresponding lexical item base.lex, and let target.lex and base.lex be related by some dependency relation. Furthermore, in the hypothesis generation rules below variables are indicated by names with leading '?'; the operator TELL is used to initiate the creation of assertional axioms in $\mathcal{CDL}^{hypo}$.</Paragraph>
      <Paragraph position="3"> Typical linguistic indicators that can be exploited for taxonomic integration are appositions ('.. the printer @A@ ..'), exemplification phrases ('.. printers like the @A@ ..') or nominal compounds ('.. the @A@ printer ..'). These constructions almost unequivocally determine '@A@' (target.lex), when considered as a proper name (footnote 1), to denote an instance of a PRINTER (target.con), given its characteristic dependency relation to 'printer' (base.lex), the conceptual correlate of which is the concept class PRINTER (base.con). This conclusion is justified independent of conceptual conditions, simply due to the nature of these linguistic constructions.</Paragraph>
      <Paragraph position="4"> The generation of corresponding concept hypotheses is achieved by the rule sub-hypo (Table 4). Basically, the type of target.con is carried over from base.con (function type-of). In addition, the syntactic label is asserted which characterizes the grammatical construction figuring as the structural source for that particular hypothesis (h denotes the identifier for the selected hypothesis space), e.g., APPOSITION, EXEMPLIFICATION, or NCOMPOUND.</Paragraph>
      <Paragraph position="6"> Footnote 1: Such a part-of-speech hypothesis can be derived from the inventory of valence and word order specifications underlying the dependency grammar model we use (Bröker et al., 1994).</Paragraph>
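      <Paragraph> The effect of sub-hypo described above can be approximated by a short sketch. Since Table 4 is not reproduced here, the fragment is a reading of the prose only: the dictionary layout of the knowledge base, the identity behaviour of type_of, and the space identifier h0 are assumptions.

```python
SUB_HYPO_LABELS = {"APPOSITION", "EXEMPLIFICATION", "NCOMPOUND"}


def type_of(base_con):
    """Assumed stand-in for the function type-of: the concept type is
    simply carried over from base.con."""
    return base_con


def sub_hypo(kb, h, target_con, base_con, construction):
    """Assert (target_con : type-of(base_con)) in hypothesis space h and
    record the syntactic label of the triggering construction."""
    if construction not in SUB_HYPO_LABELS:
        raise ValueError("not a taxonomic construction: " + construction)
    space = kb.setdefault(h, {"assertions": set(), "labels": []})
    space["assertions"].add(("isa", target_con, type_of(base_con)))
    space["labels"].append(construction)
    return space


kb = {}
# '.. the printer @A@ ..' relates the unknown item to PRINTER by apposition.
sub_hypo(kb, "h0", "@A@", "PRINTER", "APPOSITION")
```

      A construction outside the three taxonomic indicators is rejected here; in the system such cases fall to other rules.</Paragraph>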
      <Paragraph position="7"> The aggregational dimension of terminological theories is addressed, e.g., by grammatical constructions causing case frame assignments.</Paragraph>
      <Paragraph position="8"> In the example '.. @B@ is equipped with 32 MB of RAM ..', role filler constraints of the verb form 'equipped' that relate to its PATIENT role carry over to '@B@'. After subsequent semantic interpretation of the entire verbal complex, '@B@' may be anything that can be equipped with memory. Constructions like prepositional phrases ('.. @C@ from IBM ..') or genitives ('..</Paragraph>
      <Paragraph position="9"> IBM's @C@ ..') in which either target.lex or base.lex occur as head or modifier have a similar effect. Attachments of prepositional phrases or relations among nouns in genitives, however, open a wider interpretation space for '@C@' than for '@B@', since verbal case frames provide a higher role selectivity than PP attachments or, even more so, genitive NPs. So, any concept that can reasonably be related to the concept IBM will be considered a potential hypothesis for '@C@', e.g., its departments, products, or Fortune 500 ranking.</Paragraph>
      <Paragraph position="10"> Generalizing from these considerations, we state a second hypothesis generation rule which accounts for aggregational patterns of concept learning. The basic assumption behind this rule, perm-hypo (cf. Table 5), is that target.con fills (exactly) one of the n roles of base.con it is currently permitted to fill (this set is determined by the function perm-filler). Depending on the actual linguistic construction one encounters, it may occur, in particular for PP and NP constructions, that one cannot decide on the correct role yet. Consequently, several alternative hypothesis spaces are opened and target.con is assigned as a potential filler of the i-th role (taken from ?roleSet, the set of admitted roles) in its corresponding hypothesis space. As a result, the classifier is able to derive a suitable concept hypothesis by specializing target.con according to the value restriction of base.con's i-th role. The function member-of</Paragraph>
      <Paragraph position="12"> selects a role from the set ?roleSet; gen-hypo creates a new hypothesis space by asserting the given axioms of h and outputs its identifier. Thereupon, the hypothesis space identified by ?hypo is augmented through a TELL operation by the hypothesized assertion. As for sub-hypo, perm-hypo assigns a syntactic quality label (function add-label) to each i-th hypothesis indicating the type of syntactic construction in which target.lex and base.lex are related in the text, e.g., CASEFRAME, PPATTACH or GENITIVENP.</Paragraph>
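      <Paragraph> The control flow of perm-hypo, as described in the text, can be sketched as follows. Table 5 is not reproduced here, so this is a reading of the prose under assumptions: the set of admitted roles is passed in rather than computed by perm-filler, the naming scheme for new spaces is invented, and the role names HAS-PRODUCT and HAS-DEPARTMENT are hypothetical.

```python
def gen_hypo(kb, h, index):
    """Create a new hypothesis space that starts from the axioms of h
    and return its identifier (the naming scheme is an assumption)."""
    new_id = f"{h}.{index}"
    kb[new_id] = {
        "assertions": set(kb[h]["assertions"]),
        "labels": list(kb[h]["labels"]),
    }
    return new_id


def perm_hypo(kb, h, target_con, base_con, role_set, construction):
    """Open one hypothesis space per admitted role and assert target_con
    as the filler of that role of base_con, plus a syntactic label."""
    new_ids = []
    for index, role in enumerate(sorted(role_set), start=1):
        hypo = gen_hypo(kb, h, index)
        kb[hypo]["assertions"].add(("rel", base_con, role, target_con))
        kb[hypo]["labels"].append(construction)
        new_ids.append(hypo)
    return new_ids


kb = {"h": {"assertions": {("isa", "IBM", "COMPANY")}, "labels": []}}
# "IBM's @C@": the genitive does not determine which role @C@ fills.
ids = perm_hypo(kb, "h", "@C@", "IBM",
                {"HAS-PRODUCT", "HAS-DEPARTMENT"}, "GENITIVENP")
```

      Each admitted role yields its own space, so the original space h stays untouched and the alternatives can be assessed independently.</Paragraph>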
      <Paragraph position="13"> Getting back to our example, let us assume that the target Itoh-Ci-8 is predicted already as a PRODUCT as a result of preceding interpretation processes, i.e., Itoh-Ci-8 : PRODUCT holds. Let PRODUCT be defined as:</Paragraph>
      <Paragraph position="15"> At this level of conceptual restriction, four roles have to be considered for relating the target Itoh-Ci-8 - as a tentative PRODUCT - to the base concept SWITCH when interpreting the phrase 'The switch of the Itoh-Ci-8 ..'. Three of them, HAS-SIZE, HAS-PRICE, and HAS-WEIGHT, are ruled out due to the violation of a simple integrity constraint ('switch' does not denote a measure unit). Therefore, only the role HAS-PART must be considered in terms of the expression Itoh-Ci-8 HAS-PART switch.1 (or, equivalently, switch.1 PART-OF Itoh-Ci-8). Due to the definition of HAS-SWITCH (cf. P3, Subsection 2.1), the instantiation of HAS-PART is specialized to HAS-SWITCH by the classifier, since the range of the HAS-PART relation is already restricted to SWITCH (P1). Since the classifier aggressively pushes hypothesizing to be maximally specific, the disjunctive concept referred to in the domain restriction of the role HAS-SWITCH is split into four distinct hypotheses, two of which are sketched below. Hence, we assume Itoh-Ci-8 to denote either a STORAGEDEV(ice), an OUTPUTDEV(ice), an INPUTDEV(ice) or a COMPUTER (note that we also include parts of the IS-A hierarchy in the example below).</Paragraph>
    </Section>
    <Section position="3" start_page="479" end_page="480" type="sub_section">
      <SectionTitle>
2.3 Hypothesis Annotation Rules
</SectionTitle>
      <Paragraph position="0"> In this section, we will focus on the quality assessment of concept hypotheses which occurs at the knowledge base level only; it is due to the operation of hypothesis annotation rules which continuously evaluate the hypotheses that have been derived from linguistic evidence.</Paragraph>
      <Paragraph position="1"> The M-Deduction rule (see Table 6) is triggered for any repetitive assignment of the same role filler to one specific conceptual relation that occurs in different hypothesis spaces. This rule captures the assumption that a role filler which has been multiply derived on different occasions must be granted more strength than one which has been derived on a single occasion only.</Paragraph>
      <Paragraph position="2">  Considering our example at the end of subsection 2.2, for 'Itoh-Ci-8' the concept hypotheses STORAGEDEV and OUTPUTDEV were derived independently of each other in different hypothesis spaces. Hence, DEVICE as their common superconcept has been multiply derived by the classifier in each of these spaces as a result of transitive closure computations, too. Accordingly, this hypothesis is assigned a high degree of confidence by the classifier which derives the conceptual quality label M-DEDUCTION:</Paragraph>
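      <Paragraph> The core test behind M-Deduction, that the same assertion is derived in several hypothesis spaces, can be sketched in a few lines. Table 6 is not reproduced here; the encoding of assertions as tuples, the function name m_deduction, and the labels h1 and h2 are assumptions.

```python
from collections import defaultdict


def m_deduction(spaces):
    """Return every assertion found in at least two hypothesis spaces,
    mapped to the sorted list of spaces that contain it."""
    seen = defaultdict(set)
    for h, assertions in spaces.items():
        for assertion in assertions:
            seen[assertion].add(h)
    return {a: sorted(hs) for a, hs in seen.items() if len(hs) >= 2}


# DEVICE is assumed to have been added to each space by the classifier
# as the common superconcept of STORAGEDEV and OUTPUTDEV.
spaces = {
    "h1": {("isa", "Itoh-Ci-8", "STORAGEDEV"), ("isa", "Itoh-Ci-8", "DEVICE")},
    "h2": {("isa", "Itoh-Ci-8", "OUTPUTDEV"), ("isa", "Itoh-Ci-8", "DEVICE")},
}
labelled = m_deduction(spaces)
```

      Only the DEVICE membership is shared by both spaces, so it is the single assertion that would receive the M-DEDUCTION label in this toy setting.</Paragraph>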
      <Paragraph position="4"> The C-Support rule (see Table 7) is triggered whenever, within the same hypothesis space, a hypothetical relation, R1, between two instances can be justified by another relation, R2, involving the same two instances, but where the role fillers occur in 'inverted' order (R1 and R2 need not necessarily be semantically inverse relations, as with 'buy' and 'sell'). This causes the generation of the quality label C-SUPPORT which captures the inherent symmetry between concepts related via quasi-inverse relations.</Paragraph>
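      <Paragraph> The triggering condition of C-Support can be illustrated by searching one hypothesis space for two role assertions over the same pair of instances in inverted order. Table 7 is not reproduced here; the tuple encoding, the requirement that the two role names differ, and the role names SELLS and SOLD-BY are assumptions for this sketch.

```python
def c_support(assertions):
    """Return the pairs of role assertions (a R1 b) and (b R2 a) that
    occur together in a single hypothesis space."""
    rels = [x for x in assertions if x[0] == "rel"]
    supported = set()
    for (_, a, r1, b) in rels:
        for (_, c, r2, d) in rels:
            # Same two instances, role fillers in inverted order.
            if (c, d) == (b, a) and r1 != r2:
                supported.add(frozenset({(a, r1, b), (c, r2, d)}))
    return supported


space = {
    ("isa", "IBM", "COMPANY"),
    ("rel", "IBM", "SELLS", "printer.1"),
    ("rel", "printer.1", "SOLD-BY", "IBM"),
}
pairs = c_support(space)
```

      Each supporting pair is stored once, regardless of which of its two assertions is encountered first.</Paragraph>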
      <Paragraph position="5"> Whenever an already filled conceptual relation receives an additional, yet different role filler in the same hypothesis space, the Add-Filler rule is triggered (see Table 8). This application-specific rule is particularly suited to our natural language understanding task and has its roots in the distinction between mandatory and optional case roles for (ACTION) verbs. Roughly, it yields a negative assessment in terms of the quality label ADDFILLER for any attempt to fill the same mandatory case role more than once (unless coordinations are involved). In contradistinction, when the same role of a non-ACTION concept (typically denoted by nouns) is multiply filled we assign the positive quality label SUPPORT, since it reflects the conceptual proximity a relation induces on its component fillers, provided that they share a common, non-ACTION concept class.</Paragraph>
      <Paragraph position="6"> We give examples both for the assignment of an ADDFILLER as well as for a SUPPORT label:</Paragraph>
    </Section>
    <Section position="4" start_page="480" end_page="480" type="sub_section">
      <SectionTitle>
2.4 Quality Dimensions
</SectionTitle>
      <Paragraph position="0"> The criteria from which concept hypotheses are derived differ in the dimension from which they are drawn (grammatical vs. conceptual evidence), as well as the strength by which they lend support to the corresponding hypotheses (e.g., apposition vs. genitive, multiple deduction vs. additional role filling, etc.). In order to make these distinctions explicit we have developed a &quot;quality calculus&quot; at the core of which lie the definition of and inference rules for quality labels (cf. Schnattinger and Hahn (1998) for more details). A design methodology for specific quality calculi may proceed along the following lines: (1) Define the dimensions from which quality labels can be drawn. In our application, we chose the set LQ := $\{l_1, \ldots, l_m\}$ of linguistic quality labels and the set CQ := $\{c_1, \ldots, c_n\}$ of conceptual quality labels. (2) Determine a partial ordering p among the quality labels from one dimension reflecting different degrees of strength among the quality labels. (3) Determine a total ordering among the dimensions.</Paragraph>
      <Paragraph position="1"> In our application, we have empirical evidence to grant linguistic criteria priority over conceptual ones. Hence, we state the following constraint: $\forall l \in LQ, \forall c \in CQ : l &gt;_p c$. The dimension LQ. Linguistic quality labels reflect structural properties of phrasal patterns or discourse contexts in which unknown lexical items occur (footnote 2) -- we here assume that the type of grammatical construction exercises a particular interpretative force on the unknown item and, at the same time, yields a particular level of credibility for the hypotheses being derived. Taking the considerations from Subsection 2.2 into account, concrete examples of high-quality labels are given by APPOSITION or NCOMPOUND labels. Still of good quality but already less constraining are occurrences of the unknown item in a CASEFRAME construction.</Paragraph>
      <Paragraph position="2"> Finally, in a PPATTACH or GENITIVENP construction the unknown lexical item is still less constrained. Hence, at the quality level, these latter two labels (just as the first two labels we considered) form an equivalence class whose elements cannot be further discriminated. So we end up with the following quality orderings: APPOSITION $=_p$ NCOMPOUND $&gt;_p$ CASEFRAME $&gt;_p$ PPATTACH $=_p$ GENITIVENP. Footnote 2: In the future, we intend to integrate additional types of constraints, e.g., quality criteria reflecting the degree of completeness vs. partiality of the parse.</Paragraph>
      <Paragraph position="3"> The dimension CQ. Conceptual quality labels result from comparing the conceptual representation structures of a concept hypothesis with already existing representation structures in the underlying domain knowledge base or other concept hypotheses from the viewpoint of structural similarity, compatibility, etc. The closer the match, the more credit is lent to a hypothesis. A very positive conceptual quality label, e.g., is M-DEDUCTION, whereas ADDFILLER is a negative one. Still positive strength is expressed by SUPPORT or C-SUPPORT, both being indistinguishable, however, from a quality point of view. Accordingly, we may state:</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="480" end_page="481" type="metho">
    <SectionTitle>
M-DEDUCTION &gt;p SUPPORT
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="480" end_page="481" type="sub_section">
      <SectionTitle>
2.5 Hypothesis Ranking
</SectionTitle>
      <Paragraph position="0"> Each new clue available for a target concept to be learned results in the generation of additional linguistic or conceptual quality labels. So hypothesis spaces get incrementally augmented by quality statements. In order to select the most credible one(s) among them we apply a two-step procedure (the details of which are explained in Schnattinger and Hahn (1998)). First, those concept hypotheses are chosen which have accumulated the greatest amount of high-quality labels according to the linguistic dimension LQ.</Paragraph>
      <Paragraph position="1"> Second, further hypotheses are selected from this linguistically plausible candidate set based on the quality ordering underlying CQ.</Paragraph>
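      <Paragraph> The two-step selection can be approximated by the sketch below. The actual procedure is specified in Schnattinger and Hahn (1998) and is not reproduced here, so the lexicographic comparison of label counts, the treatment of ADDFILLER as a penalty, the omission of EXEMPLIFICATION (whose rank is not stated in this section), and all function names are assumptions made for illustration.

```python
TOP_LINGUISTIC = {"APPOSITION", "NCOMPOUND"}
LOW_LINGUISTIC = {"PPATTACH", "GENITIVENP"}
MID_CONCEPTUAL = {"SUPPORT", "C-SUPPORT"}


def linguistic_profile(labels):
    """Label counts per equivalence class, strongest class first."""
    return (
        sum(lab in TOP_LINGUISTIC for lab in labels),
        sum(lab == "CASEFRAME" for lab in labels),
        sum(lab in LOW_LINGUISTIC for lab in labels),
    )


def conceptual_profile(labels):
    """Positive label counts, strongest first; ADDFILLER counts against."""
    return (
        sum(lab == "M-DEDUCTION" for lab in labels),
        sum(lab in MID_CONCEPTUAL for lab in labels),
        -sum(lab == "ADDFILLER" for lab in labels),
    )


def best(candidates, labels_of, profile):
    """Keep the candidates whose profile is maximal (tuple comparison)."""
    scores = {h: profile(labels_of[h]) for h in candidates}
    top = max(scores.values())
    return [h for h in candidates if scores[h] == top]


def rank_hypotheses(labels_of):
    """Step 1 filters by linguistic labels, step 2 by conceptual labels."""
    step1 = best(sorted(labels_of), labels_of, linguistic_profile)
    return best(step1, labels_of, conceptual_profile)


labels_of = {
    "h1": ["CASEFRAME", "M-DEDUCTION"],
    "h2": ["CASEFRAME", "SUPPORT"],
    "h3": ["PPATTACH", "M-DEDUCTION", "M-DEDUCTION"],
}
selected = rank_hypotheses(labels_of)
```

      In this toy run, h3 is dropped in the first step despite its two M-DEDUCTION labels, which reflects the priority of linguistic over conceptual criteria; h1 then wins over h2 on conceptual grounds.</Paragraph>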
      <Paragraph position="2"> We have also made considerable efforts to evaluate the performance of the text learner based on the quality calculus. In order to account for the incrementality of the learning process, a new evaluation measure capturing the system's on-line learning accuracy was defined, which is sensitive to taxonomic hierarchies. The results we got were consistently favorable, as our system outperformed those closest in spirit, CAMILLE (Hastings, 1996) and SCISOR (Rau et al., 1989), by a gain in accuracy on the order of 8%. Also, the system requires relatively few hypothesis spaces (2 to 6 on average) and prunes the concept search space radically, requiring only a few examples (for evaluation details, cf. Hahn and Schnattinger (1998)).</Paragraph>
    </Section>
  </Section>
</Paper>