<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-1001">
  <Title>A Trainable Message Understanding System*</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
3 Generalization
</SectionTitle>
    <Paragraph position="0"> Rules created by the Training Process are very specific and can be applied only to exactly the patterns seen during training. To make these specific rules applicable to a large number of unseen articles in the domain, a comprehensive generalization mechanism is necessary. We are interested not only in the generalization itself, but also in a strategy for controlling the degree of generalization for various applications in different domains.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Degree of Generalization
</SectionTitle>
      <Paragraph position="0"> The hierarchical organization of WordNet (Miller, 1990) makes automatic generalization of the rules possible. Philip Resnik has previously made use of the hierarchical structure of WordNet (Resnik, 1995a) (Resnik, 1995b).</Paragraph>
      <Paragraph position="1"> With the large amount of information on semantic classification and taxonomy provided in WordNet, many ways of incorporating WordNet's semantic features into generalization are foreseeable. At this stage, however, we concentrate only on the hypernym/hyponym feature.</Paragraph>
      <Paragraph position="2"> From the training process, the specific rules contain three entities on the LHS, as shown in Figure 3. Each entity is a quadruple of the form sp = (w, c, s, t), where w is the headword of the trained phrase, c is the part of speech of the word, s is the sense number representing the meaning of w, and t is the semantic type identified by the pre-processor for w. An abstract specific rule is shown in Figure 4.</Paragraph>
      <Paragraph position="3"> For each sp = (w, c, s, t), if w exists in WordNet, then there is a corresponding synset in WordNet.</Paragraph>
      <Paragraph position="4"> The hyponym/hypernym hierarchical structure provides a way of locating the superordinate concepts of sp. By following successive hypernyms, we obtain increasingly general concepts and eventually reach the most general concept, such as {person, human being,...}. Based on this scenario, different degrees of generalization can be achieved for each concept by adjusting the distance between this concept and the most general concept in the WordNet hierarchy. The function that accomplishes this task is Generalize(sp, h), which returns a synset list h levels above the specific concept represented by sp in the hierarchy. An example is shown in Figure 5.</Paragraph>
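      <Paragraph> The behaviour of Generalize(sp, h) described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the system's implementation: the hypernym table TOY_HYPERNYMS, its entries, the synset key format, and the semantic type label in the usage lines are all hypothetical stand-ins for WordNet, and the choice to stop at the most general concept when h exceeds the depth of the hierarchy is likewise an assumption. <![CDATA[
```python
from typing import Dict, List, NamedTuple


class SP(NamedTuple):
    """The quadruple sp = (w, c, s, t) from the text."""
    w: str  # headword of the trained phrase
    c: str  # part of speech
    s: int  # sense number
    t: str  # semantic type assigned by the pre-processor


# Hypothetical toy hierarchy (NOT real WordNet data): synset key -> hypernym keys.
TOY_HYPERNYMS: Dict[str, List[str]] = {
    "engineer.n.1": ["professional.n.1"],
    "professional.n.1": ["adult.n.1"],
    "adult.n.1": ["person.n.1"],
    "person.n.1": [],  # most general concept in this toy table
}


def synset_key(sp: SP) -> str:
    """Build the (assumed) lookup key for the synset of sp."""
    return f"{sp.w}.{sp.c}.{sp.s}"


def generalize(sp: SP, h: int) -> List[str]:
    """Return the synsets h levels above sp in the toy hierarchy.

    Assumptions: an unknown word yields an empty list, and climbing stops
    at the most general concept if h exceeds the remaining depth.
    """
    if h < 0:
        raise ValueError("h must be non-negative")
    start = synset_key(sp)
    if start not in TOY_HYPERNYMS:
        return []
    level = [start]
    for _ in range(h):
        parents: List[str] = []
        for key in level:
            for parent in TOY_HYPERNYMS.get(key, []):
                if parent not in parents:
                    parents.append(parent)
        if not parents:
            break  # already at the most general concept
        level = parents
    return level


sp = SP("engineer", "n", 1, "PERSON")
print(generalize(sp, 0))  # the specific concept itself
print(generalize(sp, 2))  # two levels up
```
]]> With h = 0 the specific concept is returned unchanged, and larger values of h move toward the most general concept, which is how the degree of generalization is adjusted.</Paragraph>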
      <Paragraph position="6"/>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Generalized Rules
</SectionTitle>
      <Paragraph position="0"> The process of generalizing rules consists of replacing each sp = (w, c, s, t) in the specific rules with a more general superordinate synset from its hypernym tree in WordNet, obtained by applying the Generalize(sp, h) function. The degree of generalization of the rules varies with the value of h in Generalize(sp, h).</Paragraph>
      <Paragraph position="1"> For example, Figure 6 shows the rule in Figure 3 generalized to two different degrees.</Paragraph>
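      <Paragraph> To make the effect of h concrete, the following sketch generalizes the three LHS entities of a rule to two different degrees. It rests on several assumptions: the table TOY_PARENT and the three entities are invented for illustration and are not the rule of Figure 3, each concept is given a single hypernym so that one synset is returned rather than a list, and a word missing from the table is left unchanged. <![CDATA[
```python
from typing import Dict, List, Optional, Tuple

# sp = (w, c, s, t): headword, part of speech, sense number, semantic type
SP = Tuple[str, str, int, str]

# Hypothetical single-parent hypernym table (NOT real WordNet data).
TOY_PARENT: Dict[str, Optional[str]] = {
    "company.n.1": "institution.n.1",
    "institution.n.1": "organization.n.1",
    "organization.n.1": None,
    "engineer.n.1": "professional.n.1",
    "professional.n.1": "person.n.1",
    "person.n.1": None,
    "city.n.1": "municipality.n.1",
    "municipality.n.1": "region.n.1",
    "region.n.1": None,
}


def generalize(sp: SP, h: int) -> str:
    """Return the concept h levels above sp, stopping at the top of the table."""
    w, c, s, _t = sp
    key = f"{w}.{c}.{s}"
    if key not in TOY_PARENT:
        return key  # assumption: unknown words are not generalized
    for _ in range(h):
        parent = TOY_PARENT[key]
        if parent is None:
            break  # most general concept reached
        key = parent
    return key


def generalize_rule(lhs: List[SP], h: int) -> List[str]:
    """Replace every entity on the LHS by its generalization of degree h."""
    return [generalize(sp, h) for sp in lhs]


# Invented LHS with three entities; the semantic type labels are placeholders.
lhs: List[SP] = [
    ("company", "n", 1, "ORG"),
    ("engineer", "n", 1, "PERSON"),
    ("city", "n", 1, "LOC"),
]

for h in (1, 2):
    print(h, generalize_rule(lhs, h))
```
]]> A small h keeps the rule close to the training example, while a larger h lets it cover more unseen phrases at the cost of precision, which is the trade-off the degree of generalization is meant to control.</Paragraph>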
      <Paragraph position="2"> The ⊂ symbol signifies the subsumption relationship. Therefore, a ⊂ b signifies that a is subsumed by b, or, in WordNet terms, concept b is a superordinate concept of concept a. The generalized rule states that the RHS of the rule gets executed if all of the following conditions hold:</Paragraph>
    </Section>
  </Section>
</Paper>