<?xml version="1.0" standalone="yes"?>
<Paper uid="P91-1018">
  <Title>Figure 1: Learning to Associate Scenes with Spatial Terms</Title>
  <Section position="5" start_page="139" end_page="141" type="metho">
    <SectionTitle>
3 Learning Without Explicit Negative Evidence
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="139" end_page="139" type="sub_section">
      <SectionTitle>
3.1 The Problem
</SectionTitle>
      <Paragraph position="0"> Researchers in child language acquisition have often observed that the child learns language apparently without the benefit of negative evidence [Braine, 1971; Bowerman, 1983; Pinker, 1989]. While these researchers have focused on the "no negative evidence" problem as it relates to the acquisition of grammar, the problem is a general one, and appears in several different aspects of language acquisition. In particular, it surfaces in the context of the learning of the semantics of lexemes for spatial relations. The methods used to solve the problem here are of general applicability, however, and are not restricted to this particular domain.</Paragraph>
      <Paragraph position="1"> The problem is best illustrated by example. Consider Figure 3. Given the landmark (labeled "LM"), the task is to learn the concept "above". We have been given four positive instances, marked as small dotted circles in the figure, and no negative instances. The problem is that we want to generalize so that we can recognize new instances of "above" when they are presented, but since there are no negative instances, it is not clear where the boundaries of the region "above" the LM should be. One possible generalization is the white region containing the four instances. Another possibility is the union of that white region with the dark region surrounding the LM.</Paragraph>
      <Paragraph position="2"> Yet another is the union of the light and dark regions with the interior of the LM. And yet another is the correct one, which is not closed at the top. In the absence of negative examples, we have no obvious reason to prefer one of these generalizations over the others.</Paragraph>
      <Paragraph position="3"> One possible approach would be to take the smallest region that encompasses all the positive instances. It should be clear, however, that this will always lead to closed regions, which are incorrect characterizations of such spatial concepts as "above" and "outside". Thus, this cannot be the answer.</Paragraph>
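The failure of the smallest-enclosing-region approach can be made concrete with a minimal sketch (not from the paper; the point coordinates and helper names are hypothetical): the axis-aligned bounding box of any finite set of positive instances is necessarily closed, so a point far above the landmark, which is clearly "above", falls outside the learned region.

```python
def bounding_box(points):
    """Smallest axis-aligned rectangle containing all points."""
    xs = [x for x, y in points]
    ys = [y for x, y in points]
    return (min(xs), min(ys), max(xs), max(ys))

def inside(box, point):
    """Membership test for the closed rectangular region."""
    x0, y0, x1, y1 = box
    x, y = point
    return x0 <= x <= x1 and y0 <= y <= y1

# Four hypothetical positive instances of "above", landmark at the origin.
positives = [(-1.0, 2.0), (0.5, 3.0), (1.0, 2.5), (0.0, 4.0)]
box = bounding_box(positives)

# A point among the positives is accepted, but a point far above the
# landmark -- plainly "above" -- is rejected, because the region is closed.
near = inside(box, (0.0, 3.0))
far = inside(box, (0.0, 100.0))
```

Any enclosing-region scheme (convex hull, minimal circle, etc.) has the same defect, since it generalizes only by interpolation among the positives, never upward without bound.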
      <Paragraph position="4"> And yet, humans do learn these concepts, apparently in the absence of negative instances. The following sections indicate how that learning might take place.</Paragraph>
    </Section>
    <Section position="2" start_page="139" end_page="139" type="sub_section">
      <SectionTitle>
3.2 A Possible Solution and its Drawbacks
</SectionTitle>
      <Paragraph position="0"> One solution to the "no negative evidence" problem which suggests itself is to take every positive instance for one concept to be an implicit negative instance for all other spatial concepts being learned. There are problems with this approach, as we shall see, but they are surmountable.</Paragraph>
      <Paragraph position="1"> There are related ideas present in the child language literature, which support the work presented here. [Markman, 1987] posits a "principle of mutual exclusivity" for object naming, whereby a child assumes that each object may have only one name. This is to be viewed more as a learning strategy than as a hard-and-fast rule: clearly, a given object may have many names (an office chair, a chair, a piece of furniture, etc.). The method being suggested really amounts to a principle of mutual exclusivity for spatial relation terms: since each spatial relation can have only one name, we take a positive instance of one to be an implicit negative instance for all others.</Paragraph>
      <Paragraph position="2"> In a related vein, [Johnston and Slobin, 1979] note that in a study of children learning locative terms in English, Italian, Serbo-Croatian, and Turkish, terms were learned more quickly when there was little or no synonymy among terms. They point out that children seem to prefer a one-to-one meaning-to-morpheme mapping; this is similar to, although not quite the same as, the mutual exclusivity notion put forth here.1 In linguistics, the notion that the meaning of a given word is partly defined by the meanings of other words in the language is a central idea of structuralism. This has recently been reiterated by [MacWhinney, 1989]: "the semantic range of words is determined by the particular contrasts in which they are involved". This is consonant with the view taken here, in that contrasting words will serve as implicit negative instances to help define the boundaries of applicability of a given spatial term.</Paragraph>
      <Paragraph position="3"> There is a problem with mutual exclusivity, however.</Paragraph>
      <Paragraph position="4"> Using it as a method for generating implicit negative instances can yield many false negatives in the training set, i.e. implicit negatives which really should be positives.</Paragraph>
      <Paragraph position="5"> Consider the following set of terms, which are the ones learned by the system described here:
* above
* below
* on
* off
* inside
* outside
* to the left of
* to the right of
If we apply mutual exclusivity here, the problem of false negatives arises. For example, not all positive instances of "outside" are accurate negative instances for "above"; indeed, all positive instances of "above" should in fact be positive instances of "outside", but under mutual exclusivity they are instead taken as negatives.
1 They are not quite the same, since a difference in meaning need not correspond to a difference in actual reference. When we call a given object both a "chair" and a "throne", these are different meanings, and this would thus be consistent with a one-to-one meaning-to-morpheme mapping. It would not be consistent with the principle of mutual exclusivity, however.</Paragraph>
      <Paragraph position="6"> "Outside" is a term that is particularly badly affected by this problem of false implicit negatives: all of the spatial terms listed above except for "inside" (and "outside" itself, of course) will supply false negatives to the training set for "outside".</Paragraph>
      <Paragraph position="7"> The severity of this problem is illustrated in Figure 4. In these figures, which represent training data for the spatial concept "outside", we have tall, rectangular landmarks, and training points2 relative to the landmarks. Positive training points (instances) are marked with circles, while negative instances are marked with X's. In (a), the negative instances were placed there by the teacher, showing exactly where the region not outside the landmark is. This gives us a "clean" training set, but the use of teacher-supplied explicit negative instances is precisely what we are trying to get away from. In (b), the negative instances shown were derived from positive instances for the other spatial terms listed above, through the principle of mutual exclusivity. Thus, this is the sort of training data we are going to have to use. Note that in (b) there are many false negative instances among the positives, to say nothing of the positions which have been marked as both positive and negative.</Paragraph>
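The construction of such a training set under mutual exclusivity can be sketched as follows. This is an illustrative reconstruction, not the paper's code; the function name and the point coordinates are hypothetical, with trajectors as single points relative to some landmark.

```python
def training_set(target, positives_by_term):
    """Build (point, label) pairs for `target` under mutual exclusivity:
    its own positives are labeled 1; every positive instance of any
    other term is recycled as an implicit negative, labeled 0."""
    examples = []
    for term, points in positives_by_term.items():
        label = 1 if term == target else 0
        examples.extend((p, label) for p in points)
    return examples

# Hypothetical positive instances for three of the terms.
positives_by_term = {
    "outside": [(5.0, 0.0), (0.0, 6.0)],
    "above":   [(0.0, 4.0)],   # in fact also outside the landmark
    "inside":  [(0.2, 0.1)],   # genuinely not outside
}

data = training_set("outside", positives_by_term)
# The "above" point (0.0, 4.0) receives label 0 for "outside" even though
# it really is outside the landmark: a false implicit negative. The
# "inside" point, by contrast, is a true negative.
```

This makes the asymmetry plain: some borrowed negatives (from "inside") are correct, while others (from "above", "below", etc.) systematically mislabel points that belong in the positive region.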
      <Paragraph position="8"> This issue of false implicit negatives is the central problem with mutual exclusivity.</Paragraph>
    </Section>
    <Section position="3" start_page="139" end_page="141" type="sub_section">
      <SectionTitle>
3.3 Salvaging Mutual Exclusivity
</SectionTitle>
      <Paragraph position="0"> The basic idea used here to salvage mutual exclusivity is to treat positive instances and implicit negative instances differently during training: implicit negatives are viewed as supplying only weak negative evidence.</Paragraph>
      <Paragraph position="1"> The intuition behind this is as follows: since the implicit negatives are arrived at through the application of a fallible heuristic rule (mutual exclusivity), they should count for less than the positive instances, which are all assumed to be correct. Clearly, the implicit negatives should not be seen as supplying excessively weak negative evidence, or we revert to the original problem of learning in the (virtual) absence of negative instances.</Paragraph>
      <Paragraph position="2"> But equally clearly, the training set noise supplied by false negatives is quite severe, as seen in the figure above. So this approach is to be seen as a compromise, so that we can use implicit negative evidence without being overwhelmed by the noise it introduces in the training sets for the various spatial concepts.</Paragraph>
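One simple way to realize this compromise, sketched here under assumptions of my own (the weight value 0.1 and the function name are illustrative, not the paper's; the actual implementation under back-propagation is covered in Section 5), is to down-weight the error contribution of implicit negatives in a squared-error loss:

```python
def weighted_error(output, label, is_implicit_negative, neg_weight=0.1):
    """Per-example squared error; implicit negatives count for less.
    Positives (and any explicit negatives) carry full weight 1.0."""
    w = neg_weight if is_implicit_negative else 1.0
    return w * (output - label) ** 2

# A false implicit negative: the network rightly outputs 0.9 for a point
# that is genuinely "outside", but mutual exclusivity has labeled it 0.
# Down-weighting shrinks its error from 0.81 to about 0.081, so a handful
# of such false negatives can no longer swamp the correct positives.
err_weak = weighted_error(0.9, 0, is_implicit_negative=True)
err_full = weighted_error(0.9, 0, is_implicit_negative=False)
```

Setting `neg_weight` too close to 0 reverts to learning with no negative evidence at all, while setting it to 1 restores the noise problem of Figure 4(b); the weight thus encodes exactly the compromise described above.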
      <Paragraph position="3"> The details of this method, and its implementation under back-propagation, are covered in Section 5. However,
2 I.e., trajectors consisting of a single point each.</Paragraph>
      <Paragraph position="5"> this is a very general solution to the "no negative evidence" problem, and can be understood independently of the actual implementation details. Any learning method which allows for weakening of evidence should be able to make use of it. In addition, it could serve as a means for addressing the "no negative evidence" problem in other domains. For example, a method analogous to the one suggested here could be used for object naming, the domain for which Markman suggested mutual exclusivity.</Paragraph>
      <Paragraph position="6"> This would be necessary if the problem of false implicit negatives is as serious in that domain as it is in this one.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="141" end_page="141" type="metho">
    <SectionTitle>
4 Results
</SectionTitle>
    <Paragraph position="0"> This section presents the results of training.</Paragraph>
      <Paragraph position="1"> Figure 5 shows the results of learning the spatial term "outside", first without negative instances, then using implicit negatives obtained through mutual exclusivity, but without weakening the evidence given by these, and finally with the negative evidence weakened.</Paragraph>
    <Paragraph position="2"> The landmark in each of these figures is a triangle.</Paragraph>
    <Paragraph position="3"> The system was trained using only rectangular landmarks. null The size of the black circles indicates the appropriateness, as judged by the trained system, of using the term &amp;quot;outside&amp;quot; to refer to a particular position, relative to the LM shown. Clearly, the concept is learned best when implicit negative evidence is weakened, as in (c).</Paragraph>
    <Paragraph position="4"> When no negatives at all are used, the system overgeneralizes, and considers even the interior of the LM to be "outside" (as in (a)). When mutual exclusivity is used, but the evidence from implicit negatives is not weakened, the concept is learned very poorly, as the noise from the false implicit negatives hinders the learning of the concept (as in (b)). Having all implicit negatives supply only weak negative evidence greatly alleviates the problem of false implicit negatives in the training set, while still enabling us to learn without using explicit, teacher-supplied negative instances.</Paragraph>
    <Paragraph position="5"> It should be noted that in general, when using mutual exclusivity without weakening the evidence given by implicit negatives, the results are not always identical with those shown in Figure 5(b), but are always of approximately the same quality.</Paragraph>
    <Paragraph position="6"> Regarding the issue of generalizability across LMs, two points of interest are that: * The system had not been trained on an LM in exactly this position.</Paragraph>
    <Paragraph position="7"> * The system had never been trained on a triangle of any sort.</Paragraph>
    <Paragraph position="8"> Thus, the system generalizes well to new LMs, and learns in the absence of explicit negative instances, as desired. All eight concepts were learned successfully, and exhibited similar generalization to new LMs.</Paragraph>
  </Section>
</Paper>