<?xml version="1.0" standalone="yes"?>
<Paper uid="T78-1030">
  <Title>ON REASONING BY DEFAULT</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
ON REASONING BY DEFAULT
</SectionTitle>
    <Paragraph position="0"/>
  </Section>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
ABSTRACT
</SectionTitle>
    <Paragraph position="0"> This paper surveys a number of kinds of default reasoning in Artificial Intelligence, specifically, default assignments to variables, the closed world assumption, the frame default for causal worlds, exceptions as defaults, and negation in Artificial Intelligence programming languages.</Paragraph>
    <Paragraph position="1"> Some of these defaults provide clear representational and computational advantages over their corresponding first order theories. Finally, the paper discusses various difficulties associated with default theories.</Paragraph>
    <Paragraph position="2"> If I don't know I don't know I think I know If I don't know I know</Paragraph>
    <Paragraph position="4"/>
  </Section>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
I. INTRODUCTION
</SectionTitle>
    <Paragraph position="0"> Default reasoning is commonly used in natural language understanding systems and in Artificial Intelligence in general. We use the term &quot;default reasoning&quot; to denote the process of arriving at conclusions based upon patterns of inference of the form &quot;In the absence of any information to the contrary, assume...&quot; In this paper, we take this pattern to have the more formal meaning &quot;If certain information cannot be deduced from the given knowledge base, then conclude...&quot; Such reasoning represents a form of plausible inference and is typically required whenever conclusions must be drawn despite the absence of total knowledge about a world.</Paragraph>
    <Paragraph position="1"> In order to fix some of these ideas, we begin by surveying a number of instances of default reasoning as they are commonly invoked in A.I.</Paragraph>
    <Paragraph position="2"> Specifically, we discuss default assignments to variables, the closed world assumption, the frame default for causal worlds, exceptions as defaults, and negation in A.I. programming languages. We shall see that these may all be formalized by introducing a single default operator ⊬, where ⊬W is taken to mean &quot;W is not deducible from the given knowledge base&quot;.</Paragraph>
    <Paragraph position="3"> In addition, we shall discover that the closed world and frame defaults provide clear representational and computational advantages over their corresponding first order theories. The former eliminates the need for an explicit representation of negative knowledge about a world, while the latter eliminates the so-called frame axioms for dynamic worlds.</Paragraph>
    <Paragraph position="4"> Finally, we discuss various problems which arise as a result of augmenting first order logic with a default operator.</Paragraph>
  </Section>
  <Section position="4" start_page="0" end_page="210" type="metho">
    <SectionTitle>
2. SOME INSTANCES OF DEFAULT REASONING IN A.I.
</SectionTitle>
    <Paragraph position="0"> The use of default reasoning in A.I. is far more widespread than is commonly realized. The purpose of this section is to point out a variety of seemingly different situations in which such reasoning arises, to accent common patterns which emerge when defaults are formalized, and to indicate certain representational and computational advantages of default reasoning.</Paragraph>
    <Section position="1" start_page="0" end_page="210" type="sub_section">
      <SectionTitle>
2.1 Default Assignments to Variables
</SectionTitle>
      <Paragraph position="0"> A number of knowledge representation schemes, e.g. FRL [Roberts and Goldstein 1977], KRL [Bobrow and Winograd 1977], explicitly provide for the assignment of default values to variables (slots, terminals). For example, in KRL the unit for a person in an airline travel system has the form: We can view this declaration as an instruction to the KRL interpreter to carry out the following: If x is a person, then in the absence of any information to the contrary, assume hometown(x)=PaloAlto, or phrased in a way which makes explicit the fact that a default assignment is being made to a variable: If x is a person and no value can be determined for the variable y such that hometown(x)=y, then assume y=PaloAlto.</Paragraph>
      <Paragraph position="1"> Notice that in assigning a default value to a variable, it is not sufficient to fail to find an explicit match for the variable in the data base. For example, the non existence in the data base of a fact of the form hometown(JohnDoe)=y for some city y does not necessarily permit the default assignment y=PaloAlto. It might be the case that the following information is available:</Paragraph>
      <Paragraph position="3"> i.e. a person's hometown is the same as the location of his or her employer. In this case the default assignment y=PaloAlto can be made only if we fail to deduce the existence of an employer x and city z such that EMPLOYS(x,JohnDoe) ∧ location(x)=z. In general then, default assignments to variables are permitted only as a result of failure of some attempted deduction. We can formulate a general inference pattern for the default assignment of values to variables: For all x_1, ..., x_n in classes τ_1, ..., τ_n respectively, if we fail to deduce (Ey/θ)P(x_1, ..., x_n, y), then infer the default statement. (Footnote 1: Throughout this paper we shall use a typed logical representation language. Types, e.g. EMPLOYER, PERSON, CITY, correspond to the usual categories of IS-A hierarchies. A typed universal quantifier like (x/EMPLOYER) is read &quot;for all x which belong to the class EMPLOYER&quot; or simply &quot;for all employers x&quot;. A typed existential quantifier like (Ex/CITY) is read &quot;there is a city x&quot;. The notation derives from that used by Woods in his &quot;FOR function&quot; [Woods 1968].)</Paragraph>
      <Paragraph position="5"> Here ⊬ is to be read &quot;fail to deduce&quot;, θ and the τ's are types, and P(x_1, ..., x_n, y) is any statement about the variables x_1, ..., x_n, y. There are some serious difficulties associated with just what exactly is meant by &quot;⊬&quot;, but we shall defer these issues for the moment and rely instead on the reader's intuition. The default rule for home towns can now be seen as an instance of the above pattern:</Paragraph>
      <Paragraph position="7"/>
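      <Paragraph> The hometown default can be sketched in executable form. The following is a minimal illustration, not the KRL mechanism itself: the toy facts, the employer rule, and the names deduce_hometown and hometown are all assumptions introduced here. The point it shows is that the default value is assigned only after every attempted deduction has failed, not merely after a failed lookup of an explicit fact.

```python
from typing import Optional

# Hypothetical toy knowledge base (invented for illustration).
HOMETOWN = {"MaryRoe": "Boston"}      # explicit hometown facts
EMPLOYS = {"Acme": ["JohnDoe"]}       # employer -> list of employees
LOCATION = {"Acme": "Toronto"}        # employer -> city


def deduce_hometown(person: str) -> Optional[str]:
    """Try to deduce hometown(person); return None if the deduction fails."""
    if person in HOMETOWN:
        return HOMETOWN[person]
    # Rule: a person's hometown is the location of his or her employer.
    for employer, staff in EMPLOYS.items():
        if person in staff and employer in LOCATION:
            return LOCATION[employer]
    return None


def hometown(person: str, default: str = "PaloAlto") -> str:
    """Assign the default only when the attempted deduction has failed."""
    deduced = deduce_hometown(person)
    return deduced if deduced is not None else default
```

Here hometown("JohnDoe") is settled by the employer rule even though no explicit hometown fact is stored for him, so the default is not used; a person about whom nothing can be deduced receives PaloAlto.</Paragraph>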
    </Section>
    <Section position="2" start_page="210" end_page="210" type="sub_section">
      <SectionTitle>
2.2 THE CLOSED WORLD ASSUMPTION
</SectionTitle>
      <Paragraph position="0"> It seems not to be generally recognized that the reasoning components of many natural language understanding systems have default assumptions built into them. The representation of knowledge upon which the reasoner computes does not explicitly indicate certain default assumptions. Rather, these defaults are realized as part of the code of the reasoner, or, as we shall say, following [Hayes 1977], as part of the reasoner's process structure. The most common such default corresponds to what has elsewhere been referred to as the closed world assumption [Reiter 1978]. In this section we describe two commonly used closed world defaults.</Paragraph>
      <Paragraph position="1">  As an illustration of the class of closed world defaults, consider standard taxonomies (IS-A hierarchies) as they are usually represented in the A.I. literature, for example the following:</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="210" end_page="212" type="metho">
    <SectionTitle>
THING
ANIMATE INANIMATE
MAMMAL REPTILE
DOG CAT
</SectionTitle>
    <Paragraph position="0"> This has, as its first order logical representation, the following:</Paragraph>
    <Paragraph position="2"> that Fido is animate in either of two essentially isomorphic ways: 1. If the hierarchy is implemented as some sort of network, then we infer ANIMATE(fido) if the class ANIMATE lies &quot;above&quot; DOG, i.e. there is some pointer chain leading from node DOG to node ANIMATE in the network.</Paragraph>
    <Paragraph position="3"> 2. If the hierarchy is implemented as a set of first order formulae, then we conclude ANIMATE(fido) if we can forward chain (modus ponens) with DOG(fido) to derive ANIMATE(fido). This forward chaining from DOG(fido) to ANIMATE(fido) corresponds exactly to following pointers from node DOG to node ANIMATE in the network.</Paragraph>
    <Paragraph position="4"> Thus far, there is no essential difference between a network representation of a hierarchy with its pointer-chasing interpreter and a first order representation with its forward chaining theorem proving interpreter. A fundamental distinction arises with respect to negation. As an example, consider how one deduces that Fido is not a reptile. A network interpreter will determine that the node REPTILE does not lie &quot;above&quot; DOG and will thereby conclude that DOGs are not REPTILEs so that ~REPTILE(fido) is deduced. On the other hand, a theorem prover will try to prove ~REPTILE(fido).</Paragraph>
    <Paragraph position="5"> Given the above first order representation, no such proof exists. The reason is clear - nothing in the representation (2.1) states that the categories MAMMAL and REPTILE are disjoint. For the theorem prover to deal with negative information, the knowledge base (2.1) must be augmented by the following facts stating that the categories of the hierarchy are disjoint:</Paragraph>
    <Paragraph position="7"> It is now clear that a first order theorem proving interpreter can establish ~REPTILE(fido) by a pure forward chaining proof procedure from DOG(fido) using (2.1) and (2.2). However, unlike the earlier proof of ANIMATE(fido), this proof of ~REPTILE(fido) is not isomorphic to that generated by the network interpreter. (Recall that the network interpreter deduces ~REPTILE(fido) by failing to find a pointer chain linking DOG and REPTILE.) Moreover, while the network interpreter must contend only with a representation equivalent to that of (2.1), the theorem prover must additionally utilize the negative information (2.2). Somehow, then, the process structure of the network interpreter implicitly represents the negative knowledge (2.2), while computing only on declarative knowledge equivalent to (2.1).</Paragraph>
    <Paragraph position="8"> We can best distinguish the two approaches by observing that two different logics are involved.</Paragraph>
    <Paragraph position="9"> To see this, consider modifying the theorem prover so as to simulate the network process structure.</Paragraph>
    <Paragraph position="10"> Since the network interpreter tries, and fails, to establish a pointer chain from DOG to REPTILE using a declarative knowledge base equivalent to (2.1), the theorem prover can likewise attempt to prove REPTILE(fido) using only (2.1). As for the network interpreter, this attempt will fail. If we now endow the theorem prover with the additional inference rule: &quot;If you fail to deduce REPTILE(fido) then conclude ~REPTILE(fido)&quot;, the deduction of ~REPTILE(fido) will be isomorphic to that of the network interpreter. More generally, we require an inference schema, applicable to any of the monadic predicates MAMMAL, DOG, CAT, etc. of the hierarchy: &quot;If x is an individual and P(x) cannot be deduced, then infer ~P(x)&quot; or in the notation of the previous section</Paragraph>
    <Paragraph position="12"> What we have argued then is that the process structure of a network interpreter is formally equivalent to that of a first order theorem prover augmented by the ability to use the inference schema (D2). In a sense, a network interpreter is the compiled form of such an augmented theorem prover.</Paragraph>
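    <Paragraph> The equivalence argued above can be illustrated with a small sketch. It is only an assumption-laden toy: the IS-A links are read off the example hierarchy, and the function names provable and provable_negation are introduced here. Positive conclusions come from chasing pointers upward; negative conclusions come from the failure of that search, in the manner of schema (D2), with no disjointness facts stored anywhere.

```python
# IS-A links of the example hierarchy: child class -> parent class.
IS_A = {
    "DOG": "MAMMAL", "CAT": "MAMMAL",
    "MAMMAL": "ANIMATE", "REPTILE": "ANIMATE",
    "ANIMATE": "THING", "INANIMATE": "THING",
}
INSTANCE_OF = {"fido": "DOG"}


def provable(predicate: str, individual: str) -> bool:
    """Positive deduction: follow the pointer chain upward from the individual's class."""
    node = INSTANCE_OF.get(individual)
    while node is not None:
        if node == predicate:
            return True
        node = IS_A.get(node)
    return False


def provable_negation(predicate: str, individual: str) -> bool:
    """Schema (D2) as a procedure: infer ~P(x) when P(x) cannot be deduced."""
    return not provable(predicate, individual)
```

With these definitions, provable("ANIMATE", "fido") succeeds by pointer chasing, while provable_negation("REPTILE", "fido") succeeds only because the search for REPTILE above DOG fails.</Paragraph>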
    <Paragraph position="13"> There are several points worth noting: 1. The schema (D2) is not a first order rule of inference since the operator ⊬ is not a first order notion. (It is a meta notion.) Thus a theorem prover which invokes (D2) in order to establish negative conclusions by failure is not performing first order deductions.</Paragraph>
    <Paragraph position="14"> 2. The schema (D2) has a similar pattern to the default schema (D1).</Paragraph>
    <Paragraph position="15"> 3. In the presence of the default schema (D2), the negative knowledge (2.2), which would be necessary in the absence of (D2), is not required. As we shall see in the next section, this property is a general characteristic of the closed world default, and leads to a significant reduction in the complexity of both the representation and processing of knowledge.</Paragraph>
    <Paragraph position="16">  The schema (D2) is actually a special case of the following more general default schema:</Paragraph>
    <Paragraph position="18"> If (D3) is in force for all predicates P of some domain, then reasoning is being done under the closed world assumption [Reiter 1978]. In most A.I. representation schemes, hierarchies are treated as closed world domains. The use of the closed world assumption in A.I. and in ordinary human reasoning extends beyond such hierarchies, however. As a simple example, consider consulting an airline schedule for a direct Air Canada flight from Vancouver to New York. If none is found, one assumes that no such flight exists. Formally, we can view the schedule as a data base, and the query as an attempt to establish DIRECTLY-CONNECTS(AC,Van,NY). This fails, whence one concludes ~DIRECTLY-CONNECTS(AC,Van,NY) by an application of schema (D3). Such schedules are designed to be used under the closed world assumption. They contain only positive information; negative information is inferred by default. There is one very good reason for making the closed world assumption in this setting: the number of negative facts vastly exceeds the number of positive ones. For example, Air Canada does not directly connect Vancouver and Moscow, or Toronto and Bombay, or Moscow and Bombay, etc. It is totally infeasible to represent all such negative information explicitly in the data base, as would be required under a first order theorem prover. It is important to notice, however, that the closed world assumption presumes perfect knowledge about the domain being modeled. If it were not known, for example, whether Air Canada directly connects Vancouver and Chicago, we would no longer be justified in making the closed world assumption with respect to the flight schedule. For by the absence of this fact from the data base, we would conclude that Air Canada does not directly connect Vancouver and Chicago, violating our assumed state of ignorance about this fact.</Paragraph>
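    <Paragraph> The imbalance between positive and negative facts in the flight schedule can be made concrete with a short sketch. The schedule entries and city list below are invented for illustration, and the function names are assumptions; the sketch stores only positive facts and derives every negative fact by failure, as schema (D3) permits.

```python
# Only positive facts are stored; these entries are invented for illustration.
SCHEDULE = {
    ("AC", "Van", "Toronto"),
    ("AC", "Toronto", "NY"),
}


def directly_connects(airline: str, origin: str, destination: str) -> bool:
    """Succeed iff the positive fact is present in the data base."""
    return (airline, origin, destination) in SCHEDULE


def closed_world_negation(airline: str, origin: str, destination: str) -> bool:
    """Schema (D3): conclude the negation when the positive fact is not derivable."""
    return not directly_connects(airline, origin, destination)


cities = ["Van", "Toronto", "NY", "Moscow", "Bombay"]
# Every ordered pair of distinct cities with no stored flight is a negative fact.
negative_facts = [
    (a, b) for a in cities for b in cities
    if a != b and closed_world_negation("AC", a, b)
]
```

Even in this five-city toy, the two stored facts stand against eighteen inferred negative ones, and the gap widens quadratically as cities are added. Note that the sketch inherits the limitation discussed above: a flight that is merely unknown would be reported as nonexistent.</Paragraph>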
    <Paragraph position="19"> The flight schedule illustrates a very common use of the closed world default rule for purely extensional data bases. In particular, it illustrates how this default factors out the need for any explicit representation of negative facts.</Paragraph>
    <Paragraph position="20"> This result holds for more general data bases. As an example, consider the ubiquitous blocks world, under the following decomposition hierarchy of objects in that world:</Paragraph>
  </Section>
  <Section position="6" start_page="212" end_page="214" type="metho">
    <SectionTitle>
OBJECT
BLOCK TABLE
CUBE PYRAMID
</SectionTitle>
    <Paragraph position="0"> Let SUPPORTS(x,y) denote &quot;x directly supports y&quot; and FREE(x) denote &quot;x is free&quot;, i.e. objects may be placed upon x. Then the following general facts hold:</Paragraph>
    <Paragraph position="2"> Notice that virtually all of the knowledge about the blocks domain is negative, namely the negative specific facts (11), together with the negative facts (1)-(7) (see footnote 1). This is not an accidental feature.</Paragraph>
    <Paragraph position="3"> Most of what we know about any world is negative.</Paragraph>
    <Paragraph position="4"> Now a first order theorem prover must have access to all of the facts (1)-(11). For example, in proving ~SUPPORTS(C3,C2) it must use (4). Consider instead such a theorem prover endowed with the additional ability to interpret the closed world default schema (D3). Then, in attempting a proof of ~SUPPORTS(C3,C2) it tries to show that SUPPORTS(C3,C2) is not provable. Since SUPPORTS(C3,C2) cannot be proved, it concludes ~SUPPORTS(C3,C2), as required.</Paragraph>
    <Paragraph position="5"> It should be clear intuitively that in the presence of the closed world default schema (D3), none of the negative facts (1)-(7), (11) need be represented explicitly nor used in reasoning. This can be proved, under fairly general conditions [Reiter 1978]. One function, then, of the closed world default is to &quot;factor out&quot; of the representation all negative knowledge about the domain. It is of some interest to compare the blocks world representation (1)-(11) with those commonly used in blocks world problem-solvers (e.g. [Winograd 1972, Warren 1974]). These systems do not represent explicitly the negative knowledge (1)-(7), (11) but instead use the closed world default for reasoning about negation. (See Section 3 below for a discussion of negation in A.I. programming languages.) Although the closed world default factors out negative knowledge for answering questions about a domain, this knowledge must nevertheless be available.</Paragraph>
    <Paragraph position="6"> To see why, consider an attempted update of the example blocks world scene with the new &quot;fact&quot; SUPPORTS(C3,C2). To detect the resulting inconsistency requires the negative fact (4). In general then, negative knowledge is necessary for maintaining the integrity of a data base. A consequence of the closed world assumption is a decomposition of knowledge into positive and negative facts. Only positive knowledge is required for querying the data base. Both positive and negative knowledge are required for maintaining the integrity of the data base. (Footnote 1: The notion of a negative fact has a precise definition. A fact is negative iff all of the literals in its clausal form are negative.)</Paragraph>
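    <Paragraph> The split between querying and integrity maintenance can be sketched as follows. This is a toy under stated assumptions: the scene, the object names, and the constraint &quot;a pyramid supports nothing&quot; are hypothetical stand-ins chosen for illustration and are not claimed to be the paper's numbered facts. Queries consult positive facts only; updates must additionally consult negative general knowledge.

```python
# Positive facts only; queries run under the closed world default.
SUPPORTS = {("Table", "C1"), ("C1", "C2")}
PYRAMIDS = {"P1"}   # hypothetical scene object


def query_not_supports(x: str, y: str) -> bool:
    """Querying needs only positive knowledge: ~SUPPORTS(x,y) by failure."""
    return (x, y) not in SUPPORTS


def violates_constraints(x: str, y: str) -> bool:
    """Hypothetical negative general fact: a pyramid supports nothing."""
    return x in PYRAMIDS


def update_supports(x: str, y: str) -> bool:
    """Updating needs negative knowledge: reject facts that break a constraint."""
    if violates_constraints(x, y):
        return False
    SUPPORTS.add((x, y))
    return True
```

The query function never touches the constraint, which is the sense in which the closed world default factors negative knowledge out of question answering. The update function cannot do without it: an attempt to assert that P1 supports something is refused only because the negative fact is still available.</Paragraph>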
    <Section position="1" start_page="212" end_page="214" type="sub_section">
      <SectionTitle>
2.3 DEFAULTS AND THE FRAME PROBLEM
</SectionTitle>
      <Paragraph position="0"> The frame problem [Raphael 1971] arises in the representation of dynamic worlds. Roughly speaking, the problem stems from the need to represent those aspects of the world which remain invariant under certain state changes. For example, moving a particular object or switching on a light will not change the colours of any objects in the world.</Paragraph>
      <Paragraph position="1"> Painting an object will not affect the locations of the objects. In a first order representation of such worlds, it is necessary to represent explicitly all of the invariants under all state changes.</Paragraph>
      <Paragraph position="2"> These are referred to as the frame axioms for the world being modeled. For example, to represent the fact that painting an object does not alter the locations of objects would require, in the situational calculus of [McCarthy and Hayes 1969], a frame axiom something like</Paragraph>
      <Paragraph position="4"> The problem is that in general we will require a vast number of such axioms e.g. object locations also remain invariant when lights are switched on, when it thunders, when someone speaks etc. so there is a major difficulty in even articulating a deductively adequate set of frame axioms for a given world.</Paragraph>
      <Paragraph position="5"> A solution to the frame problem is a representation of the world coupled with appropriate rules of inference such that the frame axioms are neither represented explicitly nor used explicitly in reasoning about the world. We will focus on a proposed solution by [Sandewall 1972]. A related approach is described in [Hayes 1973]. Sandewall proposes a new operator, UNLESS, which takes a formula W as argument. The intended interpretation of UNLESS(W) is &quot;W cannot be proved&quot;, i.e. it is identical to the operator ⊬ of this paper.</Paragraph>
      <Paragraph position="6"> Sandewall proposes a single &quot;frame inference rule&quot; which, in the notation of this paper, can be paraphrased as follows: For all predicates P which take a state variable as an argument</Paragraph>
      <Paragraph position="8"> Intuitively, (D4) formalizes the so-called &quot;STRIPS assumption&quot; [Waldinger 1975]: Every action (state change) is assumed to leave every relation unaffected unless it is possible to deduce otherwise.</Paragraph>
      <Paragraph position="9"> This schema can be used in the following way, say in order to establish that cube33 is at location ℓ after box7 has been painted blue: To establish LOCATION(cube33,ℓ,paint(box7,blue,s)), fail to prove ~LOCATION(cube33,ℓ,paint(box7,blue,s)). There are several observations that can be made: 1. The frame inference schema (D4) has a pattern similar to the default schemata (D2) and (D3) of earlier sections of this paper. It too is a default schema.</Paragraph>
      <Paragraph position="10"> 2. The frame schema (D4) is in some sense a dual of the closed world schema (D3). The former permits the deduction of a positive fact from failure to establish its negation. The latter provides for the deduction of a negative fact from failure to derive its positive counterpart. This duality is preserved with respect to the knowledge &quot;factored out&quot; of the representation. Whereas the frame default eliminates the need for certain kinds of positive knowledge (the frame axioms), the closed world default factors out the explicit representation of negative knowledge.</Paragraph>
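      <Paragraph> The use of the frame default described above can be sketched procedurally. The following is a minimal illustration under our own assumptions: states are modelled as nested records of actions, the only effect knowledge is that painting falsifies an object's old colour, and the names State, provably_false and holds are introduced here. No frame axiom about locations appears anywhere; a fact persists across an action unless its negation can be deduced.

```python
from typing import NamedTuple, Optional


class State(NamedTuple):
    """A situation: the initial one, or the result of an action on a prior state."""
    action: Optional[tuple]        # e.g. ("paint", "box7", "blue"); None initially
    previous: Optional["State"]


INITIAL_FACTS = {("LOCATION", "cube33", "loc1"), ("COLOUR", "box7", "red")}


def provably_false(fact: tuple, state: State) -> bool:
    """Explicit effect knowledge: painting an object falsifies its other colours."""
    if state.action is None:
        return False
    kind, obj, colour = state.action
    return (kind == "paint" and fact[0] == "COLOUR"
            and fact[1] == obj and fact[2] != colour)


def holds(fact: tuple, state: State) -> bool:
    """Decide whether a fact holds in a state, using the frame default."""
    if state.action is None:
        return fact in INITIAL_FACTS
    kind, obj, colour = state.action
    if kind == "paint" and fact == ("COLOUR", obj, colour):
        return True                      # direct effect of the action
    # Frame default: the fact persists unless its negation can be deduced.
    return holds(fact, state.previous) and not provably_false(fact, state)
```

For a state built as State(("paint", "box7", "blue"), State(None, None)), the location of cube33 carries over because nothing allows its negation to be proved, whereas the old colour of box7 does not.</Paragraph>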
    </Section>
    <Section position="2" start_page="214" end_page="214" type="sub_section">
      <SectionTitle>
2.4 DEFAULTS AND EXCEPTIONS
</SectionTitle>
      <Paragraph position="0"> A good deal of what we know about the world is &quot;almost always&quot; true, with a few exceptions. For example, all birds fly except for penguins, ostriches, fledglings, etc. Given a particular bird, we will conclude that it flies unless we happen to know that it satisfies one of these exceptions. Nevertheless, we want it true of birds &quot;in general&quot; that they fly. How can we reconcile these apparently conflicting points of view? The natural first order representation is inconsistent: (x/BIRD)FLY(x) (&quot;In general, birds fly&quot;); (x)PENGUIN(x) ⊃ BIRD(x) and (x/PENGUIN)~FLY(x) (&quot;Penguins are birds which don't fly&quot;). An alternative first order representation explicitly lists the exceptions to flying</Paragraph>
      <Paragraph position="2"> But with this representation, we cannot conclude of a &quot;general&quot; bird that it can fly. To see why, consider an attempt to prove FLY(tweety) where all we know of tweety is that she is a bird. Then we must establish the subgoal ~PENGUIN(tweety) ∧ ~OSTRICH(tweety) ∧ ...</Paragraph>
      <Paragraph position="3"> which is impossible given that we have no further information about tweety. We are blocked from concluding that tweety can fly even though, intuitively, we want to deduce just that. In effect, we need a default rule of the form</Paragraph>
      <Paragraph position="5"> With this rule of inference we can deduce FLY(tweety), as required. Notice, however, that whenever there are exceptions to a &quot;general&quot; fact in some domain of knowledge we are no longer free to arbitrarily structure that knowledge. For example, the following hierarchy would be unacceptable, where the dotted link indicates the existence of an exception</Paragraph>
    </Section>
  </Section>
  <Section position="7" start_page="214" end_page="215" type="metho">
    <SectionTitle>
ANIMAL
FLY CRAWL
BAT BIRD
PENGUIN ROBIN
</SectionTitle>
    <Paragraph position="0"> Clearly there is no way in this hierarchy of establishing that penguins are animals. For hierarchies the constraint imposed by exceptions is easily articulated: If P and Q are nodes with P below Q, and if (x)P(x) ⊃ Q(x) is true without exception, then there must be a sequence of solid links connecting P and Q. For more general kinds of knowledge the situation is more problematic. One must be careful to ensure that chains of implications do not unwittingly inherit unintended exceptions.</Paragraph>
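    <Paragraph> The treatment of exceptions as defaults, as in the bird example of Section 2.4, can be sketched in a few lines. The individuals, the exception list, and the function names below are assumptions made for illustration, and the sketch tests for failure to prove any listed exception, which is one plausible reading of the default rule rather than a transcription of it. What it shows is that tweety is concluded to fly without any proof that she is not a penguin or an ostrich.

```python
# What is known about each individual (invented for illustration).
KNOWN = {
    "tweety": {"BIRD"},
    "opus": {"BIRD", "PENGUIN"},
}
FLY_EXCEPTIONS = ("PENGUIN", "OSTRICH", "FLEDGLING")


def provable(predicate: str, individual: str) -> bool:
    """A fact is provable here iff it is recorded for the individual."""
    return predicate in KNOWN.get(individual, set())


def infer_fly(individual: str) -> bool:
    """Default: a bird flies unless some exception is provable of it.

    A False result means only that FLY was not inferred.
    """
    if not provable("BIRD", individual):
        return False
    return not any(provable(exc, individual) for exc in FLY_EXCEPTIONS)
```

A first order prover would instead have to establish the negation of every exception for tweety, which is impossible when nothing further is known about her.</Paragraph>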
  </Section>
  <Section position="8" start_page="215" end_page="215" type="metho">
    <SectionTitle>
3. DEFAULTS AND &quot;NEGATION&quot; IN A.I.
PROGRAMMING LANGUAGES
</SectionTitle>
    <Paragraph position="0"> It has been observed by several authors [Hayes 1973, Sandewall 1972, Reiter 1978] that the basic default operator ⊬ has, as its &quot;procedural equivalent&quot;, the negation operator in a number of A.I. programming languages, e.g. THNOT in MICROPLANNER [Hewitt 1972, Sussman et al. 1970], NOT in PROLOG [Roussel 1975].</Paragraph>
    <Paragraph position="1"> For example, in MICROPLANNER, the command (THGOAL &lt;pattern&gt;) can be viewed as an attempt to prove &lt;pattern&gt; given a data base of facts and theorems. (THNOT(THGOAL &lt;pattern&gt;)) then succeeds iff (THGOAL &lt;pattern&gt;) fails, i.e. iff &lt;pattern&gt; is not provable, and this of course is precisely the interpretation of the default operator ⊬.</Paragraph>
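    <Paragraph> The reading of THNOT as failure to prove can be mimicked in a few lines. The functions thgoal and thnot below are Python stand-ins named after the MICROPLANNER commands; they are assumptions for illustration and do not reproduce MICROPLANNER's pattern matching or theorem invocation. A goal is modelled as a function that either succeeds or fails, and thnot simply inverts the outcome of the attempted proof.

```python
from typing import Callable, Iterable

Goal = Callable[[], bool]


def thgoal(pattern: tuple, database: Iterable[tuple]) -> Goal:
    """Return a goal that succeeds iff `pattern` is found in the data base."""
    facts = set(database)
    return lambda: pattern in facts


def thnot(goal: Goal) -> Goal:
    """Succeed iff the wrapped goal fails: failure to prove, not logical negation."""
    return lambda: not goal()
```

For a data base containing only ("DOG", "fido"), the goal thnot(thgoal(("REPTILE", "fido"), db)) succeeds although the data base says nothing negative about fido; it succeeds because the attempted proof fails.</Paragraph>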
    <Paragraph position="2"> Given that &amp;quot;negation&amp;quot; in A.I. procedural languages corresponds to the default operator and not to logical negation, it would seem that some of the criticism often directed at theorem proving from within the A.I. community is misdirected. For the so-called procedural approach, often proposed as an alternative to theorem proving as a representation and reasoning component in A.I. systems, is a realization of a default logic, whereas theorem provers are usually realizations of a first order logic, and as we have seen, these are different logics.</Paragraph>
    <Paragraph position="3"> In a sense, the so-called procedural vs.</Paragraph>
    <Paragraph position="4"> declarative issue in A.I. might better be phrased as the default vs. first order logic issue. Many of the advantages of the procedural approach can be interpreted as representational and computational advantages of the default operator. There is a fair amount of empirical evidence in support of this point of view, primarily based upon the successful use of PROLOG [Roussel 1975] - a pure theorem prover augmented with a &quot;THNOT&quot; operator - for such diverse A.I. tasks as problem solving [Warren 1974], symbolic mathematics [Kanoui 1976], and natural language question-answering [Colmerauer 1973].</Paragraph>
    <Paragraph position="5"> On the theoretical level, we are just beginning to understand the advantages of a first order logic augmented with the default operator: 1. Default logic provides a representation language which more faithfully reflects a good deal of common sense knowledge than do traditional logics.</Paragraph>
    <Paragraph position="6"> Similarly, for many situations, default reasoning corresponds to what is usually viewed as common sense reasoning.</Paragraph>
    <Paragraph position="7"> 2. For many settings, the appropriate default theories lead to a significant reduction in both representational and computational complexity with respect to the corresponding first order theory.</Paragraph>
    <Paragraph position="8"> Thus, under the closed world default, negative knowledge about a domain need not explicitly be represented nor reasoned with in querying a data base. Similarly under the frame default, the usual frame axioms are not required.</Paragraph>
    <Paragraph position="9"> There are, of course, other advantages of the procedural approach - specifically, explicit control over reasoning - which are not accounted for by the above logical analysis. We have distinguished the purely logical structure of such representational languages from their process structure, and have argued that at least some of their success derives from the nature of the logic which they realize.</Paragraph>
  </Section>
  <Section position="9" start_page="215" end_page="216" type="metho">
    <SectionTitle>
4. SOME PROBLEMS WITH DEFAULT THEORIES
</SectionTitle>
    <Paragraph position="0"> Given that default reasoning has such widespread applications in A.I., it is natural to define a default theory as a first order theory augmented by one or more inference schemata like (D1), (D2) etc. and to investigate the properties of such theories. Unfortunately, some such theories display peculiar and intuitively unacceptable behaviours.</Paragraph>
    <Paragraph position="1"> One difficulty is the ease with which inconsistent theories can be defined, for example the default rule ⊬A / B (if A is not provable, infer B) coupled with a knowledge base with the single fact ~B. Another, pointed out by [Sandewall 1972], is that the theorems of certain default theories will depend upon the order in which they are derived. As an example, consider the theory with the two default rules ⊬A / B and ⊬B / A. Since A is not provable, we can infer B. Since B is now proved, we cannot infer A, so this theory has the single theorem B. If instead, we had started by observing that B is not provable, then the theory would have the single theorem A. Default theories exhibiting such behaviour are clearly unacceptable. At the very least, we must demand of a default theory that it satisfy a kind of Church-Rosser property: No matter what the order in which the theorems of the theory are derived, the resulting set of theorems will be unique.</Paragraph>
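    <Paragraph> The order dependence can be checked mechanically. The sketch below assumes that the theory in question consists of just two defaults over an empty knowledge base, &quot;if A is not provable, infer B&quot; and &quot;if B is not provable, infer A&quot;; the encoding of a default as a (blocker, conclusion) pair and the function name theorems are ours. It applies the defaults in every possible order and collects the resulting theorem sets.

```python
from itertools import permutations

# Each default (blocker, conclusion) reads:
# if `blocker` is not provable, infer `conclusion`.
DEFAULTS = [("A", "B"), ("B", "A")]


def theorems(order) -> frozenset:
    """Apply each default once, in the given order, to an empty knowledge base."""
    proved = set()
    for blocker, conclusion in order:
        if blocker not in proved:
            proved.add(conclusion)
    return frozenset(proved)


# One theorem set per order of application.
outcomes = {theorems(order) for order in permutations(DEFAULTS)}
```

The two orders yield two different theorem sets, one containing only B and the other only A, so this theory fails the Church-Rosser style requirement stated above.</Paragraph>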
    <Paragraph position="2"> Another difficulty arises in modeling dynamically changing worlds e.g. in causal worlds or in text understanding where the model of the text being built up changes as more of the text is assimilated. Under these circumstances, inferences which have been made as a result of a default assumption may subsequently be falsified by new information which now violates that default assumption. As a simple example, consider a travel consultant which has made the default assumption that the traveller's starting point is Palo Alto and has, on the basis of this, planned all of the details of a trip. If the consultant subsequently learns that the starting point is Los Angeles, it must undo at least part of the planned trip, specifically the first (and possibly last) leg of the plan. But how is the consultant to know to focus just on these changes? Somehow, whenever a new fact is deduced and stored in the data base, all of the facts which rely upon a default assumption and which supported this deduction must be associated with this new fact. These supporting facts must themselves have their default supports associated with them, and so on. Now, should the data base be updated with new information which renders an instance of some default rule inapplicable, delete all facts which had been previously deduced whose support sets relied upon this instance of the default rule.</Paragraph>
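    <Paragraph> The bookkeeping just described can be sketched as a small data structure. This is one possible realization under our own assumptions, not a specification: the class DefaultDatabase, its method names, and the trip facts in the usage note are hypothetical. Each stored fact carries the set of default assumptions on which it ultimately rests, inherited from its premises, so that retracting a default removes exactly the facts that depended on it.

```python
class DefaultDatabase:
    """Facts plus, for each fact, the default assumptions it depends on."""

    def __init__(self) -> None:
        self.support = {}          # fact -> frozenset of default assumptions

    def add(self, fact: str, premises=(), defaults=()) -> None:
        """Store a fact; it inherits the default supports of its premises."""
        inherited = set(defaults)
        for premise in premises:
            inherited |= self.support[premise]
        self.support[fact] = frozenset(inherited)

    def retract_default(self, assumption: str) -> list:
        """Delete every fact whose support set relies on the given assumption."""
        doomed = sorted(f for f, s in self.support.items() if assumption in s)
        for fact in doomed:
            del self.support[fact]
        return doomed
```

In the travel example, the fact that the trip starts in Palo Alto would be added with a default support, the first leg would be added with that fact as a premise, and legs planned independently of the starting point would carry no default support. Retracting the starting-point default then deletes the first leg and leaves the rest of the plan in place.</Paragraph>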
    <Paragraph position="3"> There are obviously some technical and implementation details that require articulation, but the basic idea should be clear. A related proposal for dealing with beliefs and real world observations is described in [Hayes 1973].</Paragraph>
    <Paragraph position="4"> One way of viewing the role of a default theory is as a way of implicitly further completing an underlying incomplete first order theory. Recall that a first order theory is said to be complete iff for all closed formulae W, either W or ~W is provable. Most interesting mathematical theories turn out to be incomplete - a celebrated result due to Gödel. Most of what we know about the world, when formalized, will yield an incomplete theory precisely because we cannot know everything - there are gaps in our knowledge. The effect of a default rule is to implicitly fill in some of those gaps by a form of plausible reasoning. In particular, the effect of the closed world default is to fully complete an underlying incomplete first order theory. However, it is well known that there are insurmountable problems associated with completing an incomplete theory like arithmetic. Although it is a trivial matter conceptually to augment the axioms of arithmetic with a default rule ⊬W / ~W, where W is any closed formula, we will be no further ahead because the non-theorems of arithmetic are not recursively enumerable. What this means is that there is no way in general that, given a W, we can establish that W is not a theorem even if W happens not to be a theorem. This in turn means that we are not even guaranteed that an arbitrary default rule of inference is effective, i.e. there may be no algorithm which will inform us whether or not a given default rule of inference is applicable. From this we can conclude that the theorems of a default theory may not be recursively enumerable.</Paragraph>
    <Paragraph position="5"> This situation is in marked contrast to what normally passes for a logic where, at the very least, the rules of inference must be effective and the theorems recursively enumerable.</Paragraph>
    <Paragraph position="6"> Finally, it is not hard to see that default theories fail to satisfy the extension property [Hayes 1973] which all &quot;respectable&quot; logics do satisfy. (A logical calculus has the extension property iff whenever a formula is provable from a set P of premises, it is also provable from any set P' such that P ⊆ P'.) [Kramosil 1975] attempts to establish some general results on default theories. Kramosil &quot;proves&quot; that for any such theory, the default rules are irrelevant in the sense that either the theory will be meaningless or the theorems of the theory will be precisely the same as those obtainable by ignoring the default rules of inference. Kramosil's result, if correct, would invalidate the main point of this paper, namely that default theories play a prominent role in reasoning about the world. Fortunately, his &quot;proof&quot; relies on an incorrect definition of theoremhood so that the problem of characterizing the theorems of a default theory remains open.</Paragraph>
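    <Paragraph> The failure of the extension property noted at the start of the preceding paragraph can be seen in a two-atom example. The sketch assumes a propositional toy in which premises are atoms and the only default is the closed world rule; the function name theorems and the atoms P and Q are ours. It shows a formula that is a theorem of a premise set but not of a larger one.

```python
def theorems(premises: frozenset, atoms=("P", "Q")) -> frozenset:
    """Premises plus, by the closed world default, ~X for each atom X not among them."""
    return premises | frozenset("~" + a for a in atoms if a not in premises)


small = frozenset({"P"})
large = frozenset({"P", "Q"})        # small is a subset of large
```

From the premises in small the closed world default yields ~Q, yet ~Q is not among the theorems of large, although small is contained in large. Adding a premise has removed a theorem, which cannot happen in a calculus with the extension property.</Paragraph>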
  </Section>
</Paper>