<?xml version="1.0" standalone="yes"?>
<Paper uid="P88-1011">
  <Title>A Logic for Semantic Interpretation</Title>
  <Section position="1" start_page="0" end_page="90" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> We propose that logic (enhanced to encode probability information) is a good way of characterizing semantic interpretation. In support of this we give a fragment of an axiomatization for word-sense disambiguation, noun-phrase (and verb) reference, and case disambiguation.</Paragraph>
    <Paragraph position="1"> We describe an inference engine (Frail3) which actually takes this axiomatization and uses it to drive the semantic interpretation process. We claim three benefits from this scheme. First, the interface between semantic interpretation and pragmatics has always been problematic, since all of the above tasks in general require pragmatic inference. Now the interface is trivial, since both semantic interpretation and pragmatics use the same vocabulary and inference engine. The second benefit, related to the first, is that semantic guidance of syntax is a side effect of the interpretation. The third benefit is the elegance of the semantic interpretation theory. A few simple rules capture a remarkable diversity of semantic phenomena.</Paragraph>
    <Paragraph position="2"> I. Introduction  The use of logic to codify natural language syntax is well known, and many current systems can parse directly off their axiomatizations (e.g., [1]). Many of these systems simultaneously construct an intermediate &quot;logical form&quot; using the same machinery. At the other end of language processing, logic is a well-known tool for expressing the pragmatic information needed for plan recognition and speech act recognition [2-4]. In between these extremes logic appears much less. There has been some movement in the direction of placing semantic interpretation on a more logical footing [5,6], but it is nothing like what has happened at the extremes of the language understanding process.</Paragraph>
    <Paragraph position="3"> To some degree this is understandable. These &quot;middle&quot; parts, such as word-sense disambiguation, noun-phrase reference, case disambiguation, etc., are notoriously difficult and poorly understood, at least compared to things like syntax and the construction of intermediate logical form. Much of the reason these areas are so dark is that they are intimately bound up with pragmatic reasoning. The correct sense of a word depends on context, as does pronoun resolution, etc.</Paragraph>
    <Paragraph position="4"> This work has been supported in part by the National Science Foundation under grants IST 8416034 and IST 8515005 and the Office of Naval Research under grant N00014-79-C-0529.</Paragraph>
    <Paragraph position="5"> Here we rectify this situation by presenting an axiomatization of a fragment of semantic interpretation, notably including many aspects previously excluded: word-sense disambiguation, noun-phrase reference determination, case determination, and syntactic disambiguation. Furthermore, we describe an inference engine, Frail3, which can use the logical formulation to carry out semantic interpretation. The description of Frail3 is brief, since the present paper is primarily concerned with semantic interpretation. For a more detailed description, see [7].</Paragraph>
    <Paragraph position="6"> The work closest to what we present is that by Hobbs [5]; however, he handles only noun-phrase reference from the above list, and he does not consider intersentential influences at all.</Paragraph>
    <Paragraph position="7"> Our system, Wimp2 (which uses Frail3), is quite pretty in two respects. First, it integrates semantic and pragmatic processing into a uniform whole, all done in the logic. Secondly, it provides an elegant and concise way to specify exactly what has to be done by a semantic interpreter. As we shall see, a system that is roughly comparable to other state-of-the-art semantic interpretation systems [6,8] can be written down in a page or so of logical rules.</Paragraph>
    <Paragraph position="8"> Wimp2 has been implemented and works on all of the examples in this paper.</Paragraph>
    <Paragraph position="9"> II. Vocabularies  Let us start by giving an informal semantics for the special predicates and terms used by the system. Since we are doing semantic interpretation, we are translating between a syntactic tree on one hand and the logical, or internal, representation on the other. Thus we distinguish three vocabularies: one for trees, one for the internal representation, and one to aid in the translation between the two.</Paragraph>
    <Paragraph position="10"> The vocabulary for syntactic trees assumes that each word in the sentence is represented as a word instance, which is written as the word with a numerical postfix (e.g., boy22). A word instance is associated with the actual lexical entry by the predicate word-inst: (word-inst word-instance part-of-speech lexical-item).</Paragraph>
    <Paragraph position="11"> For example, (word-inst case26 noun case). (We use &quot;part of speech&quot; to denote those syntactic categories that are directly above the terminal symbols in the grammars, that is, directly above words.) The relations between word instances are encoded with two predicates: syn-pos and syn-pp. Syn-pos, (syn-pos relation head sub-constituent), indicates that the sub-constituent is the relation of the head. We distinguish between positional relations and those indicated by prepositional phrases, which use the predicate syn-pp but otherwise look the same. The propositions denoting syntactic relations are generated during the parse. The parser follows all possible parses in a breadth-first search and outputs propositions on a word-by-word basis. If there is more than one parse and they disagree on the propositional output, a disjunction of the outputs is asserted into the database. The correspondence between trees and formulas is as follows:  This is enough to express a wide variety of simple declarative sentences. Furthermore, since our current parser implements a transformational account of imperatives, questions (both yes-no and wh), complement constructions, and subordinate clauses, these are automatically handled by the above as well. For example, given an account of &quot;Jack wants to borrow the book.&quot; as derived from &quot;Jack wants (np that (s Jack borrow the book)).&quot; or something similar, then the above rules would produce the following for both (we also indicate after what word  This is, of course, a fragment, and most things are not handled by this analysis: negation, noun-noun combinations, particles, auxiliary verbs, etc.</Paragraph>
    <Paragraph position="12"> Now let us consider the internal representation used for inference about the world. Here we use a simple predicate-calculus version of frames and slots. We assume only two predicates for this: == and inst. Inst, (inst instance frame), is a two-place predicate on an instance of a frame and the frame itself, where a &quot;frame&quot; is a set of objects, all of which are of the same natural kind. Thus (inst boy1 boy-) asserts that boy1 is a member of the set of boys, denoted by boy-. (Frames are symbols containing hyphens, e.g., supermarket-shopping. Where a single English word is sufficiently descriptive, the hyphen is put at the end.) The other predicate used to describe the world is the &quot;better name&quot; relation ==: (== worse-name better-name).</Paragraph>
    <Paragraph position="13"> This is a restricted use of equality. The second argument is a &quot;better name&quot; for the first, and thus may be freely substituted for it (but not the reverse). Since slots are represented as functions, == is used to fill slots in frames. To fill the agent slot of a particular action, say borrow1, with a particular person, say jack1, we say (== (agent borrow1) jack1).</Paragraph>
    <Paragraph position="14"> At an implementation level, == causes everything known about its first argument (the worse name) to be asserted about the second (the better name).</Paragraph>
    <Paragraph position="15"> This has the effect of concentrating all knowledge about all of an object's names as facts about the best name.</Paragraph>
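    <Paragraph> The better-name bookkeeping described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not Frail's actual implementation: the class NameTable, its methods, and the sample facts are hypothetical names introduced here, and slot terms such as (agent borrow1) are modelled as tuples.

```python
from collections import defaultdict


class NameTable:
    """Toy fact store (hypothetical) that keeps facts under each object's best name."""

    def __init__(self):
        self.better = {}               # worse name -> better name
        self.facts = defaultdict(set)  # best name -> facts known about it

    def best(self, name):
        # Follow better-name links until no better name is known.
        while name in self.better:
            name = self.better[name]
        return name

    def assert_fact(self, name, fact):
        # Facts are always stored under the best name known so far.
        self.facts[self.best(name)].add(fact)

    def assert_better_name(self, worse, better):
        # (== worse better): everything known about worse is asserted about better.
        worse, better = self.best(worse), self.best(better)
        if worse == better:
            return
        self.better[worse] = better
        self.facts[better] |= self.facts.pop(worse, set())


table = NameTable()
table.assert_fact(('agent', 'borrow1'), ('inst', 'person-'))
table.assert_fact('jack1', ('name', 'Jack'))
# Fill the agent slot of borrow1 with jack1: (== (agent borrow1) jack1)
table.assert_better_name(('agent', 'borrow1'), 'jack1')
print(sorted(table.facts['jack1']))
```

    After the last assertion, both facts are stored under jack1, and later facts asserted about (agent borrow1) are redirected to jack1 as well. The substitution is one-directional, matching the restricted use of equality described in the text.</Paragraph>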
    <Paragraph position="16"> Frail will take as input a simple frame representation and translate it into predicate-calculus form. Figure 1 shows a frame for shopping along with the predicate-calculus translation.</Paragraph>
    <Paragraph position="17"> Naturally, a realistic world model requires more than these two predicates plus slot functions, but the relative success of fairly simple frame models of reasoning indicates that they are a good starting set. The last set of predicates (word-sense, case, and role-inst) are used in the translation itself. They will be defined later.</Paragraph>
    <Paragraph position="18">  We can now write down some semantic interpretation rules. Let us assume that all words in English have one or more word senses as their meaning, that these word senses correspond to frames, and that any particular word instance has as its meaning exactly one of these senses. We can express this fact for the instances of any particular lexical entry as follows: (word-inst inst part-of-speech word) ⇒ (inst inst sense_1) ∨ ... ∨ (inst inst sense_n), where sense_1 through sense_n are the senses of word when it is used as a part-of-speech (i.e., as a noun, verb, etc.). Not all words in English have meanings in this sense.</Paragraph>
    <Paragraph position="19"> &quot;The&quot; is an obvious example. Rather than complicate the above rules, we assign such words a &quot;null&quot; meaning, which we represent by the term garbage*. Nothing is known about garbage*, so this has no consequences.</Paragraph>
    <Paragraph position="20"> A better axiomatization would also include words which seem to correspond to functions (e.g., age), but we ignore such complications.</Paragraph>
    <Paragraph position="21"> A minor problem with the above rule is that it requires us to be able to say at the outset (i.e., when we load the program) what all the word senses are, and new senses cannot be added in a modular fashion. To fix this we introduce a new predicate, word-sense: (word-sense lex-item part-of-speech frame) (word-sense straw noun drink-straw) (word-sense straw noun animal-straw).</Paragraph>
    <Paragraph position="22"> This states that lex-item when used as a part-of-speech can mean frame.</Paragraph>
    <Paragraph position="23"> We also introduce a pragmatically different form of disjunction, →OR: (→OR formula1 formula2).</Paragraph>
    <Paragraph position="24"> In terms of implementation, think of this as inferring formula1 in all possible ways and then asserting the disjunction of the formula2s with each set of bindings. So if there are two sets of bindings, the result will be to assert (OR formula2/bindings1 formula2/bindings2).</Paragraph>
    <Paragraph position="25"> Logically, the meaning of →OR is that if x1 ... xn are unbound variables in formula1, then there must exist x1 ... xn that make formula1 and formula2 true.</Paragraph>
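    <Paragraph> The implementation reading of this special disjunction can be illustrated with a small sketch. It is written in Python rather than in Frail, and everything in it (the function names match, substitute and arrow_or, the tuple encoding of formulas, and the instance name straw3) is an assumption made for illustration. Only flat formulas over a list of ground facts are handled.

```python
def match(pattern, fact, bindings):
    """Match a flat pattern against a ground fact; return extended bindings or None."""
    if len(pattern) != len(fact):
        return None
    out = dict(bindings)
    for p, f in zip(pattern, fact):
        if isinstance(p, str) and p.startswith('?'):
            # A variable: bind it, or check it against its existing binding.
            if out.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return out


def substitute(pattern, bindings):
    """Replace every bound variable in a flat pattern by its value."""
    return tuple(bindings.get(term, term) for term in pattern)


def arrow_or(formula1, formula2, database, bindings):
    """Prove formula1 in all possible ways and collect formula2 under each set of bindings."""
    goal = substitute(formula1, bindings)
    disjuncts = []
    for fact in database:
        found = match(goal, fact, bindings)
        if found is not None:
            disjuncts.append(substitute(formula2, found))
    return ('OR',) + tuple(disjuncts)


database = [
    ('word-sense', 'straw', 'noun', 'drink-straw'),
    ('word-sense', 'straw', 'noun', 'animal-straw'),
    ('word-sense', 'go', 'verb', 'go-'),
]
# Bindings as they would come from a fact (word-inst straw3 noun straw).
bindings = {'?instance': 'straw3', '?part-of-speech': 'noun', '?lex-item': 'straw'}
result = arrow_or(
    ('word-sense', '?lex-item', '?part-of-speech', '?frame'),
    ('inst', '?instance', '?frame'),
    database,
    bindings,
)
print(result)
# ('OR', ('inst', 'straw3', 'drink-straw'), ('inst', 'straw3', 'animal-straw'))
```

    If formula1 cannot be proved at all, the sketch returns a disjunction with no disjuncts, which is false. New senses can be added simply by appending word-sense facts to the database.</Paragraph>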
    <Paragraph position="26"> We can now express our rule of word-sense ambiguity as: (word-inst ?instance ?part-of-speech ?lex-item) ⇒ (→OR (word-sense ?lex-item ?part-of-speech ?frame) (inst ?instance ?frame))  IV. The Inference Engine  While it seems clear that the above rule expresses a rather simple-minded idea of how words relate to their meanings, its computational import may not be so clear. Thus we now discuss Wimp2, our language comprehension program, and its inference engine, Frail3.</Paragraph>
    <Paragraph position="27"> Like most rule-based systems, Frail distinguishes forward-chaining and backward-chaining uses of modus ponens. All of our semantic interpretation rules are forward-chaining rules:</Paragraph>
    <Paragraph position="28"> (→ (word-inst ?instance ?part-of-speech ?lex-item) (→OR (word-sense ?lex-item ?part-of-speech ?frame) (inst ?instance ?frame))) Thus, whenever a new word instance is asserted, we forward-chain to a statement that the word denotes an instance of one of a set of frames.</Paragraph>
    <Paragraph position="29"> Next, Frail uses an ATMS [9,10] to keep track of disjunctions. That is, when we assert (OR formula_1 ... formula_n) we create n assumptions (following de Kleer, these are simply integers) and assert each formula into the database, each with a label indicating that the formula is not unconditionally true but true only given some assumptions. Here is an example of how some simple disjunctions come  has the label ((1 3)), which means that it is true if we grant assumptions 1 and 3. If an assumption (or more generally, a set of assumptions) leads to a contradiction, the assumption is declared a &quot;nogood&quot; and formulas which depend on it are no longer believed. Thus if we learn (not D) then (1 3) is a nogood. This also has the consequence that E now has the label (1). It is as if different sets of assumptions correspond to different worlds. Semantic interpretation then is finding the &quot;best&quot; of the worlds defined by the linguistic possibilities.</Paragraph>
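    <Paragraph> The label and nogood bookkeeping can be sketched in a few lines. The following Python is a simplified illustration only: the class Labels and its methods are hypothetical, label minimization and probabilities are left out, and the formulas A, B, C, X, D and E are made up for the example rather than taken from the paper's own example.

```python
class Labels:
    """Toy ATMS-style bookkeeping (hypothetical): a label is a list of assumption sets."""

    def __init__(self):
        self.labels = {}          # formula -> list of frozensets of assumptions
        self.nogoods = []         # assumption sets known to be contradictory
        self.next_assumption = 1  # assumptions are simply integers

    def assert_or(self, *formulas):
        # One fresh assumption per disjunct.
        for formula in formulas:
            env = frozenset([self.next_assumption])
            self.labels.setdefault(formula, []).append(env)
            self.next_assumption += 1

    def is_nogood(self, env):
        return any(bad.issubset(env) for bad in self.nogoods)

    def derive(self, conclusion, *premises):
        # The conclusion holds under each union of one environment per premise.
        envs = [frozenset()]
        for premise in premises:
            envs = [e | p for e in envs for p in self.labels.get(premise, [])]
        label = self.labels.setdefault(conclusion, [])
        for env in envs:
            if not self.is_nogood(env) and env not in label:
                label.append(env)

    def add_nogood(self, env):
        # Drop every environment that contains a contradictory assumption set.
        self.nogoods.append(frozenset(env))
        for formula in self.labels:
            self.labels[formula] = [e for e in self.labels[formula] if not self.is_nogood(e)]


atms = Labels()
atms.assert_or('A', 'B')     # A gets assumption 1, B gets assumption 2
atms.assert_or('C', 'X')     # C gets assumption 3, X gets assumption 4
atms.derive('D', 'A', 'C')   # D is true if we grant assumptions 1 and 3
atms.derive('E', 'D')        # E follows from D ...
atms.derive('E', 'B')        # ... and also from B
atms.add_nogood({1, 3})      # e.g. after learning (not D)
print(atms.labels['D'], atms.labels['E'])
```

    In this made-up run D loses its only environment and is no longer believed, while E survives because it is still supported by assumption 2.</Paragraph>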
    <Paragraph position="31"> We said &quot;best&quot; in the last sentence deliberately.</Paragraph>
    <Paragraph position="32"> When alternatives can be ruled out on logical grounds the corresponding assumptions become nogoods, and conclusions from them go away. But it is rare that all of the candidate interpretations (of words, of referents, etc.) reduce to only one that is logically possible. Rather, there are usually several which are logically consistent, but some are more &quot;probable&quot; than others. For this reason, Frail associates probabilities with sets of assumptions (&quot;alternative worlds&quot;) and Wimp eventually &quot;garbage collects&quot; statements which remain low-probability alternatives because their assumptions are unlikely. Probabilities also guide which interpretation to explore. Exactly how this works is described in [7]. Here we will simply note that the probabilities are designed to capture the following intuitions:</Paragraph>
    <Paragraph position="33"> 1. Uncommon vs. common word-senses (marked vs. unmarked) are indicated by probabilities input by the system designer and stored in the lexicon.</Paragraph>
    <Paragraph position="34"> 2. Wimp prefers to find referents for entities (rather than not finding referents).</Paragraph>
    <Paragraph position="35"> 3. Possible reasons for actions and entities are preferred the more specific they are to the action or entity. (E.g., &quot;shopping&quot; is given a higher probability than &quot;meeting someone&quot; as an explanation for going to the supermarket.)</Paragraph>
    <Paragraph position="36"> 4. Formulas derived in two different ways are more probable than they would have been if derived in either way alone.</Paragraph>
    <Paragraph position="37"> 5. Disjunctions which lead to already considered &quot;worlds&quot; are preferred over those which do not hook up in this way. (We will illustrate this later.)  V. Case Disambiguation  Cases are indicated by positional relations (e.g., subject) and prepositional phrases. We make the simplifying assumption that prepositional phrases only indicate case relations. As we did for word-sense disambiguation, we introduce a new predicate that allows us to incrementally specify how a particular head (a noun or verb) relates to its syntactic roles. The new predicate, (case head syntactic-relation slot), states that head can have its slot filled by things which stand in syntactic-relation to it. For example, (inst ?g go-) ⇒ (case ?g subject agent).</Paragraph>
    <Paragraph position="38"> This can also be expressed in Frail using typed variables: (case ?g.go- subject agent).</Paragraph>
    <Paragraph position="39"> This says that any instance of a go- can use the subject position to indicate the agent of the go- event. These facts can be inherited in the typical way via the isa hierarchy, so this fact would more generally be expressed as (case ?a.action- subject agent). Using case and the previously introduced →OR connective, we can express the rule of case relations. Formally, it says that for all syntactic positional relations and all meanings of the head, there must exist a case relation which is the significance of that syntactic position: (syn-pos ?rel ?head ?val) ∧ (inst ?head ?frame) ⇒ (→OR (case ?head ?rel ?slot) (== (?slot ?head) ?val)) So, we might have (syn-pos subject go1 jack1) ∧ (inst go1 go-) ∧ (case go1 subject agent) ⇒ (== (agent go1) jack1).</Paragraph>
    <Paragraph position="40"> A similar rule holds for case relations indicated by prepositional phrases.</Paragraph>
    <Paragraph position="41"> (syn-pp head-prep ?head ?pinst) ∧ (syn-pp prep-np ?pinst ?np) ∧ (word-inst ?pinst prep ?prep) ∧ (inst ?head ?frame) ⇒ (→OR (case ?head ?prep ?slot) (== (?slot ?head) ?np)) For example, &quot;Jack went to the supermarket.&quot; would give us (syn-pp head-prep go1 to1) ∧ (case go1 to destination) ∧ (syn-pp prep-np to1 supermarket1) ∧ (word-inst to1 prep to) ∧ (inst go1 go-) ⇒ (== (destination go1) supermarket1).</Paragraph>
    <Paragraph position="42"> We now have enough machinery to describe two ways in which word senses and case relations can help disambiguate each other. First consider the sentence Jack went to the supermarket.</Paragraph>
    <Paragraph position="43"> Wimp currently knows two meanings of &quot;go,&quot; to travel and to die. After &quot;Jack went&quot; Wimp prefers travel (based upon probability rule 1 and the probabilities assigned to these two readings in the lexicon) but both are possible. After &quot;Jack went to&quot; the die reading goes away. This is because the only formulas satisfying (case go1 to ?slot) all require go1 to be a travel rather than a die. Thus &quot;die&quot; cannot be a reading since it makes (→OR (case ?head ?prep ?slot) (== (?slot ?head) ?val)) false (a disjunction of zero disjuncts is false).</Paragraph>
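    <Paragraph> The way the die reading disappears can be sketched as follows. This is a hedged Python illustration, not Wimp's code: the table CASE_FACTS, the frame name die-, the patient slot, and the helper functions are all assumptions introduced for the example, and the frame go- stands for the travel sense.

```python
# Hypothetical case facts: frame, then syntactic relation or preposition, then slots.
CASE_FACTS = {
    'go-': {'subject': ['agent'], 'to': ['destination']},
    'die-': {'subject': ['patient']},
}


def case_disjuncts(head, frame, relation, value):
    """Disjuncts produced by the case rule for one syntactic relation of the head."""
    slots = CASE_FACTS.get(frame, {}).get(relation, [])
    return [('==', (slot, head), value) for slot in slots]


def surviving_senses(head, senses, relations):
    """Keep the senses under which every observed relation has some case reading."""
    survivors = []
    for frame in senses:
        # An empty list of disjuncts is a false disjunction, so the sense is ruled out.
        if all(case_disjuncts(head, frame, rel, val) for rel, val in relations):
            survivors.append(frame)
    return survivors


senses = ['go-', 'die-']
after_jack_went = surviving_senses('go1', senses, [('subject', 'jack1')])
after_jack_went_to = surviving_senses(
    'go1', senses, [('subject', 'jack1'), ('to', 'supermarket1')]
)
print(after_jack_went)
print(after_jack_went_to)
```

    With only the subject seen, both senses survive. Once the preposition is seen, the die- frame has no case fact for it, its disjunction is empty, and only go- remains.</Paragraph>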
    <Paragraph position="44"> We also have enough machinery to see how &quot;selectional restrictions&quot; work in Wimp2. Consider the sentence Jack fell at the store.</Paragraph>
    <Paragraph position="45"> and suppose that Wimp knows two case relations for &quot;at,&quot; loc and time. This will initially lead to the following disjunction:</Paragraph>
    <Paragraph position="47"> However, Wimp will know that (inst (time ?a.action) time-).</Paragraph>
    <Paragraph position="48"> As we mentioned earlier, == statements cause everything known about the first argument to be asserted about the second. Thus Wimp will try to believe that store1 is a time, so (2) becomes a nogood and (1) becomes just true. It is important to note that both of these disambiguation methods fall out from the basics of the system. Nothing had to be added.</Paragraph>
  </Section>
</Paper>