<?xml version="1.0" standalone="yes"?> <Paper uid="P84-1048"> <Title>Combining Functionality and Ob\]ec~Orientedness for Natural Language Processing</Title> <Section position="3" start_page="0" end_page="219" type="metho"> <SectionTitle> 2. Intermediate Representation and Computational Device for </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="218" type="sub_section"> <SectionTitle> Interpretation 2.1 PAL (Purely Applicative Language) </SectionTitle> <Paragraph position="0"> Effective use of intermediate representations is useful. We propose the use of a language which we call PAL (Purely Applicative Language).</Paragraph> <Paragraph position="1"> In PAL, new composite expressions are constructed only with a binary form of function application. Thus, if z and I/ are well-formed formulas of PAL, so is a form z(y).</Paragraph> <Paragraph position="2"> Expressions of PAL are related to expressions of natural language as follows: Generally, when a phrase consists of its immediate descendants, say z and y, a PAL expression for the phrase is one of the following forms: <z>( <V>) or <p>( <z>) where ~a> stands for a PAL expression for a phrase ~*. Which expression is the case depends on which phrase modifies which. If a phrase z modifies V then the PAL expression for z takes the functor position, i.e., the form is ~z~(~y~). Simple examples are: big apple =* big~apple) ; adjectives modify .anne very big ~ very(big) ; adverbs modify adjectives very big apple ~ (very(big)Xapple) ; reeuesive composition 2As illustrated in this example, we assume a predicate notation a~ an output of the linguistic component. But this choice is only for descriptive purposes and is not significant.</Paragraph> <Paragraph position="3"> awe prefer the term *functionality&quot; to &quot;eompositionality&quot;, reflecting a procedural view rather than a purely mathematicaJ view.</Paragraph> <Paragraph position="4"> How about other cases? In principle, this work is based on Montague's observations \[Montague 74\]. Thus we take the position that noun phrases modify (are functions of, to be more precise) verb phrases. But unlike Montague grammar we do not use iambda expressions to bind case elements. Instead we use special functors standing for case markers. For example, he runs ~ (*subject(he)Xruns) he eats it ~ (*subject(he)X(*object(it)Xeats)) Another example, involving a determiner, is illustrated below: a big apple ~ a(big(apple)) ; determiners modlf~l nouns Sometimes we assume &quot;null&quot; words or items corresponding to morphemes, such as, role indicators, nominalizer, null NP, etc.</Paragraph> <Paragraph position="5"> apple which he eats</Paragraph> <Paragraph position="7"> ; restrictive relative clauses modif~ nouns, ; rdativizers modify sentences to make adjectives In the discussion above, the notion of modify is crucial. What do we mean when we say z modifies y? In the case of Montague grammar, this question is answered based on a predetermined set theoretical model. For example, a noun is interpreted as a set of entities; the noun &quot;penguin&quot;, for instance, is interpreted as a set of all penguins. An adjective, on the other hand, is interpreted as a function from sets of entities to sets of entities; an adjective &quot;small&quot; is interpreted as a selector function which takes such a set of entities (interpretation of each noun) and picks up from it a set of &quot;small&quot; entities. Note that this is a simplified discussion; intension is neglected. 
<Paragraph position="9"> The next section discusses a computational device for interpreting PAL expressions.</Paragraph> </Section> <Section position="2" start_page="218" end_page="218" type="sub_section"> <SectionTitle> 2.2 Object-Oriented Domain </SectionTitle> <Paragraph position="0"> The notion of object-orientedness is widely used in computer science. We employ the notion as it appears in LOOPS [Bobrow 81]. The general idea is as follows. We have a number of objects. Objects can be viewed as both data and procedures. They are data in the sense that they have a place (called a local variable) to store information. At the same time, they are procedures in that they can manipulate data. An object can only update local variables belonging to itself. When data belongs to another object, a message must be sent to request the update. A message consists of a label and a value. In order to send a message, the agent has to know the name of the receiver.</Paragraph> <Paragraph position="1"> There is no other means for manipulating data. Objects are classified into classes and instances. A class defines a procedure (called a method) for handling incoming messages of its instances. A class inherits the methods of its superclasses.</Paragraph> <Paragraph position="2"> 3. Interpretation of PAL Expressions in Object-Oriented Domain A class is defined for each constant of PAL. A class object for a lexical item contains linguistic knowledge in a procedural form. In other words, a class contains information as to how the corresponding lexical item is mapped into memory structures. A PAL expression is interpreted by evaluating the form which results from replacing each constant of the given PAL expression by an instance of an object whose class name is the same as the label of the constant. The evaluation is done by repeating the following cycle: * an object in argument position sends to the object in functor position a message whose label is &quot;argument&quot; and whose value is the object itself.</Paragraph> <Paragraph position="3"> * a corresponding method is invoked and an object is returned as the result of the application; usually one object causes another object to modify its content, and the result is a modified version of either the functor or the argument.</Paragraph> <Paragraph position="4"> Note that objects can interact only in a constrained way. This is a stronger claim than one allowing arbitrary communication. The more principled and constrained the way in which modules of the linguistic component interact, the less complicated the system will be, and the better the perspective we can obtain for writing a large grammar.</Paragraph> </Section>
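As a rough sketch of this interpretation cycle (our own illustration; the registry, term encoding, and function names are not from the paper), a PAL expression can be evaluated by instantiating the class registered for each constant and delivering each argument object to its functor as an 'argument' message:

```python
# Sketch of the evaluation cycle.  A PAL term is either a constant (a string
# label) or a pair (functor, argument).  Each constant is replaced by a fresh
# instance of the class registered under its label; each application is then
# evaluated by delivering the argument object to the functor object under the
# label 'argument'.  CLASS_REGISTRY and LexicalObject are illustrative names.

CLASS_REGISTRY = {}                      # label -> class defined for that item

def evaluate(term):
    if isinstance(term, str):            # a PAL constant
        return CLASS_REGISTRY[term]()    # replace it by an instance of its class
    functor, argument = term             # a binary application
    f = evaluate(functor)
    a = evaluate(argument)
    return f.receive("argument", a)      # one message per application

class LexicalObject:
    """Possible common superclass: dispatch a labelled message to the method of the same name."""
    def receive(self, label, value):
        return getattr(self, label)(value)
```

The concrete classes for "he runs" in section 3.1 below are exactly the kind of entries such a registry would hold.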
<Section position="3" start_page="218" end_page="219" type="sub_section"> <SectionTitle> 3.1 A Simple Example </SectionTitle> <Paragraph position="0"> Let's start by seeing how our simple example, the sentence &quot;he runs&quot;, is interpreted in our framework. The PAL expression for this sentence is: (*subject(he))(runs) Class definitions for the related objects are shown in figure 3.1. The interpretation process goes as follows: * Instantiating '*subject': let's call the new instance *subject0.</Paragraph> <Paragraph position="1"> * Instantiating 'he': a referent is looked for in the memory. The referent (let's call it i0) is set to the local variable den, which stands for 'denotation'. Let the new instance be he0.</Paragraph> <Paragraph position="2"> * Evaluating '*subject0(he0)': a message whose label is 'case' and whose value is 'subject' is sent to the object he0. As a result, he0's variable case has the value 'subject'. The value of the evaluation is a modified version of he0, which we call he1 to indicate a different version.</Paragraph> <Paragraph position="3"> * Instantiating 'runs': let's call the new instance runs0. An event node (of the memory component) is created and its reference (let's call this e0) is set to the local variable den. Then a new proposition 'takes_place(e0)' is asserted to the memory component.</Paragraph> <Paragraph position="4"> * Evaluating 'he1(runs0)': a message whose label is 'subject' and whose value is he1 is sent to runs0, which causes a new proposition 'agent(e0)=i0' to be asserted in the memory component. The final result of the evaluation is a new version of the object runs0, say runs1.</Paragraph> <Paragraph position="5"> class *subject: argument: send[message, case:subject]; return[self].</Paragraph> <Paragraph position="6"> ; if a message with label 'argument' comes, this method sends to the object pointed to by the variable message a message whose label is 'case' and whose value is 'subject'.</Paragraph> <Paragraph position="7"> ; the variable message holds the value of an incoming message and the variable self points to the object itself.</Paragraph> <Paragraph position="8"> class he: if instantiated then den ← 'look for referent'.</Paragraph> <Paragraph position="9"> ; when a new instance is created, the referent is looked for and the value is set to the local variable den.</Paragraph> <Paragraph position="10"> case: case ← message; return[self].</Paragraph> <Paragraph position="11"> ; when a message labeled 'case' comes, the local variable case is assigned the value the incoming message contains. The value of this method is the object itself.</Paragraph> <Paragraph position="12"> argument: return[send[message, case:self]].</Paragraph> <Paragraph position="13"> ; when this instance is applied to another object, this object sends a message whose label is the value of the local variable case and whose value field is the object itself. The value of the message processing is the value of this application.</Paragraph> <Paragraph position="14"> class runs: if instantiated then den ← create['event:run']; assert['takes_place(den)'].</Paragraph> <Paragraph position="15"> ; when a new instance of class 'runs' is instantiated, a new event is asserted to the memory component. The reference to the new event is set to the local variable den.</Paragraph> <Paragraph position="16"> subject: assert['agent(den)=message.den']; return[self]. ; when a message with label 'subject' comes, a new proposition is asserted to the memory component. The value of this message handling is the object itself.</Paragraph> <Paragraph position="17"> The above discussion is overly simplified for the purpose of explanation. The following sections discuss a number of other issues.</Paragraph> </Section>
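The class definitions above translate fairly directly into code. The following self-contained sketch is ours, not the paper's implementation: the memory component is modelled as a plain list of asserted propositions, and i0 and e0 are stub node names. It traces the evaluation of (*subject(he))(runs).

```python
# A sketch of the figure 3.1 classes.  The memory component is modelled as a
# list of asserted propositions; the referent i0 and event e0 are stubs
# standing in for real memory nodes.

memory = []                                       # the "memory component"

class Subject:                                    # class *subject
    def receive(self, label, value):
        if label == "argument":                   # tell the argument object its surface case
            return value.receive("case", "subject")

class He:                                         # class he
    def __init__(self):
        self.den = "i0"                           # referent looked up when instantiated
        self.case = None
    def receive(self, label, value):
        if label == "case":                       # remember the surface case assigned to me
            self.case = value
            return self
        if label == "argument":                   # when applied, pass myself on under my case label
            return value.receive(self.case, self)

class Runs:                                       # class runs
    def __init__(self):
        self.den = "e0"                           # a new event node
        memory.append("takes_place(e0)")
    def receive(self, label, value):
        if label == "subject":                    # link the subject's referent as the agent
            memory.append(f"agent({self.den}) = {value.den}")
            return self

# Evaluate (*subject(he))(runs):
he1 = Subject().receive("argument", He())         # *subject0(he0)  ->  he1 with case = 'subject'
runs1 = he1.receive("argument", Runs())           # he1(runs0)      ->  asserts agent(e0) = i0
print(memory)                                     # ['takes_place(e0)', 'agent(e0) = i0']
```

Running it leaves the memory component with exactly the two propositions asserted in the walkthrough above.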
<Section position="4" start_page="219" end_page="219" type="sub_section"> <SectionTitle> 3.3 Linking Case Elements </SectionTitle> <Paragraph position="0"> One of the basic tasks of the linguistic component is to find out which constituent is linked, explicitly or implicitly, to which constituent. From the example shown in section 3.1, the reader can see at least three possibilities:</Paragraph> <Paragraph position="1"> Case linking by sending messages. Using conventional terms of case grammar, we can say that the &quot;governor&quot; receives a message whose label is a surface case and whose value is the &quot;dependent&quot;. This implementation leads us to the notion of abstraction, to be discussed in section 3.4.</Paragraph> <Paragraph position="2"> Lexicon-driven methods of determining deep case. Surface case is converted into deep case by a method defined for each governor. This makes it possible to handle this hard problem without being concerned with how many different meanings each function word has. Governors which have the same characteristics in this respect can be grouped together as a superclass. This enables us to avoid duplication of knowledge by means of hierarchy. The latter issue is discussed in section 3.2.</Paragraph> <Paragraph position="3"> The use of implicit case markers. We call items such as *subject or *object implicit, as they do not appear in the surface form, as opposed to prepositions, which are explicit (surface) markers. The introduction of implicit case markers seems reasonable if we consider a language like Japanese, in which surface case is explicitly indicated by postpositions.</Paragraph> <Paragraph position="4"> Thus we can assign to the translation of our sample sentence a PAL expression with the same structure as its English version:</Paragraph> </Section> </Section> <Section position="4" start_page="219" end_page="220" type="metho"> <SectionTitle> KARE GA HASHIRU => (GA(KARE))(HASHIRU) </SectionTitle> <Paragraph position="0"> where &quot;KARE&quot; means &quot;he&quot;, &quot;GA&quot; is a postposition indicating the surface subject, and &quot;HASHIRU&quot; means &quot;run&quot;.</Paragraph> <Section position="1" start_page="219" end_page="219" type="sub_section"> <SectionTitle> 3.2 Sharing Common Knowledge </SectionTitle> <Paragraph position="0"> Object-oriented systems use the notion of hierarchy to share common procedures. Lexical items with similar characteristics can be grouped together as a class; we may, for example, have a class 'noun' as a superclass of the lexical items 'boy', 'girl', 'computer' and so forth. When a difference is recognized among objects of a class, the class may be subdivided; we may subcategorize a verb into stative verbs, action verbs, achievement verbs, etc. Common properties can be shared at the superclass. This offers a flexible way of writing a large grammar; one may start by defining both the most general classes and the least general classes. The more observations are obtained, the richer the class-superclass network will be. Additionally, mechanisms for supporting multiple hierarchies and for borrowing methods are useful in coping with the growing sophistication of linguistic knowledge, e.g., the introduction of more than one subcategorization.</Paragraph> </Section>
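As a rough illustration of this kind of sharing (our own sketch; the class names, the event argument, and the instrumental-case method are hypothetical), inheritance lets a whole family of verbs pick up case-handling methods from a superclass:

```python
# Sketch of sharing methods through a class hierarchy.  'Verb' holds behaviour
# common to all verbs; 'ActionVerb' adds handling for the instrumental case;
# individual lexical items only supply what is idiosyncratic to them.

memory = []

class Verb:                                      # most general verb behaviour
    def __init__(self, event):
        self.den = event                         # the event node this verb denotes
    def receive(self, label, value):
        return getattr(self, label)(value)       # dispatch a labelled message to a method
    def subject(self, value):
        memory.append(f"agent({self.den}) = {value}")
        return self

class ActionVerb(Verb):                          # action verbs additionally accept an instrument
    def instrumental(self, value):
        memory.append(f"instrument({self.den}) = {value}")
        return self

class Eat(ActionVerb):                           # 'eat' inherits everything it needs
    pass

class Resemble(Verb):                            # a stative verb: no instrumental case defined
    pass

Eat("e1").receive("subject", "he").receive("instrumental", "fork")
print(memory)       # ['agent(e1) = he', 'instrument(e1) = fork']
# Resemble("e2").receive("instrumental", "fork") would fail: that case is simply not defined there.
```

The same mechanism supports the "lazy" treatment of non-obligatory cases discussed in section 3.6: methods for locative, instrumental, or temporal cases live high in the hierarchy, and only exceptions are attached to lower-level items.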
<Section position="2" start_page="219" end_page="220" type="sub_section"> <SectionTitle> 3.4 Abstraction </SectionTitle> <Paragraph position="0"> By attaching a kind of message controller in front of an object, we can obtain a new version of the object whose linguistic knowledge is essentially the same as that of the original but whose input/output specification is different. As a typical example we can show how the passivizer *en is dealt with. An object *en can have an embedded object as the value of its local variable embedded. If an instance of *en receives a message with label '*subject', it sends to the object pointed to by embedded the same message with its label replaced by '*object'; if it receives a message with label 'by', it transfers the message to the embedded object with the label field replaced by '*subject'.</Paragraph> <Paragraph position="1"> Thus the object *en coupled with a transitive verb can be viewed as if they were a single intransitive verb. This offers an abstracted way of handling linguistic objects.</Paragraph> <Paragraph position="2"> The effect can be seen by tracing how a PAL expression for a passive sentence is interpreted.</Paragraph> </Section>
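A minimal sketch of this controller idea (ours, not the paper's code; Eats is a stand-in transitive verb that merely records the case links it receives):

```python
# Sketch of the passivizer *en as a message controller wrapped around a verb.
# It relabels incoming case messages and forwards them to the embedded object,
# so that "*en + transitive verb" behaves externally like an intransitive verb.

class En:
    RELABEL = {"*subject": "*object",    # surface subject of the passive -> deep object
               "by": "*subject"}         # the by-phrase -> deep subject

    def __init__(self, embedded):
        self.embedded = embedded         # the transitive verb object being passivized

    def receive(self, label, value):
        new_label = self.RELABEL.get(label, label)
        self.embedded.receive(new_label, value)   # forward the relabelled message
        return self

class Eats:                              # stand-in transitive verb; just records what it is told
    def __init__(self):
        self.filled = {}
    def receive(self, label, value):
        self.filled[label] = value
        return self

eaten = En(Eats())                       # "(is) eaten"
eaten.receive("*subject", "apple").receive("by", "he")
print(eaten.embedded.filled)             # {'*object': 'apple', '*subject': 'he'}
```

The forwarding mechanisms described next for causative verbs and relativizers can be seen as the same kind of re-transmission with different relabelling tables.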
<Section position="3" start_page="220" end_page="220" type="sub_section"> <SectionTitle> 3.5 Implicit Case Linking </SectionTitle> <Paragraph position="0"> We can use a very similar mechanism to deal with case linking by causative verbs. Consider the following sentence: x wants y to do z.</Paragraph> <Paragraph position="1"> This sentence implies that the subject of the infinitive is the grammatical object of the main verb &quot;wants&quot;. Such a property is shared by a number of other verbs such as &quot;allow&quot;, &quot;cause&quot;, &quot;let&quot;, &quot;make&quot;, etc. In the object-oriented implementation, this can be handled by letting the object defined for this class transfer a message from its subject to the infinitive.</Paragraph> <Paragraph position="2"> Note that the object for these verbs must pass the message from its subject to the infinitive when its grammatical object is missing.</Paragraph> <Paragraph position="3"> Another example of implicit case linking can be seen in relative clauses. In an object-oriented implementation, a relativizer transfers a message containing a pointer to the head noun to a null NP occupying the gap in the relative clause. Intermediate objects serve as re-transmitting nodes, as in computer networks.</Paragraph> </Section> <Section position="4" start_page="220" end_page="220" type="sub_section"> <SectionTitle> 3.6 Obligatory Case versus Non-Obligatory Case </SectionTitle> <Paragraph position="0"> In building a practical system, the problem of distinguishing obligatory case from non-obligatory case is always controversial.</Paragraph> <Paragraph position="1"> The notion of hierarchy is useful in dealing with this problem in a &quot;lazy&quot; fashion. What we mean by this is as follows. In a procedural approach, the distinction we make between obligatory and non-obligatory cases seems to be based on economy. To put this another way, we do not want to let each lexical item have cases such as locative, instrumental, temporal, etc. This would merely mean useless duplication of knowledge. We can use the notion of hierarchy to share methods for these cases. Any exceptional method can be attached to lower-level items.</Paragraph> <Paragraph position="2"> For example, we can define a class &quot;action verb&quot; which has methods for instrumental cases, while its superclass &quot;verb&quot; may not.</Paragraph> <Paragraph position="3"> This is useful not only for reflecting linguistic generalization but also for offering a grammar designer a flexible means of designing a knowledge base.</Paragraph> </Section> </Section> <Section position="5" start_page="220" end_page="220" type="metho"> <SectionTitle> 4. A Few Remarks </SectionTitle> <Paragraph position="0"> As is often pointed out, there are many relationships which can be determined purely by examining linguistic structure, for example, presupposition, intra-sentential reference, focus, surface speech acts, etc. This eventually means that the linguistic component itself is domain independent.</Paragraph> <Paragraph position="1"> However, other issues, such as resolving ambiguity, resolving task-dependent reference, filling task-dependent ellipsis, or inferring the speaker's intention, cannot be solved solely by the linguistic component [Schank 80]. They require interaction with the memory component. Thus the domain-dependent information must be stored in the memory component.</Paragraph> <Paragraph position="2"> To go beyond the semantics-on-top-of-syntax paradigm, we must allow rich interaction between the memory and linguistic components. In particular, the memory component must be able to predict a structure, to guide the parsing process, or to give a low rating to a partial structure which is not plausible based on experience, while the linguistic component must be able to explain what is going on and what it is trying to see.</Paragraph> <Paragraph position="3"> To do this, the notion of object-orientedness provides a fairly flexible method of interaction.</Paragraph> <Paragraph position="4"> Finally, we would like to mention how this framework differs from the authors' previous work on machine translation [Nishida 83], which could be viewed as an instantiation of this framework. The difference is that in the previous work the notion of lambda binding was used for linking cases. We directly used the intensional logic of Montague grammar as an intermediate language. Though it brought some advantages, this scheme caused a number of technical problems. First, using lambda forms causes difficulty in procedural interpretation. In the case of Montague grammar this is not so, because the amount of computation does not cause any theoretical problem in a mathematical theory. Second, though lambda expressions give an explicit form of representing some linguistic relations, other relations remain implicit. Some sort of additional mechanism has to be introduced to cope with those implicit relations. Such a mechanism, however, may spoil the clarity or explicitness of lambda forms. This paper has proposed an alternative to address these problems.</Paragraph> </Section> </Paper>