<?xml version="1.0" standalone="yes"?> <Paper uid="C80-1013"> <Title>HIERARCHICAL MEANING REPRESENTATION AND ANALYSIS OF NATURAL LANGUAGE DOCUMENTS</Title> <Section position="3" start_page="0" end_page="88" type="metho"> <SectionTitle> NATURAL LANGUAGE ANALYZER: NL --(T)--> LE --(I)--> PSN --> task-oriented representations </SectionTitle>
<Paragraph position="0"> where NL: natural language, LE: logical expression, PSN: partitioned semantic network, T: translation mapping, I: interpretation mapping.</Paragraph>
<Paragraph position="1"> We have developed natural language analyzers for both English and Japanese. This paper describes the one for English. Experiments with these systems are in progress. The applicability of the proposed approach is discussed briefly.</Paragraph>
<Paragraph position="2"> 2. Overview of the approach This section gives the reader an overview of the system by illustrating an example. Before the illustration we present the formalisms of LE and PSN.</Paragraph>
<Paragraph position="3"> LE -- logical expression The notion of LE is based on Cresswell's λ-categorial language [2]. The following is the syntax of LE: - the set of syntactic categories (Syn): two basic categories are used, i.e., 0 for sentences and 1 for names; given categories τ, σ1, ... , σn, then <τ,σ1, ... ,σn> is the category of a mapping that makes an expression of category τ out of expressions of categories σ1, ... , σn respectively; - the set of symbols (F): F = ∪_σ F_σ, where each F_σ is a finite set of symbols, and if σ1 ≠ σ2 then F_σ1 ∩ F_σ2 = ∅.</Paragraph>
<Paragraph position="4"> - the set of variables (X): X = ∪_σ X_σ, where each X_σ is a set of variables such that if σ1 ≠ σ2 then X_σ1 ∩ X_σ2 = ∅, and the intersection of F and X is empty.</Paragraph>
<Paragraph position="5"> - the set of expressions (E): E = ∪_σ E_σ, where E_σ satisfies the following properties: (i) X_σ ⊆ E_σ, (ii) F_σ ⊆ E_σ, (iii) if α ∈ E_<τ,σ1, ... ,σn> and β1, ... , βn ∈ E_σ1, ... , E_σn, then the expression α(β1, ... ,βn) ∈ E_τ, (iv) if β ∈ X_σ and γ ∈ E_τ, then the expression λβ[γ] ∈ E_<τ,σ>, where λ is a distinguished symbol. PSN -- partitioned semantic network PSN denotes the semantics of LE. The notion of network partitioning is based on Hendrix's K-net [6]. The constituents of PSN are: - a space, which denotes a possible world, - typed nodes, - arcs with case labels. Sometimes linear notations are used instead of PSN structures. The linear notation can be further interpreted by a meta language [8]; however, this is beyond the scope of this paper. Fig. 1 illustrates PSN structures together with linear notations.</Paragraph>
<Paragraph position="6"> Fig. 1. PSN structures for &quot;Every man loves a woman.&quot; (with linear notations).</Paragraph>
<Paragraph position="7"> Some special structures are used to denote intensional entities. See Fig. 2 below.</Paragraph>
<Paragraph position="9"> Fig. 2. Intensional structures in PSN, e.g., DEF[?X; P(?X)] (cf. ιx[p(x)]) and SETOF[?X; P(?X)] (cf. {x | p(x)}). Overview of the meaning analysis process Now we illustrate an example. Consider the following simple sentence: (EX-1) Every man loves a woman.</Paragraph>
<Paragraph position="10"> (STEP 1) Morphological analysis The input sentence is analyzed by consulting a dictionary in which an LE expression and a grammatical category are assigned to each word.
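To make this step concrete, the following is a minimal illustrative sketch in Python (the system itself is implemented in LISP); the table DICTIONARY and the function morphological_analysis are hypothetical stand-ins for the actual lexicon.

    # Sketch of STEP 1: each word is assigned a grammatical category (cat) and
    # an LE symbol (sem); inflected forms also carry a form feature such as +S.
    DICTIONARY = {
        "every": {"cat": "DET",  "sem": "EVERY"},
        "man":   {"cat": "NOUN", "sem": "MAN"},
        "loves": {"cat": "VT",   "sem": "LOVE", "form": "+S"},
        "a":     {"cat": "DET",  "sem": "A"},
        "woman": {"cat": "NOUN", "sem": "WOMAN"},
    }

    def morphological_analysis(sentence):
        """Return the lexical entry of each word of the input sentence."""
        words = sentence.rstrip(".").lower().split()
        return [{**DICTIONARY[w], "word": w} for w in words]

    for entry in morphological_analysis("Every man loves a woman."):
        print(entry)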
For the given sentence (EX-1), we obtain:
Every: cat=DET, sem=EVERY
man: cat=NOUN, sem=MAN
loves: cat=VT, sem=LOVE (form=+S)
a: cat=DET, sem=A
woman: cat=NOUN, sem=WOMAN
(STEP 2) Syntax analysis and generation of LE The morphologically analyzed sentence is further analyzed by a set of grammar rules. Each grammar rule consists of a syntactic generation part and a semantic composition part. For the sake of illustration, let the grammar rules be, schematically, SENTENCE -> NG + VP and NG -> DET + NOUN, each paired with its semantic composition part.</Paragraph>
<Paragraph position="12"> The syntax analysis and semantic composition are done in parallel. If one of them detects an anomaly, the application of the rule is aborted.</Paragraph>
<Paragraph position="13"> Fig. 3 illustrates the result of the syntax analysis and semantic composition for our example. The syntax tree shows the phrase structure of the sentence. The semantic tree shows the history of semantic composition. The root node of the semantic tree is the LE (in LISP notation) obtained from the sentence.</Paragraph>
<Paragraph position="14"> In the interpretation process a PSN node for the predicate LOVE is generated and its case slots are filled; the OBJECT slot initially holds the intensional structure INDEF[?Y; WOMAN(?Y)].</Paragraph>
<Paragraph position="15"> The last operation, (4), is to replace the OBJECT slot of LOVE by its extension. Since the verb &quot;love&quot; is an extensional verb, the OBJECT slot is extensioned, i.e., it is replaced by an existentially quantified variable. In our system a new individual node is generated in the sense of a Skolem constant. We treat scope ambiguity at this time: if this Skolemization is to be done in a local world, scope ambiguity is announced. In this case three readings are detected, differing in where the new individual node is placed. Since readings (i) and (ii) are logically equivalent, there are essentially two ambiguities. If reading (i) is selected, the corresponding final network structure is generated.</Paragraph>
<Paragraph position="17"> Comments on scope ambiguity One of the interesting features of MG is the treatment of scope ambiguity of quantification. In MG, scope ambiguities are captured as ambiguities of semantic composition. However, considering the following two points: - how to filter out redundancies (sometimes this redundancy is reduced in the interpretation process), - the resulting parse sometimes involves inaccurate readings, it is plausible to treat scope ambiguities in the interpretation process, as shown in the above example.</Paragraph>
<Paragraph position="18"> 3. Implementation of translation mapping (T) This section treats the translation mapping T. Firstly, we show how we associate an LE with each phrase of English. Then we describe the rule-based parser.</Paragraph>
<Paragraph position="19"> The association of LE with English phrases (1) Simple sentence A simple sentence is composed of a subject and a verb phrase. The subject is a noun phrase. The LE for a noun phrase is in category <0,<0,1>>. The LE for a verb phrase is in category <0,1>. The LE for a sentence is the functional composition of an NG and a VP.</Paragraph>
<Paragraph position="21"> (2) Verb phrase The basic part of a verb phrase is composed of either an intransitive verb or a transitive verb plus an object. For example, the LE for the phrase &quot;have a book&quot; is obtained as follows: λx[(a(book))(λy[possess(x,y)])] ∈ E<0,1>, where possess ∈ E<0,1,1> and a(book) ∈ E<0,<0,1>>.
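To make the categorial composition concrete, here is a minimal sketch in Python (not the authors' LISP implementation); the tuple encoding and the helper names cat, apply_le, and lam are hypothetical. It builds the LE for &quot;have a book&quot; by function application and abstraction, following rules (iii) and (iv) of the LE syntax.

    # Minimal sketch of lambda-categorial LE terms (hypothetical tuple encoding).
    # Category 0 = sentence, 1 = name, and cat(t, s1, ..., sn) encodes <t,s1,...,sn>.
    # The category of a term is always its last element.
    def cat(result, *args):
        return ("fn", result) + args

    POSSESS = ("sym", "possess", cat(0, 1, 1))        # possess in E<0,1,1>
    A_BOOK  = ("sym", "a(book)", cat(0, cat(0, 1)))   # a(book) in E<0,<0,1>>
    X, Y = ("var", "x", 1), ("var", "y", 1)

    def apply_le(head, *args):
        """Rule (iii): alpha(beta1,...,betan) is in E_t when the categories fit."""
        _, result_cat, *arg_cats = head[-1]
        assert [a[-1] for a in args] == arg_cats, "category mismatch"
        return ("apply", head, list(args), result_cat)

    def lam(var, body):
        """Rule (iv): lambda v[body] is in E<t,s> for v in X_s and body in E_t."""
        return ("lam", var, body, cat(body[-1], var[-1]))

    # "have a book":  lambda x[(a(book))(lambda y[possess(x,y)])]  in E<0,1>
    have_a_book = lam(X, apply_le(A_BOOK, lam(Y, apply_le(POSSESS, X, Y))))
    print(have_a_book[-1])   # ('fn', 0, 1), i.e. category <0,1>

The category check in apply_le roughly corresponds to the anomaly detection that aborts a rule application during parsing.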
(3) Noun phrase A noun phrase maps a one-place predicate into a sentence, that is, it is in category <0,<0,1>>. The constituents of a noun phrase are: - determiner (DET) in E<<0,<0,1>>,<0,1>>, - number (NBR) in E<<0,1>,<0,1>>, - adjective (ADJ) in E<<0,1>,<0,1>>, - head noun (NOUN) in E<0,1>, - plural morpheme (+S) in E<<0,1>,<0,1>>, - post modifier (Q) in E<<0,1>,<0,1>>. (Example) &quot;the two efficient algorithms&quot;: the(two(*pl(efficient(algorithm)))) ∈ E<0,<0,1>>, composed step by step from the constituents the (DET), two (NBR), efficient (ADJ), algorithm (NOUN), and +s (+S). (4) Postmodifier (i) Relative clause A relative clause is composed of the symbol 'which' and a sentence. 'Which' makes a postmodifier of a noun out of the sentence. A special symbol #ante (in E<0,<0,1>>) is supplied for the eliminated antecedent in the relative clause. (ii) Prepositional phrase The LE for a preposition is in category <<0,0>,1>, that is, it makes an adverb out of a name. The LE for an adjective prepositional phrase is constructed using a special symbol *ap ∈ E<<<<0,1>,<0,1>>,1>,<<0,0>,1>>. Roughly speaking, *ap converts a preposition into &quot;an adjective preposition&quot;, which makes an adjective phrase out of a name. See the following example for &quot;of the system&quot;: λp[λy[(the(system))(λx[(((*ap(of))(x))(p))(y)])]] ∈ E<<0,1>,<0,1>>, composed from *ap, of, the, and system. (5) Noun clause A noun clause is constructed from a key word (e.g., &quot;that&quot;, &quot;whether&quot;, etc.) and a complement sentence. The LE for the keyword maps a sentence into a noun phrase, that is, it is in category <<0,<0,1>>,0>. See the following example:</Paragraph>
<Paragraph position="23"> &quot;whether it accepts the input&quot;. Indirect questions and direct questions are treated uniformly. For YES-NO questions, a symbol (whether) is used which maps a sentence into a noun clause. For example, &quot;Does he run?&quot; → #QUES(whether(he(run))) ∈ E_0. For WH-questions, see the following example: &quot;Who runs?&quot; → #QUES(who(#ante(run))) ∈ E_0, where the symbol (who) maps a sentence into a noun clause.</Paragraph>
<Paragraph position="24"> This section describes a computer program which analyzes an input sentence and translates it into LE. The set of rules defined so far in this section is given to the parser in the following format: <advice>, <score>, A → α, <sem>, where α is a sequence of nonterminals or nonterminals with holes.</Paragraph>
<Paragraph position="25"> The <advice> section treats the syntactic augmentation of a rule by means of a message passing and testing mechanism. An embedded program tests whether the messages received from the descendants are consistent, and it may also send messages to its parent. These messages convey syntactic information about number, person, case, verb form, etc. The format of a message is: (<attribute1>=<value1>, ... ). For example, see the following illustration:</Paragraph>
<Paragraph position="27"> He has a book.</Paragraph>
<Paragraph position="28"> The <sem> section is a semantic composition program which constructs the LE for a node from its descendant nodes. In implementing these programs, the use of semantic markers is effective. A semantic marker conveys auxiliary information approximately describing semantic constraints. The LE and semantic markers for a node are packed into a data structure called a word frame, which is manipulated by the <sem> section programs.</Paragraph>
<Paragraph position="29"> The <score> section determines the priority of the rule. A rule with the highest priority will be tried first.
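As an illustration of this rule format, here is a minimal sketch in Python (hypothetical field and function names; the real rules are LISP data given to the parser). It shows a rule record with an <advice> check on agreement messages, a <score>, and a <sem> composition function.

    from dataclasses import dataclass
    from typing import Callable

    # Minimal sketch of the rule format <advice>, <score>, A -> alpha, <sem>.
    @dataclass
    class Rule:
        lhs: str            # the nonterminal A
        rhs: list           # alpha: the sequence of nonterminals
        score: int          # priority; the rule with the highest score is tried first
        advice: Callable    # tests messages about number, person, case, verb form, ...
        sem: Callable       # composes the LE of A from the LEs of the daughters

    def agree(messages):
        """<advice> for SENTENCE -> NG + VP: subject and verb must agree in number."""
        ng, vp = messages
        return ng.get("number") == vp.get("number")

    def compose_sentence(ng_le, vp_le):
        """<sem>: apply the NG (category <0,<0,1>>) to the VP (category <0,1>)."""
        return ("apply", ng_le, [vp_le])

    sentence_rule = Rule("SENTENCE", ["NG", "VP"], 10, agree, compose_sentence)

    # "He has a book.": both constituents report singular number, so the rule applies.
    print(sentence_rule.advice([{"number": "sg"}, {"number": "sg"}]))   # True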
The grammar system has a feature that allows a user to write elimination rules directly. For example, the following is a rule for a relative clause:</Paragraph> </Section> <Section position="4" start_page="88" end_page="90" type="metho"> <SectionTitle> NP -> NP + (CLAUSE - NP) </SectionTitle>
<Paragraph position="0"> This means that a relative clause is a clause with just one NP eliminated. The semantic coupling of the antecedent and the eliminated noun phrase is described in the <sem> section of the rule.</Paragraph>
<Paragraph position="1"> Now we shall go into the details of the parser, called EASY (for the English Analysis SYstem). The organization of EASY is summarized in a diagram whose components are the input sentence, the pre-compiled rules, the rule interpreter, and the dictionary.</Paragraph>
<Paragraph position="3"> Before starting parsing, the given set of rules is pre-compiled. Nonterminal nodes are connected together and an ATNG-like data structure is generated. For example, if we compile the example grammar given in section 2, the following structure (called an expectation path) is generated for the nonterminal DET: DET --(NOUN)--> NG --(VP)--> SENTENCE.</Paragraph>
<Paragraph position="5"> This reads that a DET will grow up to be an NG if a NOUN follows it, and the NG will, in turn, grow up to be a SENTENCE if a VP follows it.</Paragraph>
<Paragraph position="6"> The rule interpreter analyzes the input sentence with these compiled rules and a dictionary. EASY is a top-down parser and reads input sentences from left to right. EASY starts parsing by expecting the node SENTENCE. The main loop of the rule interpreter is: - test whether the current word has an expectation path to the expected node, - if a path is found, select the path with the highest priority and save the other paths, - if no path is found, try the following two rules: (i) try a left-recursive rule, since this type of rule is not compiled in the pre-compile phase, and (ii) test whether the expected node is eliminated via the antecedent elimination rule, - if both of them fail, memorize the failure and backtrack.</Paragraph>
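The following is a minimal sketch of this main loop in Python (the real EASY is a LISP program; the expectation table, the toy lexicon, and the function below are hypothetical simplifications that treat the verb phrase as a single token and omit the left-recursion and antecedent-elimination cases).

    # Hypothetical pre-compiled expectation paths: category -> list of
    # (path, priority), where a path lists the categories that must follow for
    # the category to grow up to SENTENCE.
    EXPECTATION = {
        "DET": [(["NOUN", "VP"], 10)],   # DET -NOUN-> NG -VP-> SENTENCE
    }
    LEXICON = {"every": "DET", "man": "NOUN", "loves": "VP"}

    def parse_sentence(words):
        """Top-down, left-to-right recognition with priority ordering and backtracking."""
        if not words:
            return False
        paths = EXPECTATION.get(LEXICON.get(words[0]), [])
        for path, _score in sorted(paths, key=lambda p: -p[1]):   # highest priority first
            rest = words[1:]
            for needed in path:                                   # follow the expectation path
                if rest and LEXICON.get(rest[0]) == needed:
                    rest = rest[1:]
                else:
                    rest = None                                   # this path fails ...
                    break
            if rest == []:                                        # the whole sentence is covered
                return True
            # ... fall back to the next saved path (backtracking)
        return False

    print(parse_sentence(["every", "man", "loves"]))   # True
    print(parse_sentence(["man", "loves"]))            # False: no expectation path from NOUN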
<Paragraph position="7"> 4. Implementation of interpretation mapping (I) The interpretation mapping I generates a partitioned network structure as the denotation of the meaning of a sentence.</Paragraph>
<Paragraph position="8"> We do not use the truth-conditional formalism. If complete knowledge about the world is given, a computer program can simulate the model to compute the truth value, as in [4] or [7]. However, in the actual situation of the natural language understanding process, complete knowledge cannot be given; only partial knowledge is available. Accordingly, it is plausible that new knowledge is acquired from a given sentence in the context of the old knowledge structure. For this purpose Montague's truth-conditional approach is indirect; a more direct approach, as in the semantics of programming languages, is preferable.</Paragraph>
<Paragraph position="9"> In what follows we try a direct approach.</Paragraph>
<Paragraph position="10"> The style of generating networks resembles Scott-Strachey's semantic functions [13], which generate a denotation from a statement of a programming language.</Paragraph>
<Paragraph position="11"> In order to generate network structures, we use a system which consists of a supervisor function GEN plus a dictionary. The arguments of the supervisor are: (LE, space#, environment, message).</Paragraph>
<Paragraph position="12"> LE is a logical expression. The space# specifies the space in which LE is interpreted. The environment specifies the denotation of each variable by a list of variable-denotation pairs. The message is used for communication between the network-generating word specialists.</Paragraph>
<Paragraph position="13"> A dictionary entry for each lexical item of LE contains a case pattern or an embedded word specialist program.</Paragraph>
<Paragraph position="14"> Interpretation of the LE for each category In what follows we use the linear notation of PSN because of space limitations, and we refer to the LE for each category simply by the category name.</Paragraph>
<Paragraph position="15"> (1) Interpretation of a sentence The meaning of a simple sentence is governed by the meaning of the verb. A dictionary entry for a verb includes a case pattern for the verb. According to the verb type, the case pattern looks like: intransitive verb: ((SUBJ, EXT, ... )), extensional transitive verb: ((ACTOR, EXT, ... ) (OBJ, EXT, ... )), intensional transitive verb: ((ACTOR, EXT, ... ) (OBJ, INT, ... )), where the first element of a case slot is a case label which is used only for distinguishing the slot, and the second element of a case slot indicates the extensionality of the slot. If the slot indicates extensionality, the filler will be replaced by its extension. This manipulation is treated later in this section.</Paragraph>
<Paragraph position="16"> (2) Interpretation of a noun phrase The most significant noun phrases are of the form DET+NOUN. Such a formula is interpreted as follows: (a/an)+noun: λ?P[?P(INDEF[?X; noun*(?X)])], the+noun: λ?P[?P(DEF[?X; noun*(?X)])], every+noun: λ?P[ANY[?X; noun*(?X) → ?P(?X)]], no+noun: λ?P[ANY[?X; noun*(?X) → NOT(?P(?X))]], where p* means the denotation of p.</Paragraph>
<Paragraph position="17"> Personal pronouns are interpreted as follows: I: the SPEAKER attribute, you: the HEARER attribute, he: paraphrased as the male, she: paraphrased as the female.</Paragraph>
<Paragraph position="18"> A proper name is interpreted as follows: proper-name: DEF[?X; NAME('proper-name,?X)]. (3) Interpretation of an adjective An adjective maps a noun into another noun. Here we treat those adjectives that play this role.</Paragraph>
<Paragraph position="19"> The interpretation of the plural is: *pl(noun): λ?X[SUBSET(?X,SETOF[?Y; noun*(?Y)])], i.e., *pl(noun) denotes a predicate which is true iff the argument is a subset of {x | noun*(x)}.</Paragraph>
<Paragraph position="20"> Adjectives are interpreted by word specialists embedded in the dictionary. A word specialist for an adjective examines the argument (a noun) and maps it into another noun. Thus the word specialist can handle de dicto readings of adjectives. For example, small(lion) → λ?X[LION(?X) & LESS-THAN(DEF[?Y; SIZE(?X,?Y)], average-size-of-lion)].</Paragraph>
<Paragraph position="21"> (4) Interpretation of a postmodification A relative clause (in restrictive use) maps the head noun into a modified noun, as follows: (which(sentence))(noun).</Paragraph>
<Paragraph position="22"> A distinguished symbol 'which' announces the occurrence of a relative clause and sends the denotation of the antecedent as a message. The argument of 'which' is a sentence including the eliminated noun phrase '#ante', which will receive the message and substitute the denotation. See the following example: the((which(I(λx[#ante(λy[attack(x,y)])])))(problem)), &quot;the problem which I attack&quot;. Interpreting the formula couples the denotation of the antecedent with the eliminated noun phrase '#ante' in the relative clause.</Paragraph>
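A minimal sketch of this message passing in Python follows (the real system is the LISP supervisor GEN with word-specialist programs; the tuple encoding of LE, the function name, and the omission of the space# argument are all simplifications, and the antecedent is passed in by hand rather than by the 'the' specialist).

    def interpret(le, env, message):
        """Interpret a tiny subset of LE forms into linear-notation PSN strings."""
        head = le[0]
        if head == "the":                              # the+noun -> DEF structure
            noun = le[1]
            return f"DEF[?X; {noun.upper()}(?X)]"
        if head == "which":                            # relative clause
            sentence = le[1]
            # the specialist for 'which' sends the antecedent denotation down
            return interpret(sentence, env, {"antecedent": message["antecedent"]})
        if head == "#ante":                            # eliminated noun phrase
            return message["antecedent"]               # receives the message
        if head == "attack":                           # a two-place predicate
            return f"ATTACK({interpret(le[1], env, message)}, {interpret(le[2], env, message)})"
        if head == "I":
            return '"I"'
        raise ValueError(f"no specialist for {head}")

    # "the problem which I attack": the antecedent DEF[?X; PROBLEM(?X)] is sent
    # to '#ante' inside the relative clause, which couples it with ATTACK.
    antecedent = interpret(("the", "problem"), {}, {})
    clause = interpret(("which", ("attack", ("I",), ("#ante",))), {}, {"antecedent": antecedent})
    print(clause)    # ATTACK("I", DEF[?X; PROBLEM(?X)])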
<Paragraph position="24"> An adjective prepositional phrase also modifies a noun. An attributive noun or a de-verbal noun is treated as a noun which is a one-place predicate in LE, but which takes two or more arguments at the PSN level. Adjective prepositional phrases supply these arguments to the head noun. For example, interpreting the LE the(λy[(the(car))(λx[(((*ap(of))(x))(color))(y)])]), &quot;the color of the car&quot;, results in a DEF structure in which the denotation of &quot;the car&quot; fills one argument of the two-place PSN predicate for the noun &quot;color&quot;.</Paragraph>
<Paragraph position="26"> Thus in the interpretation process the message communications between specialists play a significant role.</Paragraph>
<Paragraph position="27"> (5) Interpretation of a noun clause A space is used to denote the interpretation of a noun clause. A noun clause is interpreted as follows: fun(sentence) → DEF[?X; fun*(?X,ω1)], where T(ω1,sentence*), and where 'fun' stands for a symbol such as 'that', 'whether', etc. that maps a sentence into a noun clause. fun* is an appropriate PSN predicate. T(ω,p) is a meta predicate which means that the object formula p is true in the possible world (or space) denoted by ω. For example, interpreting the LE why(not((the(program))(λx[work(x)]))), &quot;why the program does not work&quot;, results in: DEF[?X; REASON(?X,ω2)], where T(ω2,NOT(WORK(the-program*))).</Paragraph>
<Paragraph position="28"> In this case fun=why and fun*=REASON. The resulting denotation roughly reads &quot;the reason for the situation ω2, where in ω2 the object referred to by the expression the(program) does not work.&quot; (6) Interpretation of other features - The possessive form is treated as a compound determiner. - The passive voice is handled by the special symbols *psubj and *en: for example, &quot;the sentence is accepted by the automaton&quot; is interpreted as ACCEPT(DEF[?X; AUTOMATON(?X)], DEF[?Y; SENTENCE(?Y)]), where *psubj sends as a message the denotation of the deep subject, and *en receives the message to supply the OBJECT slot of the internal verb ACCEPT.</Paragraph>
<Paragraph position="29"> Extensioning intensional structures An intensional PSN structure for a noun phrase is extensioned if the PSN structure is put into a case slot which indicates extensionality.</Paragraph>
<Paragraph position="30"> An INDEF type PSN structure is replaced by an i-unit (which denotes an individual constant). For example,</Paragraph>
<Paragraph position="32"> consider &quot;I have a book.&quot; The intermediate PSN structure is: POSSESS(&quot;I&quot;,INDEF[?X; BOOK(?X)]). Since the OBJECT slot of the predicate POSSESS indicates extensionality, this becomes AND(POSSESS(&quot;I&quot;,C), BOOK(C)), where C is a Skolem constant.</Paragraph>
<Paragraph position="33"> For a DEF type structure, since the denotation refers to some uniquely determined object, a referent search program is activated. The program searches the local contextual memory by matching each candidate against the given intensional PSN structure. The pattern matching operation in PSN corresponds to deduction in the meta language; that is, the definition of match is: PSN1 matches PSN2 iff meta(PSN1) implies meta(PSN2). In order to find the referent, various kinds of knowledge will be needed [5]. However, this topic is beyond the scope of this paper.</Paragraph>
<Paragraph position="34"> The intensional PSN structure is replaced by the PSN structure found.
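A minimal sketch of this extensioning step in Python (hypothetical names; the real system operates on PSN structures rather than strings): INDEF structures are replaced by fresh Skolem constants, and DEF structures are matched against a local contextual memory whose contents here mirror the example that follows.

    import itertools

    # Local contextual memory: facts asserted by earlier sentences.
    MEMORY = [("SYSTEM", "B"), ("PAPER", "A"), ("DESCRIBE", "A", "B")]
    _skolem = itertools.count(1)

    def extension(structure):
        """Extension an intensional structure put into an extensional case slot."""
        kind, pred = structure            # e.g. ("INDEF", "BOOK") or ("DEF", "SYSTEM")
        if kind == "INDEF":
            # a new individual node in the sense of a Skolem constant
            c = f"C{next(_skolem)}"
            MEMORY.append((pred, c))      # assert pred(c) alongside the main formula
            return c
        if kind == "DEF":
            # referent search: find a node already known to satisfy pred
            for fact in MEMORY:
                if fact[0] == pred:
                    return fact[1]
            raise LookupError(f"no referent found for DEF[?X; {pred}(?X)]")
        raise ValueError(kind)

    print(extension(("INDEF", "BOOK")))    # a fresh Skolem constant, e.g. C1
    print(extension(("DEF", "SYSTEM")))    # B, since SYSTEM(B) is in the memory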
For example, consider the following two sentences: This paper describes a system .... (1) The system analyzes programs .... (2) After the interpretation of sentence (1), the local memory contains: DESCRIBE(A,B)&PAPER(A)&SYSTEM(B).</Paragraph>
<Paragraph position="35"> For sentence (2), the intermediate structure is: ANALYZE(DEF[?X; SYSTEM(?X)],programs*). After the referent search procedure, the structure becomes: ANALYZE(B,programs*).</Paragraph>
<Paragraph position="36"> Since the denotation DEF[?X; SYSTEM(?X)] matches the node B (for SYSTEM(B) holds), it is replaced by the node B.</Paragraph> </Section> <Section position="5" start_page="90" end_page="90" type="metho"> <SectionTitle> 5. Discussion </SectionTitle>
<Paragraph position="0"> All the mechanisms presented so far have been implemented as LISP programs and are working on the personal LISP system in our laboratory. Experiments and improvements are now in progress.</Paragraph>
<Paragraph position="1"> As stated in the first section, the advantages of our method can be shown if it is applied to a wide range of applications. Experiments are in progress on machine translation and question answering.</Paragraph>
<Paragraph position="2"> Machine translation [12] As the first step toward machine translation, we are implementing a program which generates Japanese from the LE obtained by analyzing English. The generator program evaluates LE in just the same way as the interpretation program does. This approach investigates the linguistic phenomena involved in analyzing and generating natural language. Question answering [9], [10] Another application is to answer questions about the integrated network structure. In order to hold a conversation with a user, the input sentence should be further evaluated: for a user's question, the actual question-answering process must be invoked. Thus a pattern-directed procedure is used. This approach investigates meaning representation and deduction.</Paragraph>
<Paragraph position="3"> Extension to other languages [11] The meaning representation is, in principle, independent of which language is used. To show this, we must analyze more than one language. Although in this paper the object language is English, we have implemented a Japanese parser and are in the course of implementing a Japanese-to-English machine translation program. Further work The important problems to be solved are: - the problem of discourse, especially how to treat focus of attention or ellipsis in our formalism, - the semantics of PSN; the semantics of PSN may be defined either by associating each network structure with a logic-oriented meta language or by defining inference rules on PSN explicitly; the semantics must explicate implications and synonyms among PSN structures; furthermore, the semantics must be extended to treat concepts such as actions and events, - accommodation of transformational aspects; it seems that transformational theory further decomposes the translation mapping T; the introduction of transformational aspects will increase the feasibility of the system.</Paragraph> </Section> </Paper>