<?xml version="1.0" standalone="yes"?> <Paper uid="C80-1028"> <Title>LEVELS OF REPRESENTATION IN NATURAL LANGUAGE BASED INFORMATION SYSTEMS AND THEIR RELATION TO THE METHODOLOGY OF COMPUTATIONAL LINGUISTICS</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> LEVELS OF REPRESENTATION IN NATURAL LANGUAGE BASED INFORMATION SYSTEMS AND THEIR RELATION TO THE METHODOLOGY OF COMPUTATIONAL LINGUISTICS </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> Summary </SectionTitle> <Paragraph position="0"> In this paper the methodological basis of the 'computational linguistics approach' for representing the meaning of natural language sentences is investigated. Its adherence to principles of formal linguistics and formal philosophy of language, like the 'separation of levels of syntactic and semantic analysis' and the &quot;Fregean&quot; principle, may be contrasted with the 'artificial intelligence approach'. A &quot;Montague&quot; style method of mapping the syntax of natural language onto the syntax of the 'semantic language' used as the means of internal representation in the information system PLIDIS is presented. Rules for defining subsequent levels of representation like 'syntax-interpretative level', 'redundancy-free level' are given.</Paragraph> <Paragraph position="1"> Introduction The present paper presents ideas concerning a methodology of the 'semantics in computational linguistics' (COL-semantics). The underlying hypothesis is the following: In the field of COL-semantics algorithms and computer programs are developed which deliver structures of linguistic analysis and representation that can be compared with those of formal linguistic semantics and satisfy the adequacy criteria of certain linguistic theories. 
They therefore are suitable instruments for developing and testing such theories.</Paragraph> <Paragraph position="2"> COL-semantics hence proceeds in a way different from the semantic processing as it is found in the framework of artificial intelligence (AI-semantics).</Paragraph> <Paragraph position="3"> AI-semantics is not so much linked to the semantics of formal linguistics or logic but rather to cognitive psychology, problem solving theory and the theory of knowledge representation which has been recently put forward within AI itself. 1 Between both branches of semantic processing of natural language that are realized in computer systems there therefore exists a difference in aims, theories and methods.</Paragraph> <Paragraph position="4"> Starting from a brief sketch of the aims and theories of both approaches one essential methodological principle of COL-semantics will be elaborated in the second chapter of the paper. In the third chapter COL-semantic methods will be exemplified by a concrete application, the production of semantic representations in an information system. Stress will not be laid on the question of what a COL-semantic representation should look like but how levels of a semantic representation can be systematically related with natural language and with each other.</Paragraph> <Paragraph position="5"> Aims and theoretical concepts of COL-semantics and AI-semantics The difference of aims and methods can only be outlined here as far as it is relevant with respect to the methodological divergence which will be dealt with in detail: The aim of AI-semantics is the simulation of the human language understanding and/or language generating process that is to be understood as a manifestation of intelligent human problem solving behaviour. The aim of COL-semantics is the algorithmic generation of descriptive structures (of a generative-semantic, interpretative, logico-semantic or other type) out of a given natural language input. 
Both purposes can be partial aims or intermediate steps within a larger project like 'simulation of dialogue behaviour', 'natural language information or question answering system'.</Paragraph> <Paragraph position="6"> Thus the AI-approach leads to a theory where the object of explanation (or simulation) is &quot;rational human behaviour&quot; 2 or more specifically human language behaviour as a rational psychic process, whereas in the theory of linguistic semantics language is being objectified as a generated structure or a system which can be considered independently from the associated mental processes. In linguistic semantics and also in COL-semantics meta-linguistic notions which refer to language as a system like 'synonymy', 'equivalence' and (particularly in the formal linguistics based on logic) 'truth' and 'entailment' are crucial; in AI-semantics however we have the 'behaviour' oriented concepts of 'inferencing', 'disambiguating', 'reasoning', 'planning' etc. A methodological principle of COL-semantics A distinctive feature of linguistics, especially logico-linguistic theories, is the separation of different &quot;expression&quot; and &quot;content&quot; levels of analysis and representation and the specification of mapping rules between them (surface structure versus deep structure, syntactic structure versus semantic structure). In Montague grammar this differentiation between a well defined syntactic level and an also well defined semantic level of description is a methodologically necessary consequence of the &quot;Fregean&quot; principle. The Fregean principle states that the meaning of an expression can be determined on the basis of the meanings of its logically simple constituent expressions and the syntactic structure of the whole expression. 
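The Fregean principle can be illustrated by a minimal sketch in Python. It rests on assumptions that are not part of the paper: the toy lexicon, the encoding of syntax trees as tuples and the helper name `meaning` are all hypothetical, and serve only to show how the meaning of a whole expression may be computed from the meanings of its simple constituents together with its syntactic structure.

```python
# Hypothetical toy model (not from the paper): meanings of simple expressions.
# An individual is denoted by a string, a two-place relation by a set of pairs.
LEXICON = {
    "sample": "s1",
    "arsenic": "as",
    "contains": {("s1", "as")},
}


def meaning(tree):
    """Compute the meaning of a syntax tree bottom-up.

    A tree is either a word (str) or a tuple (functor, arg1, ..., argn);
    the tuple order encodes the syntactic structure of the expression.
    """
    if isinstance(tree, str):
        # Simple constituent: its meaning is taken from the lexicon alone.
        return LEXICON[tree]
    functor, *args = tree
    relation = meaning(functor)
    values = tuple(meaning(arg) for arg in args)
    # Complex expression: its truth value depends only on the meanings of
    # the constituents and on the order fixed by the syntactic structure.
    return values in relation


print(meaning(("contains", "sample", "arsenic")))  # True
print(meaning(("contains", "arsenic", "sample")))  # False
```

The two calls use the same constituents in different syntactic arrangements and receive different meanings, which is the sense in which the syntactic structure of the whole expression enters the computation.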
This principle has been revived by Montague and has been realized in his theory of language in such a way that the syntactic and the semantic structure of a natural language expression are respectively represented as expressions of formal systems (syntax and meaning algebras) between which there exist well defined formal relationships (homomorphisms).</Paragraph> <Paragraph position="7"> When this concept is transferred to the operationalizing of linguistic analysis in a computer system, it is ruled out to conceive the mapping from natural language into semantic representation as a simple integrated pass, where in the course of parsing a sentence the valid semantic interpretation is assigned to each occurring item or group of items and where the possibilities of inference and association with stored background knowledge are 'locally' realized without ever generating a full syntactic analysis.</Paragraph> <Paragraph position="8"> Omitting an explicit level of syntactic representation seems to be compatible with the Fregean principle only under the condition that the algorithm incorporates a grammar (in the technical sense of a consistent set of generating or accepting syntactic rules), but for reasons of optimization directly associates or applies semantic 'values' or 'rules' in processing the corresponding syntactic 'nodes' or 'rules' 4, or even allows a semantic control of rule selection without leaving the parsing mode. 
This condition however is mostly not maintained in AI parsing approaches where the one step processing is understood as a cognitively adequate analogue of human linguistic information processing and where even the terminal and non terminal symbols of the &quot;grammar&quot; are interpreted as semantic categories. 5 Syntactic and semantic representation in an information system The way of processing natural language according to the principles of COL-semantics shall be demonstrated by the linguistic component of a natural language information system. The description is oriented towards the application area and the structure of the system PLIDIS (information system for controlling industrial water pollution, developed at the Institut fuer deutsche Sprache, Mannheim). 6 Giving only the overall structure of the system, we have the following processing steps and levels: morphological analysis of natural language input → syntactic analysis (level of syntactic representation) → transduction into formal representation language (level of semantic representation) → interpretation (evaluation) against the database → answer generation. The formal representation language is the language KS, an extended first order predicate calculus, where the features going beyond predicate calculus are a many-sorted domain of individuals, lambda-abstraction and extended term building. 7 In the following, two aspects of the semantic representation will be treated: - the mapping between syntactically analyzed natural language expressions and their KS counterparts will be investigated - a differentiation between three levels of semantic representation will be accounted for: (level 1) syntax-interpretative level, (level 2) canonical level, (level 3) database-related level. All three levels follow the same syntax, i.e. 
the syntax of KS, and have the same compositional model theoretic semantics; they differ in their non logical constant symbols.</Paragraph> <Paragraph position="9"> Mapping natural language into the semantic representation language KS In analogy with Montague's &quot;theory of translation&quot; in &quot;Universal Grammar&quot; we assume that the syntactic structures of natural language (NL, here German) and the semantic language (here KS) are similar, i.e. there exists a translation function $f$ such that the following holds: (1.1.) Given the categories of a categorial grammar of NL, $f$ is a mapping from these categories to the syntactic categories of KS. I.e. if $\alpha, \beta_1, \ldots, \beta_n$ are basic categories of German, then $f(\alpha), f(\beta_1), \ldots, f(\beta_n)$ are syntactic categories of KS.</Paragraph> <Paragraph position="10"> If $\alpha/\beta_1/\ldots/\beta_n$ is a derived category (functor category) of NL, then $f(\alpha)/f(\beta_1)/\ldots/f(\beta_n)$ is a derived category of KS.</Paragraph> <Paragraph position="11"> (1.2.) If $a$ is an expression of category $\delta$ in NL ($a_\delta$), then $f(a)$ is an expression of category $f(\delta)$ in KS ($f(a)_{f(\delta)}$). (1.3.) The concatenation of an expression of the derived category $\alpha/\beta_1/\ldots/\beta_n$ with expressions of category $\beta_1, \ldots, \beta_n$ resulting in an expression of category</Paragraph> <Paragraph position="13"> with the category $f(\alpha)$ (concatenation and list construction are defined for categories instead of expressions in order to improve readability).</Paragraph> <Paragraph position="14"> Thus the 'transduction grammar' NL-KS is the triple $\langle G_{NL}, G_{KS}, f \rangle$. We now specify a minimal categorial grammar of German, $G_{NL}$. A particular feature of $G_{NL}$ is the analysis of verbs as m-ary predicates, i.e. 
in the categorial framework, as functions from m NP into S 8 and the analogous treatment of nouns as functor categories 9 taking their attributes as arguments.</Paragraph> </Section> </Section> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> -~ FORMEL </SectionTitle> <Paragraph position="0"> By applying the function $f$ we have obtained a grammar $G_{KS}$ for our semantic language KS in an inductive way. We now give the following lexical correspondence rules for some non logical expressions of NL, taken from the application area of PLIDIS.</Paragraph> <Paragraph position="1"> With the given syntactic and lexical rules we can generate the following level 1 representations of two natural language sentences: Enthielt die Probe bei Lauxmann Arsen? Did contain the sample from Lauxmann arsenic? (of polluted (name of a</Paragraph> <Paragraph position="3"> Meaning postulates for generating canonical representations Both sentences have received different representations on level 1; they are nevertheless synonymous, at least as far as the context of information seeking is concerned.</Paragraph> <Paragraph position="4"> An important principle in COL-semantics is the notion of structural (not lexical) synonymy. The following intuitively valid synonymy postulates (meaning postulates) can be formulated.</Paragraph> <Paragraph position="5"> holder&quot; attribute, under the precondition that the central noun of the NP systematically admits 10 n+1 attributes: eine Probe ('a sample of sewage') is synonymous with eine Probe bei einem Betrieb ('a sample of an industrial plant')</Paragraph> <Paragraph position="7"> The application of this principle may be iterated.</Paragraph> <Paragraph position="8"> (2) There are verb classes the elements of which have no descriptive meaning (&quot;non-content verbs&quot;; in German the so-called &quot;Funktionsverben&quot;, the copula sein and others). 
In such cases the NP as object or subject of the verb is the content bearer or 'principal' NP, i.e. it becomes the predicate of the proposition. Such a sentence is synonymous with a corresponding sentence containing a content verb equivalent in meaning to the content bearing NP. For example: arsenic.') In such a non-content verb proposition a noun phrase with a place holder attribute can also function as a &quot;second order&quot; principal NP, i.e. its unspecified attribute can be replaced by a &quot;filler&quot; NP, occurring as argument of the non-content verb: Arsengehalt liegt bei Lauxmann in der Probe vor. is synonymous with Die Probe bei Lauxmann enthält Arsen.</Paragraph> <Paragraph position="9"> Both postulates shall be applied for transducing the level 1 representations of NL sentences into level 2 representations. We first give a definition of 'principal term', i.e. the KS construction corresponding to a 'principal NP'. (Def.) A principal term in a formula containing as PRAED the translation of a non-content verb is a term that is capable, according to its semantic and syntactic structure, of embedding other argument terms of the translation of the non-content verb as its arguments.</Paragraph> <Paragraph position="10"> After shifting them onto the KS level, the operationalized version of the two principles is as follows: (1: maximality principle) When a NL-expression has n analyses ($n \geq 2$) in level 1 which only differ in the number of arguments, then the level 2 representation consists of the 'maximal' level 1 expression, i.e. the expression containing the largest number of arguments. Any missing arguments are to be substituted by (existentially bound) variables.</Paragraph> <Paragraph position="11"> (2: transformation principle) (2.1.) When the PRAED of a formula is the translation of a non-content verb, at least one of its arguments must be a principal term.</Paragraph> <Paragraph position="12"> (2.2.) 
A formula containing the translation of a non-content verb must be transformed into an expression which contains the PRAED of a principal term as predicate iff there is an unambiguous mapping of the arguments of the translation of the non-content verb a) into arguments of a principal term or b) into a principal term such that a well-formed formula of level 2 is obtained.</Paragraph> <Paragraph position="13"> We now state that PROBE and ENTHALT are 'maximal' expressions and PROBE1 and ENTHALT1 must be mapped into them respectively, and that the following further holds: VORLIEG is the translation of the non-content verb vorliegen; PROBE is the PRAED of a second order principal term with respect to a 'plant' argument; ENTHALT is the PRAED of a principal term with respect to a 'sample' argument. Then the two examples of level 1 are mapped into a single representation on level 2: [ENTHALT [JOTA [LAMBDA x [PROBE G-L x]]] AS1] The reduction of synonymous structures in the canonical level of representation meets the criteria of economy as they are necessary in a computer system. 11 As we have tried to show, however, it can be based upon general linguistic principles and need not be imputed to the field of &quot;world semantics&quot;. On the other hand, admitting paraphrases as natural language input (as our examples are) improves the system's &quot;cooperativeness&quot; towards the user. In PLIDIS special aspects of the world model are accounted for in the level 3 representations which mirror the relational structure of the data model to some extent. We cannot go into the details of the relationship between level 2 and level 3 for reasons of space.</Paragraph> <Paragraph position="14"> Comparison with other approaches Language processing systems that are oriented towards Montague grammar or model theoretic semantics are being developed among others by Friedman et al., Sondheimer and the PHLIQA1 group. 
A theoretical discussion of the relationship between model theoretic semantics and AI-semantics can be found in Gunji and Sondheimer; cf. also Hobbs and Rosenschein, St. Bien, and Wilks (with a contrary view). The methodological ideas presented here are most closely related to the approach of multi-level semantics pursued in PHLIQA1. But unlike the PHLIQA1 approach we regard the level(s) of linguistic representation not only under the more formal aspect of syntax interpretation but, as the last chapters show, we also take into account aspects of semantics of natural language word classes and structural synonymy.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> Notes </SectionTitle> <Paragraph position="0"> 1 There are certainly important interactions with empirical semantic work done in the last 10 years; thus Ortony and Wilks stress the pervasive influence of Fillmore. Like any other systematic distinction, the one between formal linguistic semantics and AI-semantics is somewhat simplifying: Within AI there are semantic approaches which are more or less oriented towards formal logic, such as those of McCarthy, Creary or Nash-Webber and Reiter and others. As typical AI-semantic approaches we regard the ones of Schank and his colleagues, Wilks or Charniak (cf. for instance the articles in Charniak and Wilks).</Paragraph> </Section> </Section> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Hayes, 9 3 Slightly exaggerating this tendency </SectionTitle> <Paragraph position="0"> is formulated by Schank (in Schank et al.): &quot;Researchers in NLP (natural language processing in AI) have become less and less concerned with language issues per se. We are more interested in inferencing and memory models for example.&quot; (p. 1008) 4 Such systems are presented for instance in Riesbeck, Norman and Rumelhart, and even more programmatically in Schank et al., DeJong. 
Also in systems conceived as data base interfaces like LIFER (Hendrix) and PLANES (Waltz) &quot;semantic&quot; grammars are used. A theoretical discussion on the role of syntax can be found in Schank et al.</Paragraph> <Paragraph position="1"> 5 I.e. one has to check whether, in systems containing only &quot;part grammars&quot; or working with a syntactic &quot;pre-processing&quot;, the syntactic rules which were actually used can be combined into a coherent and consistent grammar. Questions of syntactic-semantic and purely semantic grammars underlying parsers are also discussed from a theoretical point of view in Wahlster.</Paragraph> <Paragraph position="2"> 6 The system PLIDIS is described in Kolvenbach, Lötscher and Lutz.</Paragraph> <Paragraph position="3"> 7 The language KS (&quot;Konstruktsprache&quot;) is described in Zifonun.</Paragraph> <Paragraph position="4"> 8 Cresswell gives an analogous categorial description for verbs. As in this minimal grammar, phenomena of word order are neglected in applying the rule of concatenation.</Paragraph> <Paragraph position="5"> 9 Keenan and Faltz introduce the category of &quot;function noun&quot; (in our framework O-N/NP) 10 The vague condition of &quot;systematically admitting&quot; is made concrete in PLIDIS by prescribing a semantic &quot;sort&quot; for each argument of a predicate.</Paragraph> <Paragraph position="6"> 11 This reduction is done in PLIDIS with the help of meaning postulates which are interpreted by a theorem prover.</Paragraph> </Section> </Paper>