<?xml version="1.0" standalone="yes"?> <Paper uid="C96-2181"> <Title>NKRL, a Knowledge Representation Language for Narrative Natural Language Processing</Title> <Section position="5" start_page="1033" end_page="1034" type="evalu"> <SectionTitle> 4. Inferences and NL processing </SectionTitle> <Paragraph position="0"> Each of the four components of NKRL is characterised by its association with a class of basic inference procedures. For example, the key inference mechanism for the factual component is the Filtering and Unification Module (FUM). The primary data structures handled by FUM are the &quot;search patterns&quot; that represent the general properties of the information to be searched for, by filtering or unification, within a knowledge base of occurrences. The most interesting component of the FUM module is the matching algorithm, which unifies the complex structures -- like &quot;(SPECIF summoning_1 (SPECIF board_meeting_1 mediobanca_ special))&quot; in occurrence c2 of Fig. 1 -- that, in the NKRL terminology, are called &quot;structured arguments&quot;. Structured arguments are built up in a principled way by making use of a specialised sub-language which includes four expansion operators: the &quot;disjunctive operator&quot;, the &quot;distributive operator&quot;, the &quot;collective operator&quot;, and the &quot;attributive operator&quot; (SPECIFication); see (Zarri, 1996) for more details.</Paragraph> <Paragraph position="1"> The basic inference mechanisms can then be used as building blocks for implementing all sorts of high-level inference procedures. An example is given by the &quot;transformation rules&quot;, see (Ogonowski, 1987). NKRL's transformations deal with the problem of obtaining a plausible answer from a database of factual occurrences even in the absence of the explicitly requested information, by searching for semantic affinities between what is requested and what is actually present in the base. 
The fundamental principle employed is then to &quot;transform&quot; the original query into one or more different queries which -- unlike &quot;transformed&quot; queries in a database context -- are not strictly &quot;equivalent&quot; but only &quot;semantically close&quot; to the original one.</Paragraph> <Paragraph position="2"> With respect now to the NL/NKRL translation procedures, they are based on the well-known principle of locating, within the original texts, the syntactic and semantic indexes which can evoke the conceptual structures used to represent these texts. Our contribution has consisted in setting up a rigorous algorithmic procedure, centred around the two following conceptual tools: * The use of rules -- evoked by particular lexical items in the text examined and stored in proper conceptual dictionaries -- which take the form of generalised production rules. The left hand side (antecedent part) is always a syntactic condition, expressed as a tree-like structure, which must be unified with the results of the general parse tree produced by the syntactic specialist of the translation system. If the unification succeeds, the right hand sides (consequent parts) are used, e.g., to generate well-formed templates (&quot;triggering rules&quot;).</Paragraph> <Paragraph position="3"> * The use, within the rules, of clever mechanisms to deal with the variables. 
For example, in the specific &quot;triggering&quot; family of NKRL rules, the antecedent variables (a-variables) are first declared in the syntactic (antecedent) part of the rules, and then &quot;echoed&quot; in the consequent parts, where they appear in the form of arguments and constraints associated with the roles of the activated templates.</Paragraph> <Paragraph position="4"> Their function is that of &quot;capturing&quot; -- during the match between the antecedents and the results of the syntactic specialist -- NL or H_CLASS terms to be then used as specialisation terms for filling up the activated templates and building the final NKRL structures.</Paragraph> <Paragraph position="5"> A detailed description of these tools can be found, e.g., in (Zarri, 1995); see also Azzam (1995). Their generality and their precise formal semantics make possible, e.g., the quick production of useful sets of new rules by simply duplicating and editing the existing ones.</Paragraph> <Paragraph position="6"> We now reproduce, in Fig. 5, one of the several triggering rules to which the lexical entry &quot;call&quot; -- pertaining to the NL fragment examined at the beginning of Section 3 -- contains a pointer, i.e., one of the rules corresponding to the meaning &quot;to issue a call to convene&quot;. This rule allows the activation of a basic template (PRODUCE4.12) giving rise, at a later stage, to the occurrence c2 of Fig. 1; the x symbols in Fig. 5 correspond to a-variables.</Paragraph> <Paragraph position="7"> We can remark that not all the details of the full template are actually stored in the consequent, given that the H_TEMP hierarchy is part of the &quot;common shared data structures&quot; used by the translator. Only the parameters relating to the specific triggering rule are, therefore, really stored. For example, in Fig. 
5, the list &quot;constr&quot; specialises the constraints on some of the variables, while others -- e.g., the constraints on the variables x1 (human_being/social_body) and x4 (planning_activity) -- are unchanged with respect to the constraints permanently associated with the variables of template PRODUCE4.12.</Paragraph> <Paragraph position="8"> [Fig. 5, only partially recoverable: trigger: &quot;call&quot;; syntactic condition: (s (subj (np (noun x1))) (vcl (voice active) (t = x2 = call)) ...] The &quot;standard&quot; prototype of an NL/NKRL translation system -- e.g., the COMMON LISP translator realised in the NOMOS project -- is a relatively fast system which takes 3 min 16 s on a Sun SparcStation 1 with 16 Mb to process a medium-size text of 4 sentences and 150 wordforms; it takes 1 min 06 s for the longest sentence. This pure conceptual parser, however, is not suitable, per se, for dealing directly with huge quantities of unrestricted data. In the COBALT project, we have therefore used a commercial product, TCS (Text Categorisation System, by Carnegie Group), to pre-select from a corpus of Reuters news stories those concerning in principle the chosen domain (financial news about mergers, acquisitions, capital increases, etc.). The candidate news items (about 200) have then been translated into NKRL format, and examined through a query system in order to i) confirm their relevance; ii) extract their main content elements (actors, circumstances, locations, dates, amounts of shares or money, etc.). Of the candidate news stories, 80% have been (at least partly) successfully translated; &quot;at least partly&quot; means that, sometimes, the translation was incomplete due, e.g., to the difficulty of correctly instantiating some binding structures. Other quantitative information about the COBALT results can be found in (Azzam, 1995; Zarri, 1995).</Paragraph> </Section> </Paper>