<?xml version="1.0" standalone="yes"?>
<Paper uid="M95-1021">
  <Title>We correctly identified the following interesting cases: &quot;Ms. Washington&quot;, &quot;Mr. York&quot; and &quot;Ms. Lansing&quot; were not confused with locations</Title>
  <Section position="1" start_page="0" end_page="0" type="metho">
    <SectionTitle>
WAYNE STATE UNIVERSITY: DESCRIPTION of
</SectionTitle>
    <Paragraph position="0"> the UNO NATURAL LANGUAGE PROCESSING SYSTEM as USED for MUC-6  ph. (313) 577-1667, fx: (313) 577-6868</Paragraph>
  </Section>
  <Section position="2" start_page="0" end_page="263" type="metho">
    <SectionTitle>
INTRODUCTION
</SectionTitle>
    <Paragraph position="0"> Our Research Hypothesis: The UNO natural language processing (NLP) system implements a Boolean algebra computational model of natural language [Iwanska, 1992] [Iwanska, 1993] [Iwanska, 1994] [Iwanska, 1996b] and reflects our research hypothesis that natural language is a very expressive, yet computationally tractable, knowledge representation and reasoning system with its own representational and inferential machinery. One of our goals is to demonstrate experimentally that an NLP system that closely parallels the representational and inferential characteristics of natural language can achieve in-depth processing (e.g., querying knowledge bases created automatically from texts, or entity classification) with close-to-real-time, high-recall-and-precision performance.</Paragraph>
    <Paragraph position="1"> Why and What for MUC-6: MUC6 Tasks Not Inconsistent with Our Goals. We view the MUC6 tasks as not inconsistent with our goal of providing further experimental evidence in support of our research hypothesis. Also, we have been conducting experiments in non-committal processing with controlled resource allocation, such as exploiting local interpretations, taking advantage of local contexts, and performing expensive computations such as disambiguation only if needed. We have also been experimenting with flexible processing such as undoing various decisions and independent processing of related tasks. The NE MUC-6 task is a good test for such experiments.</Paragraph>
    <Paragraph position="2"> Coreference -- Most Interesting Task: We were particularly interested in the coreference task, and largely regarded the NE task as a preparation stage for it, because:  1. Reference resolution is critical for processing virtually any text in any domain; 2. We believe that the mechanism for resolving references to named entities is basically the same regardless of the entities' types. Identifying certain named entities, and therefore resolving references to them, may be facilitated by extra constraints on the beginning or the end of the name. For example, identifying personal and company names is facilitated by the fact that personal names often start with a title such as &quot;Ms.&quot; and company names often end with a corporate extension such as &quot;Ltd&quot;.</Paragraph>
    <Paragraph position="3">  3. We are in the process of demonstrating that natural language serves for acquiring knowledge, rather than knowledge being acquired in order to process natural language. Our very encouraging preliminary results show that definite anaphora facilitates acquisition of taxonomic knowledge [Iwanska, 1996a], including the &quot;type-subtype&quot; and &quot;part-subpart&quot; relations. These results demonstrate that approaches to definite anaphora resolution that rely primarily on the availability of such knowledge, a standard approach in all NLP systems we are aware of, are misguided.</Paragraph>
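    <Paragraph> The boundary cues mentioned in item 2 above (a leading title, a trailing corporate extension) can be illustrated with a minimal sketch. The cue lists, the function name `classify_name` and the company name in the usage note are hypothetical and are not taken from the UNO system, which is not written in Python.

```python
from typing import Optional

# Hypothetical, deliberately tiny cue lists; a real system would use far larger ones.
PERSONAL_TITLES = ("Mr.", "Ms.", "Mrs.", "Dr.")
CORPORATE_EXTENSIONS = ("Ltd", "Ltd.", "Inc.", "Corp.", "Co.")


def classify_name(tokens: list[str]) -> Optional[str]:
    """Guess the type of a capitalized name from its first or last token."""
    if not tokens:
        return None
    if tokens[0] in PERSONAL_TITLES:
        return "PERSON"
    if tokens[-1] in CORPORATE_EXTENSIONS:
        return "ORGANIZATION"
    return None  # no boundary cue: leave the decision to later processing
```

With these assumptions, `classify_name(["Ms.", "Washington"])` yields `"PERSON"` even though the bare token also names a location, and `classify_name(["Acme", "Ltd"])` yields `"ORGANIZATION"`. </Paragraph>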
    <Paragraph position="4"> Successful Techniques for Developing and Testing In-Depth Processing of a Large Number of Texts: Finally, by participating in MUC6 we hoped to share our experience and learn from the other participants about successful techniques for developing and testing in-depth processing of a large number of texts (more than two thousand) and for developing large knowledge bases.</Paragraph>
  </Section>
  <Section position="3" start_page="263" end_page="265" type="metho">
    <SectionTitle>
UNO MODEL OF NATURAL LANGUAGE
</SectionTitle>
    <Paragraph position="0"> Closely Mimics Representation and Reasoning Inherent in Natural Language: The UNO model closely mimics the representation and reasoning inherent in natural language because its translation procedure, the data structures of its representation, and its inference engine are motivated by the semantics and pragmatics of natural language. One payoff of this close correspondence to natural language is the capability of automatically creating and querying knowledge bases from textual documents¹.</Paragraph>
    <Paragraph position="1"> Sentences asserting properties of (sets of) individuals, sentences describing subtyping relations, including extensional type definitions, as well as intensional definitions of concepts, such as  1. &quot;John is a neither good nor hard-working nurse&quot;, &quot;Not many students did well&quot; 2. &quot;Dobermans, poodles and terriers are dogs&quot; 3. &quot;Elephant -- a huge, thick-skinned mammal with very few hairs, with a long, flexible snout, and two ivory tusks&quot; are uniformly represented by the following type equations: type == { &lt;P1, TP1&gt;, &lt;P2, TP2&gt;, ..., &lt;Pn, TPn&gt; }  Their left-hand side, &quot;type&quot;, is the representation of a noun phrase, the largest type, or the name of a concept; the right-hand side is a set of two-element pairs, each consisting of a property value P and TP, a set of &lt;t, p&gt; elements representing the fact that the property value P holds at a temporal interval t with the probability p. Individual property values, temporal intervals and probabilities are represented by sets [a1, a2, ..., an] whose elements ai are terms: record-like structures consisting of a head, which is a type symbol, and a body, which is a list of attribute-value pairs of the form attribute =&gt; value. For example, the complex noun &quot;sick, very unhappy woman&quot; is represented by [ woman(health =&gt; sick, happy =&gt; (not happy)(degree =&gt; very)) ], whose only term has the type &quot;woman&quot; as its head and two attributes: &quot;health&quot; with the value sick, and &quot;happy&quot; with the value (not happy)(degree =&gt; very). Semantically, this data structure represents the subset of individuals of the type &quot;woman&quot; for which the attribute &quot;health&quot; has the value &quot;sick&quot; and the function &quot;happy&quot; has the value &quot;very unhappy&quot;.</Paragraph>
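    <Paragraph> The term structure described above (a head, an optional negation, and a body of attribute-value pairs) can be mirrored in a small record type. This is only an illustrative sketch: the class name `Term`, its field names and the `render` method are assumptions made here, not the UNO system's own data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """A head (type symbol), an optional negation, and attribute-value pairs."""
    head: str
    attributes: tuple = ()      # pairs of (attribute name, Term value)
    negated: bool = False

    def render(self) -> str:
        """Print the term in the attribute => value notation used in the text."""
        head = f"(not {self.head})" if self.negated else self.head
        if not self.attributes:
            return head
        body = ", ".join(f"{name} => {value.render()}" for name, value in self.attributes)
        return f"{head}({body})"


# "sick, very unhappy woman"
unhappy = Term("happy", (("degree", Term("very")),), negated=True)
woman = Term("woman", (("health", Term("sick")), ("happy", unhappy)))
```

Calling `woman.render()` reproduces the notation of the example: `woman(health => sick, happy => (not happy)(degree => very))`. </Paragraph>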
    <Paragraph position="2"> Various relations such as entailment, its dual subsumption (set-inclusion), and negation (set-complement) are automatically computed, and the intuitively and formally correct results are guaranteed to hold. The UNO NLP system uses such type equations bi-directionally: for answering questions about the properties of a particular individual, and for matching particular properties against the properties of individuals in its knowledge base.</Paragraph>
    <Paragraph position="3"> ¹ Needless to say, limited by the subset of English covered by the model.</Paragraph>
    <Paragraph position="4">  Solid Computational and Mathematical Framework in Line with Linguistic Theories: The UNO representation offers a solid computational and mathematical framework in line with linguistic theories. Updating the knowledge base and automated inferencing are done by the same semantically clean computational mechanism of performing Boolean operations on the representation of natural language input and the representation of previously obtained information stored in the knowledge base. The underlying knowledge representation formalism, computable Boolean algebras with set-theoretic and interval-theoretic semantics, allows one to capture the semantics of different syntactic categories because sets and intervals underlie the semantics of many syntactic categories: common nouns, intransitive verbs, and adjectives can be thought of as denoting sets of persons or objects that possess properties denoted by the words; adjectives and adverbs are functions mapping sets of objects into sets of objects; determiners are functions mapping sets of objects into sets of sets of objects; and the denotations of proper nouns are sets of sets of objects [Dowty et al., 1981] [Barwise and Cooper, 1981] [Keenan and Faltz, 1985] [Hamm, 1989]. The same machinery is used as a metalanguage for describing and propagating arbitrary Boolean constraints, including dictionary entries describing morphological and grammatical constraints. The data structures are partially specified, negative constraints are propagated via unification, and the nonmonotonicity of negation [Pereira, 1987] is not problematic.</Paragraph>
    <Paragraph position="5"> The UNO model shares many computational characteristics with the programming language LIFE [Ait-Kaci and Richard Meyer and Peter Van Roy, 1993] because the efficiently computable calculus that underlies LIFE [Ait-Kaci, 1986] is extended in the UNO model to handle negation and generalized quantifiers. The linguistic theories that the UNO model encompasses or extends include insights of the Montague semantics of natural language [Montague, 1973] [Dowty et al., 1981], the Boolean algebra mathematical models of [Keenan and Faltz, 1985], the theory of generalized quantifiers [Barwise and Cooper, 1981] [Hamm, 1989], the theory of the pragmatic inference of quantity-based implicature of [Horn, 1972] [Horn, 1989], and the theory of negation in natural language of [Horn, 1989].</Paragraph>
    <Section position="1" start_page="264" end_page="265" type="sub_section">
      <SectionTitle>
Recent Extension-- Temporal Reasoning
</SectionTitle>
      <Paragraph position="0"> We have recently extended the UNO model to incorporate temporal reasoning. This extension demonstrates that important inferences about time can be captured by a general representation and reasoning mechanism inherent in natural language, many aspects of which are closely mimicked by the UNO model. We have shown that computing logical, context-independent inferences and non-monotonic, context-dependent inferences for temporal and non-temporal objects is almost exactly analogous.</Paragraph>
      <Paragraph position="1"> Theory and Practice: We are committed to addressing research problems with a strong promise for facilitating the processing of natural language input. For example, we decided to extend the UNO model of natural language to handle temporal information because virtually all real-life tasks involve handling some aspects of time. There is a large body of existing work on morphologically marked time and aspect, but we decided against handling this type of temporal information because it requires high-recall, high-precision sentential-level parsing, a task that no NLP system, including ours, can perform well. Instead, we decided to address temporal information from explicit temporal expressions because we can recover such expressions extremely reliably via local parsing.</Paragraph>
      <Paragraph position="2"> Our natural-language-based temporal reasoner was developed and tested on more than three hundred 1989 &quot;Wall Street Journal&quot; articles. We chose a batch of 300 WSJ articles somewhat randomly and, using SGML-like marks, an opening mark &lt;TE&gt; and a closing mark &lt;/TE&gt;, marked all expressions of different syntactic categories that contained any information pertaining to time. Incidentally, the number of articles coincides with the number of articles in the MUC-6 development data.</Paragraph>
      <Paragraph position="3"> Our temporal reasoner automatically extracts explicit temporal expressions from on-line textual documents and creates their representation. This representation allows the system to compute entailed logical, context-independent, deductive inferences and facilitates computing context-dependent, non-monotonic inferences, including implicature, specialization, and generalization.</Paragraph>
      <Paragraph position="4">  For any set of English temporal expressions, their information content can be computed and compared, which allows the system to compute answers to &quot;Yes-No&quot; questions about various aspects of time, answers to &quot;When?&quot;, &quot;How long?&quot; and &quot;How often?&quot; queries of the resulting knowledge base and, to a limited extent, the temporal ordering of the events described in the documents.</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="265" end_page="270" type="metho">
    <SectionTitle>
UNO NLP SYSTEM
</SectionTitle>
    <Paragraph position="0"> In this section, we briefly comment on the most distinctive characteristics of our system, as requested by the MUC-6 program committee; technical details can be found in the references provided earlier. Whenever possible, we illustrate our system's capabilities with examples from the walkthrough article.</Paragraph>
    <Paragraph position="1"> Demonstrated by the pre-MUC6 Research and Implementation: Reasoning with Explicit Negative, Disjunctive, and Conjunctive Information at All Syntactic Levels. One consequence of this rather unique capability is flat taxonomies: complex Boolean types need not be stored explicitly, which prevents the unnecessary, but common, exponential growth of a knowledge base. For example, the UNO system does not need to explicitly store the entailment (subsumption) relation between the complex disjunctive type &quot;power-boat or sail-boat&quot; and the lexically simple type &quot;boat&quot; because this relation is directly (and cheaply) computed from the UNO representation of these two expressions, common nouns in this case.</Paragraph>
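    <Paragraph> One way to picture how such a relation can be computed rather than stored is the sketch below. It assumes that a type is a list of alternative terms (a disjunction) and that each term is a head plus an attribute dictionary; the attribute name `propulsion` and both function names are hypothetical, and the actual UNO computation may differ.

```python
def term_entails(specific, general) -> bool:
    """A term entails another if the heads match and it carries every attribute of the other."""
    head_s, attrs_s = specific
    head_g, attrs_g = general
    return head_s == head_g and set(attrs_g.items()).issubset(attrs_s.items())


def type_entails(specific, general) -> bool:
    """A disjunctive type entails another if each of its alternatives entails some alternative there."""
    return all(any(term_entails(s, g) for g in general) for s in specific)


# "power-boat or sail-boat" as two alternatives; "boat" as a single unmodified term.
power_or_sail = [("boat", {"propulsion": "power"}), ("boat", {"propulsion": "sail"})]
boat = [("boat", {})]
```

Under these assumptions `type_entails(power_or_sail, boat)` holds while the converse does not, and no link between the two types had to be stored in advance. </Paragraph>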
    <Paragraph position="2"> Flat taxonomies are highly desirable because, among other benefits, they make knowledge base maintenance easier and improve its quality.</Paragraph>
    <Paragraph position="3"> Handling general negation in natural language allows the UNO system to correctly compute possible interpretations of the sentence &quot;Mr. Dooner doesn't see a creative malaise permeating the agency&quot;, one of which is that Mr. Dooner sees something else. This correct interpretation is automatically preferred by the system because of the context created by the immediately following sentence &quot;He points to several campaigns with pride, including the Taster's Choice commercials that are like a running soap opera&quot;.</Paragraph>
    <Paragraph position="4"> Adverbial and Adjectival Modification at All Syntactic Levels Adverbial and adjectival modification also contributes to the flatness of our taxonomies . For example, the</Paragraph>
    <Section position="1" start_page="265" end_page="266" type="sub_section">
      <SectionTitle>
Nonstandard Quantifiers
</SectionTitle>
      <Paragraph position="0"> Some of the nonstandard quantifiers that our system can handle include vague quantifiers involving the determiner &quot;many&quot;. The UNO system automatically computes relations between the following expressions: &quot;many differences&quot;, &quot;very many differences&quot;, &quot;not very many differences&quot;. Reasoning with Uncertainty and Qualitative Probabilistic Reasoning without Underlying Numeric Values: Just like the other systems participating in MUC-6, our system often misses a high-level relation between entities described in a sentence because our parser does not attempt, or fails to compute, a full sentential-level parse. However, even without such a quality parse, the system is capable of automatically computing relations such as the relation between the following two expressions: &quot;possible acquisitions&quot;, &quot;possible, but not very likely, acquisitions&quot;.  While the lack of a quality parse may prevent the system from understanding a high-level relation (in this case, what exactly was said about possible acquisitions), it understands the difference between &quot;possible, but not very likely&quot; and &quot;possible&quot; acquisitions.</Paragraph>
    </Section>
    <Section position="2" start_page="266" end_page="266" type="sub_section">
      <SectionTitle>
Temporal Reasoning
</SectionTitle>
      <Paragraph position="0"> The UNO system not only identifies explicit temporal expressions, but also automatically reasons with them. Some of the expressions uniquely handled by our system include the following:  1. Temporal frequency quantifiers: &quot;Not very often&quot;, &quot;Very often&quot; 2. Infinite number of temporal relations: &quot;Not in the immediate future&quot;, &quot;immediately precedes&quot;, &quot;Shortly before or after&quot;, &quot;Not long before&quot;, &quot;Five or six days before&quot; 3. Uncertain and underspecified (explicitly negative) temporal information:  &quot;It did not happen on April 22, 1992&quot;, &quot;X happened in April, 1992, or X happened in May, but not early May, 1992&quot;</Paragraph>
    </Section>
    <Section position="3" start_page="266" end_page="266" type="sub_section">
      <SectionTitle>
Reasoning with Underspecified Information
</SectionTitle>
      <Paragraph position="0"> The UNO system computes the fact that the following expressions differ in their information content and that the first has strictly more information than the second: &quot;X happened in May, 1992&quot;, &quot;X happened in May, but not early May, 1992&quot;</Paragraph>
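      <Paragraph> A set-of-days reading makes the comparison concrete. The sketch below is an assumption-laden illustration, not the UNO interval representation: it reads each expression as the set of calendar days on which X may have happened, and it takes "early May" to mean the first ten days of the month, a cutoff the paper does not define. It only shows that the two denotations differ and that one strictly contains the other; how that ordering is named is left to the model.

```python
from datetime import date, timedelta


def days(start: date, end: date) -> frozenset:
    """All calendar days from start to end, inclusive."""
    return frozenset(start + timedelta(days=i) for i in range((end - start).days + 1))


def entails(a: frozenset, b: frozenset) -> bool:
    """Set inclusion: every day allowed by a is also allowed by b."""
    return a.issubset(b)


may_1992 = days(date(1992, 5, 1), date(1992, 5, 31))
# Assumption: "early May" is the first ten days of the month.
early_may_1992 = days(date(1992, 5, 1), date(1992, 5, 10))
may_not_early = may_1992 - early_may_1992
```

Here `entails(may_not_early, may_1992)` holds and the converse fails, so the two expressions are comparable but not equivalent. </Paragraph>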
    </Section>
    <Section position="4" start_page="266" end_page="270" type="sub_section">
      <SectionTitle>
Automatic Combination of Information
</SectionTitle>
      <Paragraph position="0"> Our system offers a highly efficient Boolean &quot;meet&quot; operation which is mathematically guaranteed to combine information in the most general way. This UNO operation provides an alternative to the ad-hoc merging employed by many other systems. We must note, however, that currently this advantage of our system is rarely realized in practice because it strongly depends on a correct, quality parse of input sentences.</Paragraph>
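      <Paragraph> In the powerset algebra of a finite set, the meet is set intersection, and the result is the most general element that entails both inputs. The sketch below uses that standard algebra over the twelve months of 1992 as a stand-in; the names `UNIVERSE`, `meet` and `complement` are chosen here for illustration and the UNO operation itself works on its own term representation.

```python
UNIVERSE = frozenset(range(1, 13))  # months of 1992, numbered 1 to 12


def meet(a: frozenset, b: frozenset) -> frozenset:
    """Greatest lower bound in the powerset algebra: set intersection."""
    return a.intersection(b)


def complement(a: frozenset) -> frozenset:
    """Boolean negation relative to the universe."""
    return UNIVERSE.difference(a)


april_or_may = frozenset({4, 5})          # "X happened in April or May"
not_april = complement(frozenset({4}))    # "X did not happen in April"
combined = meet(april_or_may, not_april)  # what both statements jointly allow
```

The combined information is `frozenset({5})`: May is the only month compatible with both statements, and any other set entailing both inputs is contained in it. </Paragraph>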
      <Paragraph position="1"> Uniform Natural-Language-Based Representation of Taxonomic and Temporal Reasoning As we explain later, this uniformity of our representation greatly simplifies our architecture and control .</Paragraph>
      <Paragraph position="2"> Work Done during June-October 1995: Below we elaborate on the work done between the end of June and the beginning of October 1995. We briefly comment on the scope, importance and quantity of the tasks we decided to do.</Paragraph>
      <Paragraph position="3"> 1. Implemented changes necessary to participate in MUC6. For example, before MUC-6, the UNO system preserved the exact image of the input text, but it did not keep track of the correspondence between the resulting knowledge base and the actual pieces of input text from which the knowledge base resulted. For MUC-6, we had to redesign our bookkeeping structures in order to be able to do this. We devoted considerable effort to this task because such a change directly affects every processing stage.</Paragraph>
      <Paragraph position="4"> We also had to develop functions for choosing markables and outputting SGML-tagged text.</Paragraph>
      <Paragraph position="5">  2. Improved the accuracy of identifying unmarked sentential boundaries. Our pre-MUC6 system was quite good at correctly identifying sentential boundaries in newspaper articles. A simple list of 200-or-so standard abbreviations and sensitivity to the most common occurrences of periods in numbers were largely responsible for this good performance. We wanted the system to perform this task near-perfectly because it would improve this generally needed capability and because, based on the existing literature, we expected the distance expressed as a number of sentences to be a very important factor in computing pronominal referents.</Paragraph>
      <Paragraph position="6"> 3. Improved handling of punctuation. It is fair to say that before MUC-6, we largely ignored, as opposed to handled, punctuation. Now we can handle periods, exclamation marks, most unpaired single quotes, most commas, and some dashes. While handling punctuation is good in general, specifically for MUC-6 it is needed for processing numbers and facilitating the identification of appositives.</Paragraph>
      <Paragraph position="7"> 4. Completed half-built semantic hierarchy of structured geographical knowledge .</Paragraph>
      <Paragraph position="8"> The UNO NLP hierarchy of geographical knowledge contains major geographical information about all countries, including capital cities, major and important cities, towns, ports, suburbs, local settlements, geographical and political regions that divide land such as provinces, islands, major ports and airports, landmarks, monetary, length, area, and volume systems, official languages, major political organizations, waters such as seas, lakes, and rivers, and geographical landmarks and points of interest such as mountains, hills, woods, and national parks.</Paragraph>
      <Paragraph position="9"> This geographical knowledge is encoded in our uniform, general-purpose UNO knowledge representation; the UNO NLP system supports geographical reasoning with its general inferencing mechanism.</Paragraph>
      <Paragraph position="10"> We certainly did not need to do this for MUC-6; a simple gazetteer list would do (an approach adopted by most MUC-6 participants). However, we put a lot of effort into encoding this hierarchy because: (a) With the existence of this hierarchy, we further substantiate our claim that natural language is a powerful and efficient knowledge representation system, and we add geographical knowledge to the list of uniformly represented and reasoned-about types of knowledge. Right now, our system can reason about geographical region containment in exactly the same fashion as about the type-subset relation and temporal interval containment.</Paragraph>
      <Paragraph position="11"> (b) We wanted to demonstrate that the same in-depth mechanism for some aspects of geographical reasoning can be efficiently used to perform a much lesser, MUC6-like task of marking locations. 5. Identified common named types, including organization types and existing named entities. We identified more than 100 types other than the type &quot;organization&quot; and developed sizable knowledge bases and dictionaries with the actual existing, classified named entities of these types. We decided to do it in order to experimentally substantiate our belief that the reference mechanism for named entities is basically the same for all entity types, and that references can be computed by the same piece of code.</Paragraph>
      <Paragraph position="12"> 6. Implemented numbers and personal names. Our effort to make the UNO system process numbers and to improve its handling of personal names was strictly related to MUC-6.</Paragraph>
      <Paragraph position="13"> 7. Developed and tested a general approach to handling abbreviations, acronyms and aliases. We spent much more effort on abbreviations, acronyms and aliases than originally planned. First, such short forms are very common in written language. Second, handling such short forms resembles handling semantic ambiguity. And third, short forms fit with our ongoing research on context.</Paragraph>
      <Paragraph position="14"> 8. Implemented quick-and-dirty pronoun resolution. While this code appears to perform well, it breaks occasionally and needs to be further debugged.  [...] is a simple and flexible architecture of our NLP system: 1. All UNO modules access the knowledge representation module and share its uniform representation. 2. There is no need for external specialists such as knowledge representation systems or temporal reasoners. Our system uniformly represents and reasons with taxonomic, temporal and geographical knowledge.</Paragraph>
      <Paragraph position="15"> 3. With no external specialists, no interfaces to access them are needed, and therefore there is no need to translate between incompatible representations.</Paragraph>
      <Paragraph position="16"> An NLP system that needs to perform tasks beyond information extraction and to exhibit some in-depth processing such as question answering virtually always calls some external specialists, typically knowledge representation systems. As reported in the literature, translating between the representation of the NLP system and that of such an external specialist is very hard, and it tremendously complicates control [Palmer et al., 1993] [exp, 1996].</Paragraph>
      <Paragraph position="17">  UNO NLP Modules: The UNO NLP system consists of the following modules: Reader, Dictionary, Parser, Knowledge Representation, Discourse, and Learning.</Paragraph>
      <Paragraph position="18">  The first three, Reader, Dictionary, and Parser, are modules of the BILING system, an NLP system processing a large number of narratives written by bilingual English/Spanish students [Iwanska, 1989]. The changes to these old modules include augmenting the parser to produce the UNO representation of sentences, enhancing the morphological analyzer to handle prefixes, and supplying the reader with structures for storing the information gained at various stages of processing. The Knowledge Representation module implements the theory behind the UNO model of natural language.</Paragraph>
      <Paragraph position="19"> The Reader module contains functions for breaking input text into documents, paragraphs, sentences, and words. It recognizes abbreviated phrases, contractions, punctuation, etc. This module also contains functions for creating various structures from strings and LISP s-expressions, and routines for initializing global variables used by other modules.</Paragraph>
      <Paragraph position="20"> The Dictionary module contains functions for creating, updating, loading and checking the consistency of the UNO dictionary, and functions for performing morphological analysis of the input. Metaknowledge about the dictionary describes its content: it lists known features, specifies feature applicability to different syntactic categories, and describes possible and default values of different features (the default values are not shown explicitly in the entries). This metaknowledge facilitates maintaining the consistency of the dictionary. The dictionary is used by the morphological analyzer for supplying each input word with syntactic, semantic, and pragmatic information. The morphological analyzer can recognize and generate various forms of nouns and verbs (for example, cry, cries, crying, cried) and of adjectives (for example, angrier), derive adverbs from adjectives (for example, slowly), etc. The morphological analyzer handles both prefixes, e.g., the prefix im in the word impossible, and suffixes, e.g., less in the word brainless.</Paragraph>
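      <Paragraph> The prefix and suffix handling described above can be pictured as affix stripping against a stem lexicon. The following is a minimal sketch under that assumption; the lexicon, the affix lists and the function name `analyze` are hypothetical, and the UNO analyzer (written in LISP and driven by dictionary features) is certainly richer than this.

```python
# Hypothetical miniature lexicon and affix lists, for illustration only.
LEXICON = {"possible", "brain", "slow", "cry"}
PREFIXES = ("im", "un")
SUFFIXES = ("less", "ly", "ing")


def analyze(word: str) -> list:
    """Return (prefix, stem, suffix) analyses whose stem is in the lexicon."""
    analyses = []
    if word in LEXICON:
        analyses.append(("", word, ""))
    for prefix in PREFIXES:
        stem = word[len(prefix):]
        if word.startswith(prefix) and stem in LEXICON:
            analyses.append((prefix, stem, ""))
    for suffix in SUFFIXES:
        stem = word[:-len(suffix)]
        if word.endswith(suffix) and stem in LEXICON:
            analyses.append(("", stem, suffix))
    return analyses
```

For instance, `analyze("impossible")` finds the prefix `im` with the stem `possible`, and `analyze("brainless")` finds the stem `brain` with the suffix `less`; spelling changes such as cry/cries are outside the scope of this sketch. </Paragraph>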
      <Paragraph position="21"> The Parser module contains functions implementing a chart parser [Winograd, 1983] [Earley, 1985]. The parser produces both syntactic parse trees and the UNO semantic representation of the natural language input. The grammar allows a limited context-sensitivity via features on lexical categories and non-terminals. Each grammar rule is supplied with the name of a function translating the recognized expressions of natural language into the UNO representation.</Paragraph>
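      <Paragraph> The idea of pairing each grammar rule with a translation function can be sketched with a much simpler chart technique than the one cited: a CYK-style chart over a toy grammar in binary form. Everything below (the lexicon, the three rules, the tuple-shaped meanings and the function name `parse`) is an assumption made for illustration and does not reproduce the UNO grammar or its representation.

```python
from collections import defaultdict

# Toy lexicon: word -> list of (category, meaning).
LEXICON = {
    "the": [("Det", "the")],
    "system": [("N", "system")],
    "text": [("N", "text")],
    "reads": [("V", "read")],
}

# Each binary rule carries a translation function that builds the meaning
# of the parent from the meanings of its two children.
RULES = [
    ("NP", "Det", "N", lambda det, noun: noun),
    ("VP", "V", "NP", lambda verb, obj: (verb, obj)),
    ("S", "NP", "VP", lambda subj, vp: (vp[0], subj, vp[1])),
]


def parse(words):
    """Fill a CYK chart; chart[(i, j)] maps a category to one meaning of words[i:j]."""
    n = len(words)
    chart = defaultdict(dict)
    for i, word in enumerate(words):
        for category, meaning in LEXICON.get(word, []):
            chart[(i, i + 1)][category] = meaning
    for width in range(2, n + 1):
        for start in range(0, n - width + 1):
            end = start + width
            for split in range(start + 1, end):
                left, right = chart[(start, split)], chart[(split, end)]
                for parent, l_cat, r_cat, translate in RULES:
                    if l_cat in left and r_cat in right:
                        chart[(start, end)][parent] = translate(left[l_cat], right[r_cat])
    return chart[(0, n)].get("S")
```

With this toy grammar, `parse("the system reads the text".split())` returns the tuple `("read", "system", "text")`, and an input the grammar does not cover returns `None`. The sketch keeps only one meaning per category and span, so it does not model ambiguity or the feature-based context-sensitivity mentioned above. </Paragraph>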
      <Paragraph position="22"> The Knowledge Representation module consists of the Boolean algebras module, the Knowledge Base Interpreter, and the Inference Engine module. The Boolean algebras module implements the UNO knowledge representation formalisms and some standard Boolean algebras such as predicate calculus and the powerset of a finite set. The module contains functions for deriving the disjunctive normal form of a complex Boolean expression independently of its algebra, as well as functions for creating the representation of the element that this complex Boolean expression stands for.</Paragraph>
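      <Paragraph> Deriving a disjunctive normal form without reference to a particular algebra can be sketched on uninterpreted expressions. The encoding below (atoms as strings, compound expressions as tuples headed by `and`, `or` or `not`) and the function names `nnf` and `dnf` are assumptions made for this illustration; the sketch does not simplify contradictory or redundant conjunctions.

```python
from itertools import product


def nnf(expr, negate=False):
    """Push negations down to the atoms (negation normal form)."""
    if isinstance(expr, str):
        return ("not", expr) if negate else expr
    op = expr[0]
    if op == "not":
        return nnf(expr[1], not negate)
    # De Morgan: a negated conjunction becomes a disjunction and vice versa.
    flipped = {"and": "or", "or": "and"}[op] if negate else op
    return (flipped,) + tuple(nnf(arg, negate) for arg in expr[1:])


def dnf(expr):
    """Return a set of conjunctions, each a frozenset of literals."""
    expr = nnf(expr)
    if isinstance(expr, str) or expr[0] == "not":
        return {frozenset([expr])}
    parts = [dnf(arg) for arg in expr[1:]]
    if expr[0] == "or":
        return set().union(*parts)
    # "and": distribute over the disjunctions of the arguments.
    return {frozenset().union(*combo) for combo in product(*parts)}
```

For the earlier example "neither good nor hard-working", encoded as `("not", ("or", "good", "hard-working"))`, `dnf` returns a single conjunction containing the two negated literals. </Paragraph>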
      <Paragraph position="23"> The Knowledge Base Interpreter implements the interpreter of the sets of type equations encoding taxonomic, temporal and geographical knowledge .</Paragraph>
      <Paragraph position="24"> The Inference Engine module implements the UNO algorithm for representing and utilizing knowledge derived from natural language sentences. This algorithm updates the dynamic knowledge bases of the UNO system.</Paragraph>
      <Paragraph position="25"> The Discourse module implements anaphora resolution (functions for identifying referring expressions and computing referents if needed) and discourse processing (functions for computing certain discourse structures that facilitate maintaining dynamic knowledge bases).</Paragraph>
      <Paragraph position="26"> The Learning module consists of functions mixing statistics and inductive learning techniques and is used for corpus analysis and definite-anaphora-based knowledge acquisition.</Paragraph>
      <Paragraph position="27"> Control: Flexible, non-sequential control with all modules accessing the Knowledge Representation module. Speed: Our system is reasonably fast. For MUC-6, it took slightly under half a minute to process a typical WSJ article in the development set.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="270" end_page="272" type="metho">
    <SectionTitle>
WALKTHROUGH ARTICLE AND MORE
</SectionTitle>
    <Paragraph position="0"> Somewhat Expected and Unexpected Problem s Given our extremely ambitious goals for such a short period of time, and particularly knowledge engineerin g large knowledge bases prevented us from doing coreference . We fully expected that we will not be able to successfully complete improving our existing anaphora resolution code and make it work reliably with th e newly created large knowledge bases .</Paragraph>
    <Paragraph position="1"> However, we did not expect a failure of the function printing out the markings . This buggy piece of code is about the easiest to fix, but at the same time, it is the most damaging in terms of the score . We had t o not print the recognized organization names and turn off processing of the &amp;quot;HL&amp;quot; and &amp;quot;DATELINE&amp;quot; parts of the articles.</Paragraph>
    <Paragraph position="2"> Our Official, Unofficial, and Very Unofficial Scor e Our scores are not very meaningful because despite the fact that the problematic markings constituted onl y a small percentage of all the markings, 8% for the first six test articles, the scorer failed to score anything , but four shortest articles ; this can be seen in the official results table below-- the &amp;quot;DD&amp;quot; slot containing a dat e in a canned &amp;quot;Month/Day/Year&amp;quot; format shows only four matches .</Paragraph>
    <Paragraph position="3"> In order to obtain an unofficial score, we had to edit our results, which we did strictly according to th e trace. Even this unofficial score does not reflect well our real performance, only shows that we did something . For those things that we were able to mark, our own very unofficial estimate is that we performed in th e high nineties in both recall and precision (as can be seen in the enclosed sample article) . For example, w e correctly marked both expressions problematic for most other systems : &amp;quot;the 21st century&amp;quot; and &amp;quot;Hollywood&amp;quot; .</Paragraph>
    <Paragraph position="5"> Below, we explain our edits and enclose the answer key for the walkthrough article, with hand-written markings showing the expressions we recognized, as well as our edited markings.</Paragraph>
    <Paragraph position="6"> In the first 6 test articles, there were 16 (8%) problematic markings and 186 (92%) unproblematic markings. In the problematic ones, bits of text that were not there appeared and portions of the existing text disappeared:  retire as chairman at the end of the year. He will be succeeded by Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;Dooner&lt;/ENAMEX&gt;, 66. &lt;p&gt; It promises to be a smooth process, which is unusual given the volatile atmosphere of the advertising business. But Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;Dooner&lt;/ENAMEX&gt; has a big challenge that will be his top priority. &amp;quot;I'm going to focus on strengthening the creative work,&amp;quot; he says. &amp;quot;There is room to grow. We can make further improvements in terms of the perception of our creative work.&amp;quot; &lt;/p&gt;  creative assignment for the prestigious Coca-Cola Classic account. &amp;quot;I would be less than honest to say I'm not disappointed not to be able to claim creative leadership for</Paragraph>
  </Section>
  <Section position="6" start_page="272" end_page="274" type="metho">
    <SectionTitle>
&lt;ENAMEX TYPE=&amp;quot;ORGANIZATION&amp;quot; STATUS=&amp;quot;OPT&amp;quot;&gt;Coke&lt;/ENAMEX&gt;,&amp;quot; Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;Dooner&lt;/ENAMEX&gt; says.
&lt;p&gt; &lt;ENAMEX TYPE=&amp;quot;ORGANIZATION&amp;quot;&gt;McCann&lt;/ENAMEX&gt; still handles promotions and media buying for
&lt;ENAMEX TYPE=&amp;quot;ORGANIZATION&amp;quot;&gt;Coke&lt;/ENAMEX&gt;. But the bragging rights to
</SectionTitle>
    <Paragraph position="0"> &lt;ENAMEX TYPE=&amp;quot;ORGANIZATION&amp;quot; STATUS=&amp;quot;OPT&amp;quot;&gt;Coke&lt;/ENAMEX&gt;'s ubiquitous advertising belongs to &lt;ENAMEX TYPE=&amp;quot;ORGANIZATION&amp;quot;&gt;Creative Artists Agency&lt;/ENAMEX&gt;, the big &lt;ENAMEX TYPE=&amp;quot;LOCATION&amp;quot;&gt;Hollywood&lt;/ENAMEX&gt; talent agency. &amp;quot;We are striving to have a strong renewed creative partnership with &lt;ENAMEX TYPE=&amp;quot;ORGANIZATION&amp;quot;&gt;Coca-Cola&lt;/ENAMEX&gt;,&amp;quot; Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;Dooner&lt;/ENAMEX&gt; says. However, odds of that happening are slim since word from  says now that he has &amp;quot;reinvented&amp;quot; himself, he wants to do the same for the agency. For Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;Dooner&lt;/ENAMEX&gt;, it means maintaining his running and exercise schedule, and for the agency, it means developing more global campaigns that nonetheless reflect local cultures. One &lt;ENAMEX TYPE=&amp;quot;ORGANIZATION&amp;quot;&gt;McCann&lt;/ENAMEX&gt; account, &amp;quot;I Can't Believe It's Not Butter,&amp;quot; a butter substitute, is in 11 countries, for example.</Paragraph>
    <Paragraph position="1"> &lt;/p&gt; &lt;ENAMEX TYPE=&amp;quot;ORGANIZATION&amp;quot;&gt;McCann&lt;/ENAMEX&gt; has initiated a new so-called global collaborative system, composed of world-wide account directors paired with creative partners. In addition, &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;Peter Kim&lt;/ENAMEX&gt; was hired from  up for the headaches of running one of the largest world-wide agencies. (There are no immediate plans to replace Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;Dooner&lt;/ENAMEX&gt; as president; Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;James&lt;/ENAMEX&gt; operated as chairman, chief executive officer and president for a period of time.) Mr.</Paragraph>
    <Paragraph position="2"> &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;James&lt;/ENAMEX&gt; is filled with thoughts of enjoying his three hobbies: sailing, skiing and hunting. &lt;/p&gt; &lt;p&gt; Asked why he would choose to voluntarily exit while he still is so young, Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;James&lt;/ENAMEX&gt; says it is time to be a tad selfish about how he spends his days. Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;James&lt;/ENAMEX&gt;, who has a reputation as an extraordinarily tough taskmaster, says that because he &amp;quot;had a great time&amp;quot; in advertising, he doesn't want to &amp;quot;talk about the disappointments.&amp;quot; In fact, when he is asked his opinion of the new batch of &lt;ENAMEX TYPE=&amp;quot;ORGANIZATION&amp;quot; STATUS=&amp;quot;OPT&amp;quot;&gt;Coke&lt;/ENAMEX&gt; ads from &lt;ENAMEX TYPE=&amp;quot;ORGANIZATION&amp;quot;&gt;CAA&lt;/ENAMEX&gt;, Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;James&lt;/ENAMEX&gt; places his hands over his mouth. He shrugs. He doesn't utter a word. He has, he says, fond memories of working with &lt;ENAMEX TYPE=&amp;quot;ORGANIZATION&amp;quot;&gt;Coke&lt;/ENAMEX&gt; executives.  Soon, Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;James&lt;/ENAMEX&gt; will be able to compete in as many sailing races as a hon.</Paragraph>
    <Paragraph position="3"> and concentrate on his duties as rear commodore at the  It promises to be a smooth process, which is unusual given the volatile atmosphere of the advertising business. But Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;Dooner&lt;/ENAMEX&gt; has a big challenge that will be his top priority. &amp;quot;I'm going to focus on strengthening the creative work,&amp;quot; he says. &amp;quot;There is room to grow. We can make further improvements in terms of the perception of  loss of the key creative assignment for the prestigious Coca-Cola Classic account. &amp;quot;I would be less than honest to say I'm not disappointed not to be able to claim creative leadership for Coke,&amp;quot; Mr.  McCann still handles promotions and media buying for Coke. But the bragging rights to Coke's ubiquitous advertising belongs to Creative Artists Agency, the big &lt;ENAMEX TYPE=&amp;quot;LOCATION&amp;quot;&gt;Hollywood&lt;/ENAMEX&gt; talent agency. &amp;quot;We are striving to have a strong renewed creative partnership with Coca-Cola,&amp;quot; Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;Dooner&lt;/ENAMEX&gt; says. However, odds of that happening are slim since word from Coke headquarters in &lt;ENAMEX TYPE=&amp;quot;LOCATION&amp;quot;&gt;Atlanta&lt;/ENAMEX&gt; is that CAA and other ad agencies, such as Fallon McElligott, will continue to  months&lt;/TIMEX&gt;, says now that he has &amp;quot;reinvented&amp;quot; himself, he wants to do the same for the agency. For Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;Dooner&lt;/ENAMEX&gt;, it means maintaining his running and exercise schedule, and for the agency, it means</Paragraph>
    <Paragraph position="4"> developing more global campaigns that nonetheless reflect local cultures. One McCann account, &amp;quot;I Can't Believe It's Not &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;Butter&lt;/ENAMEX&gt;,&amp;quot; a butter substitute, is in 11 countries, for example.</Paragraph>
    <Paragraph position="5"> &lt;/p&gt; &lt;p&gt; McCann has initiated a new so-called global collaborative system, composed of world-wide account directors paired with creative partners. In addition, &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;Peter Kim&lt;/ENAMEX&gt; was hired from WPP Group's J.  He points to several campaigns with pride, including the Taster's Choice commercials that are like a running soap opera. &amp;quot;It's a &lt;NUMEX TYPE=&amp;quot;MONEY&amp;quot;&gt;$19&lt;/NUMEX&gt; million campaign with the recognition of a &lt;NUMEX TYPE=&amp;quot;MONEY&amp;quot;&gt;$200 million&lt;/NUMEX&gt; campaign,&amp;quot; he says of the commercials that feature a couple that must hold a  record for the length of time dating before kissing.</Paragraph>
    <Paragraph position="6">  officer of Ammirati &amp; Puris, about McCann's acquiring the agency with billings of &lt;NUMEX TYPE=&amp;quot;MONEY&amp;quot;&gt;$400 million&lt;/NUMEX&gt;, but nothing has materialized. &amp;quot;There is no question,&amp;quot; says Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;Dooner&lt;/ENAMEX&gt;, &amp;quot;that we are looking for quality acquisitions and Ammirati &amp; Puris is a quality operation. There are some people and entire agencies that I would love to see be part of the McCann family.&amp;quot; Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;Dooner&lt;/ENAMEX&gt; declines to identify possible acquisitions.</Paragraph>
    <Paragraph position="7"> &lt;/p&gt; &lt;p&gt; Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;Dooner&lt;/ENAMEX&gt; is just gearing up for the headaches of running one of the largest world-wide agencies. (There are no immediate plans to replace Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;Dooner&lt;/ENAMEX&gt; as president; Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;James&lt;/ENAMEX&gt; operated as chairman, chief executive officer and president for a period of time.) Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;James&lt;/ENAMEX&gt; is filled with thoughts of enjoying his three hobbies: sailing, skiing and hunting. &lt;/p&gt; &lt;p&gt; Asked why he would choose to voluntarily exit while he still is so young, Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;James&lt;/ENAMEX&gt; says it is time to be a tad selfish about how he spends his days. Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;James&lt;/ENAMEX&gt;, who has a reputation as an extraordinarily tough taskmaster, says that because he &amp;quot;had a great time&amp;quot; in advertising, he doesn't want to &amp;quot;talk about the disappointments.&amp;quot; In fact, when he is asked his opinion of the new batch of Coke ads from CAA, Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;James&lt;/ENAMEX&gt; places his hands over his mouth. He shrugs. He doesn't utter a word. He has, he says, fond memories of working with Coke executives. &amp;quot;Coke has given us great highs,&amp;quot; says Mr. &lt;ENAMEX TYPE=&amp;quot;PERSON&amp;quot;&gt;James&lt;/ENAMEX&gt;, sitting in his plush office, filled with photographs of sailing as well as huge models of, among other things, a Dutch tugboat.</Paragraph>
  </Section>
  <Section position="7" start_page="274" end_page="275" type="metho">
    <SectionTitle>
&lt;/p&gt;
&lt;p&gt;
</SectionTitle>
    <Paragraph position="0"> He says he feels a &amp;quot;great sense of accomplishment.&amp;quot; In 36 countries, McCann is ranked in the top three; in 75 countries, it is in the top 10.</Paragraph>
    <Paragraph position="1">  Our very good performance on the task of identifying temporal expressions was slightly improved by handling numbers. Previously we were missing &amp;quot;bare&amp;quot; numeric years such as &amp;quot;the 1980 election&amp;quot;, &amp;quot;1980s&amp;quot;, &amp;quot;pre-1970&amp;quot;, and dates such as &amp;quot;4.12&amp;quot; (referring to &amp;quot;April 12th&amp;quot;). Our errors included: &amp;quot;57 years old&amp;quot; (we probably misinterpreted the task definition), &amp;quot;last year&amp;quot; (deliberately kept) and &amp;quot;three-and-a-half months&amp;quot; (only some dashes handled).</Paragraph>
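    <Paragraph position="2"> As an illustration only, the kinds of numeric temporal expressions listed above can be sketched with a few regular expressions. This is not the UNO implementation, which is not shown in the text: the pattern names, the year range, and the reading of a dotted number as month.day are all assumptions made for this sketch.

```python
import re

# Hypothetical patterns; the actual UNO rules are not given in the paper.
YEAR = r"(?:1[0-9]{3}|20[0-9]{2})"                   # assumed range 1000-2099
BARE_YEAR = re.compile(rf"\b(?:pre-)?{YEAR}s?\b")    # 1980, 1980s, pre-1970
DOTTED_DATE = re.compile(r"\b([0-9]{1,2})\.([0-9]{1,2})\b")

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def find_bare_years(text):
    """Return bare numeric years, decades and 'pre-' years found in text."""
    return [m.group(0) for m in BARE_YEAR.finditer(text)]


def read_dotted_date(token):
    """Read a token such as '4.12' as month.day, or return None if implausible."""
    m = DOTTED_DATE.fullmatch(token)
    if m is None:
        return None
    month, day = int(m.group(1)), int(m.group(2))
    if month not in range(1, 13) or day not in range(1, 32):
        return None
    return f"{MONTHS[month - 1]} {day}"
```

    For instance, find_bare_years applied to a sentence containing "the 1980 election" picks out "1980", and read_dotted_date("4.12") reads the token as April 12. A pattern this loose would also accept any four-digit number in the assumed year range, which is one reason a real system would combine it with context.</Paragraph>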
  </Section>
  <Section position="8" start_page="275" end_page="275" type="metho">
    <SectionTitle>
LOCATIONS
</SectionTitle>
    <Paragraph position="0"/>
    <Paragraph position="2"> Good performance on identifying locations stems from the combination of our rather complete critical knowledge bases, which cover the major named types and geographical information for all countries, and the automatic interpretation of locative expressions with a known geographical type, such as &amp;quot;the city of Farmington Hills&amp;quot; and &amp;quot;The Isle of Man&amp;quot;, and of expressions of the form &amp;quot;Smaller Region, Larger Region&amp;quot;, such as &amp;quot;Poland, New York&amp;quot;.</Paragraph>
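    <Paragraph position="3"> The two kinds of locative expressions mentioned above can be illustrated with a minimal sketch. It is not the UNO code: the tiny GEO_TYPES and KNOWN_REGIONS sets stand in for the much larger knowledge bases, and the function names and the capitalized-word definition of a name are assumptions made for this example.

```python
import re

# Hypothetical stand-ins for the knowledge bases described in the text.
GEO_TYPES = {"city", "isle", "state", "county"}
KNOWN_REGIONS = {"New York", "Michigan", "Poland"}

# A name is approximated as one or more capitalized words.
NAME = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"
TYPED = re.compile(rf"\b[Tt]he ([A-Za-z]+) of ({NAME})")
PAIR = re.compile(rf"({NAME}), ({NAME})")


def typed_locations(text):
    """Find 'the TYPE of NAME' where TYPE is a known geographical type."""
    found = []
    for m in TYPED.finditer(text):
        geo_type = m.group(1).lower()
        if geo_type in GEO_TYPES:
            found.append((geo_type, m.group(2)))
    return found


def region_pairs(text):
    """Find 'Smaller Region, Larger Region' where the larger region is known."""
    return [(m.group(1), m.group(2)) for m in PAIR.finditer(text)
            if m.group(2) in KNOWN_REGIONS]
```

    In this sketch only the larger region has to be known, so an unlisted smaller region can still be recognized as a location when it is followed by a listed larger one, while a pair such as two person names separated by a comma is rejected.</Paragraph>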
  </Section>
</Paper>