<?xml version="1.0" standalone="yes"?> <Paper uid="W04-2505"> <Title>Intentions, Implicatures and Processing of Complex Questions</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Question Complexity </SectionTitle> <Paragraph position="0"> Since 1999, the TREC QA evaluations focused on factoid questions, such as &quot;In what year did Joe Di Maggio compile his 56-game hitting streak?&quot; or &quot;Name a film in which Jude Law acted.&quot;. The answers to most of these questions belong to semantic categories associated with each question class. For example, questions asking about a date or a year can be answered because Named Entity Recognizers identify a temporal expression in a candidate text span. Similarly, names of people or organizations are provided as answers to questions such as &quot;Who is the first Russian astronaut?&quot; or &quot;What is the largest software company in the world?&quot;. Most Named Entity Recognizers detect names of PEOPLE, ORGANIZATIONS, LOCA-TIONS, DATES, PRICES and NUMBERS. For factoid Q/A, the list of name categories needs to be extended, as reported in (Harabagiu et al., 2003) for recognizing many more types of names, e.g. names of movies, names of diseases, names of battles. Moreover, the semantic categories of the extended set of names need to be incorporated into an answer type taxonomy that enables the recognition of (a) the expected answer type and (b) the question class. The taxonomy of expected answer types is useful because the answer is not always a name; it can be a lexicalized concept or a concept that is expressed by a paraphrase.</Paragraph> <Paragraph position="1"> The TREC evaluations have also considered two more classes of questions: (1) list questions and (2) definition questions. The list questions have answers that are typically assembled from different documents. Such questions are harder to answer than factoid questions because the systems must detect duplications. Example of list questions are &quot;Name singers performing the role of Donna Elvira in performances of Mozart's &quot;Don Giovani&quot;.&quot; or &quot;What companies manufacture golf clubs?&quot;. Definition questions require a different form of processing that factoid questions because no taxonomy of answer types needs to be used. The expected answer type is a definition, which cannot be represented by a single concept. Q/A systems assume that definitions are given by following a set of linguistic patterns that need to be matched for extracting the answer. Example of definition questions are &quot;What is a golden parachute?&quot; or &quot;What is ETA in Spain?&quot;.</Paragraph> <Paragraph position="2"> In (Echihabi and Marcu, 2003) a noisy channel model for Q/A was introduced. This model is based on the idea that if a given sentence SA contains an answer sub-string A to a question Q, then SA can be re-written into Q through a sequence of stochastic operators. Not only a justification of the answer is produced, but the conditional probability P(Q--SA) re-ranks all candidate answers.</Paragraph> <Paragraph position="3"> A different viewpoint of Q/A was reported in (Ittycheriah et al., 2000). Finding the answers A to a question Q was considered a classification problem that maximizes the conditional probability P(A--Q). This model is not tractable currently, because (a) the search space is too large for a text collection like the TREC or the AQUAINT corpora; and (b) the training data is insufficient. 
Therefore, Q/A is modeled by the distribution P(C|A,Q), where C measures the &quot;correctness&quot; of A as an answer to question Q. By using a hidden variable E that represents the expected answer type, P(C|A,Q) = Σ_E P(C,E|Q,A) = Σ_E P(C|E,Q,A) * P(E|Q,A). Both distributions are modeled using maximum entropy.</Paragraph> <Paragraph position="4"> All three forms of questions are also useful when processing complex questions, determined by a scenario resulting from a problem-solving situation. As illustrated in Figure 1, a scenario question may be associated with a pattern. One of the pattern variables represents the focus of the question. The notion of the question focus was first introduced by (Lehnert, 1978). The focus represents the most important concept of the question, a concept determining the domain of the question. In the case of question Q1, the focus is missile program. The identification of the focus is based on the predicate-argument structure of the question pattern and on the order of the arguments. Figure 3 shows both the question pattern associated with Q1 and its predicate-argument structure. The argument with the role of purpose is ranked highest, and thus it determines the question focus.</Paragraph> <Paragraph position="5"> With the exception of the focus, all arguments from the predicate-argument structure may be used for generating definition questions. The focus is elaborated upon. Several forms of elaboration are possible. One is temporal, as illustrated in Figure 1. Others are resultative, causative or manner-based. For example, the knowledge that assistance in a missile program results in an inventory of missiles allows for a resultative elaboration. Further knowledge needs to be coerced for generating the implied questions as possible follow-ups to intended questions.</Paragraph> <Paragraph position="6"> The relationship between intended questions and implied questions is marked by the presence of multiple references, e.g. the pronouns it and this or any and ones. The generation of implied questions is made possible by knowledge that is coerced from the intended questions. For example, when asking Qi1: &quot;What is the USSR/Russia?&quot;, the coercion process abstracts away from the concept that needs to be defined, i.e. a country. The implied question requests confirmation of the metonymy resolution involving USSR/Russia. This named entity may represent a country, but most likely it refers to its government or, as Qm12 suggests, organizations or individuals acting on behalf of the country. Both Qm11 and Qm12, implied questions derived from the intended question Qi1, refer to the metonymy by using the pronouns this and it respectively. Different forms of coercion are used for Qi3 because in this case the knowledge is associated with the predicate. The implied questions associated with the focus, i.e. the intended question Qi4, coerce the design and development predicates, which are associated with the missiles, as well as the timelines of possible additional assistance.</Paragraph> </Section> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 Models of Question Answering </SectionTitle> <Paragraph position="0"> The processing of questions is typically performed as a sequence of three processes: (1) Question Processing; (2) Document Processing; and (3) Answer Extraction.
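Before describing each stage, the following is a minimal, runnable sketch of how such a three-stage pipeline could be wired together. The stop-word list, the answer-type heuristics and the keyword-overlap retrieval are illustrative placeholders, not the components of the system described in this paper.

```python
# Minimal sketch of the three-stage pipeline: question processing,
# passage retrieval, answer extraction. All heuristics are placeholders.
import re
from typing import List, Optional, Tuple

STOPWORDS = {"what", "who", "when", "where", "which", "in", "the", "a", "an",
             "of", "did", "do", "does", "is", "was", "to"}

def process_question(text: str) -> Tuple[str, List[str]]:
    """Stage 1: guess the expected answer type and select retrieval keywords."""
    t = text.lower()
    if t.startswith("who"):
        answer_type = "PERSON"
    elif t.startswith("when") or "what year" in t:
        answer_type = "DATE"
    elif t.startswith("where"):
        answer_type = "LOCATION"
    else:
        answer_type = "OTHER"
    keywords = [w for w in re.findall(r"\w+", t) if w not in STOPWORDS]
    return answer_type, keywords

def retrieve_passages(keywords: List[str], documents: List[str], top_k: int = 3) -> List[str]:
    """Stage 2: rank passages by keyword overlap (a stand-in for a real search engine)."""
    scored = [(sum(k in d.lower() for k in keywords), d) for d in documents]
    return [d for s, d in sorted(scored, reverse=True)[:top_k] if s > 0]

def extract_answer(answer_type: str, passages: List[str]) -> Optional[str]:
    """Stage 3: pick a candidate of the expected type from the retrieved passages.
    A DATE is approximated here by a four-digit year."""
    for p in passages:
        if answer_type == "DATE":
            m = re.search(r"\b(1[89]\d\d|20\d\d)\b", p)
            if m:
                return m.group(0)
    return passages[0] if passages else None

if __name__ == "__main__":
    docs = ["Joe DiMaggio compiled his 56-game hitting streak in 1941.",
            "Jude Law acted in the film Gattaca."]
    atype, kws = process_question("In what year did Joe DiMaggio compile his 56-game hitting streak?")
    print(extract_answer(atype, retrieve_passages(kws, docs)))   # -> 1941
```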
In the case of factoid questions, question processing involves the classification of questions with the purpose of predicting what semantic class the answer should belong to. Thus we may have questions asking about PEOPLE, ORGANIZATIONS, TIME or LOCATIONS. Since open-domain Q/A systems process questions regardless of the domain of interest, question processing must be based on an extended ontology of answer types. The identification of the expected answer type is based either on binary semantic dependencies extracted from the syntactic parse of the question (Harabagiu et al., 2001) or on the predicate-argument structure of the question. In both cases, the relation to the question stem (i.e. what, who, when) enables the classification. Figure 2 illustrates a factoid question generated as an intended question and the derivation of its expected answer type.</Paragraph> <Paragraph position="1"> Often, however, the expected answer type needs to be identified from an ontology that has high lexico-semantic coverage. Many Q/A systems use the WordNet database for this purpose. In contrast, definition questions do not require the identification of the expected answer type, since they always request a definition. Instead, definition questions are matched against a set of patterns, which enables the extraction of the definition from the candidate answers. Figure 4 illustrates a definition question, the pattern it matched, as well as the extracted answer.</Paragraph> <Paragraph position="2"> Both factoid and definition questions can be answered only if candidate passages are available. The retrieval of these passages is made possible by keywords that are selected from the question words. The Document Processing module implements a search engine that returns passages that are likely to contain the expected answer type, in the case of factoid questions, or the definition pattern, in the case of definition questions. The Answer Extraction module optimizes the extraction of the correct answer by unifying the question information with the answer information. The unification may be based on pattern matching, on machine learning algorithms that use question and answer features, or on abductive reasoning that justifies the correctness of the answer.</Paragraph> <Paragraph position="3"> Current state-of-the-art QA systems search for the candidate answer by assuming that the answers are single concepts that can be recognized from a hierarchy or by a Named Entity Recognizer. This is a serious limitation, but it works well for the factoid, list or definition questions evaluated in TREC.</Paragraph> <Paragraph position="4"> The three modules of current Q/A systems reflect the three functions that need to be considered by any Q/A model: (1) understanding what the question asks; (2) identifying candidate text passages that might contain the answer; and (3) extracting the correct answer. Currently, the expected answer type represents what the question asks about: a semantic concept, e.g. the name of a person, location or organization, kinds of diseases, types of animals or plants. Generally, these semantic concepts are lexicalized as a single word or as two-word collocations. Clearly, this represents a limitation, since often the questions ask for more than a single concept. As we have seen in Table 1, there is additional intended and implied information that is requested.
Therefore, new models of Question Answering need to incorporate these additional forms of knowledge.</Paragraph> <Paragraph position="5"> When definition questions are processed in current Q/A systems, they are matched against a pattern, which is different from the question patterns associated with complex questions similar to those illustrated in Figure 1. In the case of a definition question like &quot;What is ETA in Spain?&quot;, the pattern identifies the question-point (QP) as ETA, the concept that needs to be defined, and Spain as its context. The definition question pattern also contains several surface-form patterns that are matched in the candidate paragraphs. One such pattern is recognized in an apposition, [QP, a AP], where AP represents the answer phrase. In the following passage: &quot;ETA, a Basque language acronym for Basque Homeland and Freedom - has killed nearly 800 people since taking up arms in 1968.&quot;, the exact answer representing the definition is identified in AP: Basque language acronym for Basque Homeland and Freedom. The fact that the Basque country is a region in Spain allows a justification of the question context.</Paragraph>
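As a concrete illustration of this kind of surface-pattern matching, the sketch below extracts the answer phrase AP for the apposition pattern [QP, a AP]. The regular expression and the clause-boundary characters are simplifying assumptions, not the actual patterns used by the system.

```python
# Sketch of apposition-based definition extraction: [QP, a AP].
import re

def match_apposition_definition(question_phrase: str, passage: str):
    """Return the answer phrase AP if the passage contains 'QP, a AP'."""
    # AP is assumed to run until a dash, period or semicolon.
    pattern = re.compile(
        re.escape(question_phrase) + r",\s+(?:a|an)\s+([^.;-]+)", re.IGNORECASE)
    m = pattern.search(passage)
    return m.group(1).strip() if m else None

passage = ("ETA, a Basque language acronym for Basque Homeland and Freedom - "
           "has killed nearly 800 people since taking up arms in 1968.")
print(match_apposition_definition("ETA", passage))
# -> 'Basque language acronym for Basque Homeland and Freedom'
```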
<Paragraph position="6"> In this paper, by considering the intentional information and the implied information that can be derived when processing questions, we introduce a novel model of Q/A, which has access to rich semantic structures and enables the retrieval of more accurate answers, as well as inference processes that explain the validity and contextual coverage of answers.</Paragraph> <Paragraph position="7"> Figure 5 shows the structure of the novel model of Q/A we propose. Both Question Processing and Document Processing have the recognition of predicate-argument structures at the core of their models. As reported in (Surdeanu et al., 2003), the recognition of predicate-argument structures depends on features made available by full syntactic parses and by Named Entity Recognizers. As we shall show in this paper, the predicate-argument structures enable the recognition of the question pattern, the question focus and the intentional structure associated with a question. When the intentions are known, the answer structure can be identified and the keywords extracted.</Paragraph> <Paragraph position="8"> For better retrieval of candidate answers, documents are indexed and retrieved based on the predicate-argument structures as well as on the complex semantic structures associated with different question patterns. Similarly, the intentional structures are used for indexing/retrieving candidate passages. The Answer Processing function involves the recognition of the answer structure and of the intentional structure. Often this requires reference resolution. The implied information coerced from both the question and the candidate answer is also validated before deciding on the answer correctness.</Paragraph> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 Predicate-Argument Structures </SectionTitle> <Paragraph position="0"> To identify predicate-argument structures in questions and passages, we have (1) used the Proposition Bank (PropBank) as training data; and (2) employed a model for predicting argument roles similar to the one used by (Gildea and Jurafsky, 2002).</Paragraph> <Paragraph position="1"> PropBank is a one million word corpus annotated with predicate-argument structures on top of the Penn Treebank 2 Wall Street Journal texts. For any given predicate, the expected arguments are labeled sequentially from Arg 0 to Arg 4. Generally, Arg 0 stands for agent, Arg 1 for direct object or theme or patient, Arg 2 for indirect object or benefactive or instrument or attribute or end state, Arg 3 for start point or benefactive or attribute, and Arg 4 for end point. In addition to these core arguments, adjunctive arguments are marked up. They include functional tags from Treebank, e.g. ArgM-DIR indicates a directional, ArgM-LOC indicates a locative, and ArgM-TMP stands for a temporal.</Paragraph> <Paragraph position="2"> An example of PropBank markup is: [Arg0 Analysts] have been expecting a [Arg1 GM-Jaguar pact] that would [predicate give] [Arg2 the U.S. car maker] [Arg1 an eventual 30% stake in the British Company].</Paragraph> <Paragraph position="3"> The model of identifying the arguments of each predicate consists of two tasks: (1) the recognition of the boundaries of each argument in the syntactic parse tree; and (2) the identification of the argument role. Each task can be cast as a separate classification problem. The next section describes our approach based on Support Vector Machines (SVMs) (Vapnik, 1995).</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.1 Automatic Predicate-Argument Extraction </SectionTitle> <Paragraph position="0"> Given a sentence in natural language, all the predicates associated with its verbs have to be identified along with their arguments. This problem can be divided into two subtasks: (a) the detection of the target argument boundaries, i.e. all its compounding words, and (b) the classification of the argument type, e.g. Arg0 or ArgM.</Paragraph> <Paragraph position="1"> A direct approach to learning both the detection and the classification of predicate arguments is summarized by the following steps: 1. Given a sentence from the training set, generate a full syntactic parse tree; 2. let P and A be the set of predicates and the set of parse-tree nodes (i.e. the potential arguments), respectively; 3. for each pair <p, a> ∈ P × A: extract the feature representation set F_{p,a}; if the subtree rooted in a covers exactly the words of one argument of p, put F_{p,a} in T+ (positive examples), otherwise put it in T- (negative examples).</Paragraph> <Paragraph position="2"> The above T+ and T- sets can be re-organized as positive T+_argi and negative T-_argi examples for each argument i. In this way, an individual ONE-vs-ALL SVM classifier can be trained for each argument i. We adopted this solution as it is simple and effective (Pradhan et al., 2003). In the classification phase, given a sentence of the test set, all its F_{p,a} are generated and classified by each individual SVM classifier. As a final decision, we select the argument associated with the maximum value among the scores provided by the SVMs, i.e. argmax_{i ∈ S} C_i, where S is the target set of arguments.</Paragraph> <Paragraph position="3"> The discovery of relevant features is a complex task. Nevertheless, there is a common consensus on the basic features that should be adopted. These standard features, first proposed in (Gildea and Jurafsky, 2002), are derived from parse trees, as illustrated in Table 2.</Paragraph> </Section>
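The ONE-vs-ALL protocol described above can be sketched as follows. scikit-learn is used here as a stand-in for the SVM-light setup reported in Section 4.2, and the toy feature dictionaries are purely illustrative.

```python
# One binary SVM per argument label; the final label is the argmax over scores.
from sklearn.svm import SVC
from sklearn.feature_extraction import DictVectorizer

def train_one_vs_all(examples, labels, degree=3):
    """examples: list of feature dicts F_{p,a}; labels: role name or 'NONE'."""
    vec = DictVectorizer()
    X = vec.fit_transform(examples)
    classifiers = {}
    for role in set(labels) - {"NONE"}:
        y = [1 if lab == role else -1 for lab in labels]   # T+ vs. T- for this role
        clf = SVC(kernel="poly", degree=degree)
        clf.fit(X, y)
        classifiers[role] = clf
    return vec, classifiers

def classify(vec, classifiers, feature_dict):
    """Pick the role whose classifier gives the highest score (argmax_i C_i)."""
    x = vec.transform([feature_dict])
    scores = {role: clf.decision_function(x)[0] for role, clf in classifiers.items()}
    return max(scores, key=scores.get)

# Toy usage with hand-made feature dictionaries (not real PropBank data):
examples = [
    {"pt": "NP", "pos": "before", "voice": "active", "pred": "give"},   # Arg0
    {"pt": "NP", "pos": "after",  "voice": "active", "pred": "give"},   # Arg1
    {"pt": "PP", "pos": "after",  "voice": "active", "pred": "give"},   # NONE
    {"pt": "NP", "pos": "before", "voice": "active", "pred": "sell"},   # Arg0
    {"pt": "NP", "pos": "after",  "voice": "active", "pred": "sell"},   # Arg1
]
labels = ["Arg0", "Arg1", "NONE", "Arg0", "Arg1"]
vec, clfs = train_one_vs_all(examples, labels, degree=2)
print(classify(vec, clfs, {"pt": "NP", "pos": "after", "voice": "active", "pred": "give"}))
# typically predicts 'Arg1' on this toy data
```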
<Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.2 Parsing Sentences into Predicate-Argument Structures </SectionTitle> <Paragraph position="0"> For the experiments, we used PropBank (www.cis.upenn.edu/~ace) along with Penn TreeBank 2 (www.cis.upenn.edu/~treebank) (Echihabi and Marcu, 2003). This corpus contains about 53,700 sentences and a fixed split between training and testing which has been used in other studies (Gildea and Jurafsky, 2002; Surdeanu et al., 2003; Hacioglu et al., 2003; Chen and Rambow, 2003; Gildea and Hockenmaier, 2003; Gildea and Palmer, 2002; Pradhan et al., 2003). In this split, sections 02 to 21 are used for training, section 23 for testing, and sections 01 and 22 as the development set. We considered all PropBank arguments from Arg0 to Arg9, ArgA and ArgM, even if only Arg0 to Arg4 and ArgM contain enough training/testing data to affect the global performance. (The special tags of noun phrases, e.g. Subj and TMP, are not used, as parsers are usually not able to provide this information.)</Paragraph> <Paragraph position="2"> The classifier evaluations were carried out using the SVM-light software (Joachims, 1999), available at http://svmlight.joachims.org/, with the default polynomial kernel and degree d ∈ {1, 2, 3, 4, 5}. The performance was evaluated using the F1 measure for both the single-argument classifiers and the multi-class classifier.</Paragraph> <Paragraph position="3"> - PHRASE TYPE (pt): This feature indicates the syntactic type of the phrase labeled as a predicate argument.</Paragraph> <Paragraph position="4"> - PARSE TREE PATH (path): This feature contains the path in the parse tree between the predicate phrase and the argument phrase, expressed as a sequence of nonterminal labels linked by direction (up or down).</Paragraph> <Paragraph position="5"> - POSITION (pos): Indicates whether the constituent appears before or after the predicate in the sentence.</Paragraph> <Paragraph position="6"> - VOICE (voice): This feature distinguishes between active and passive voice for the predicate phrase.</Paragraph> <Paragraph position="7"> - HEAD WORD (hw): This feature contains the head word of the evaluated phrase. Case and morphological information are preserved.</Paragraph> <Paragraph position="8"> - GOVERNING CATEGORY (gov): This feature applies to noun phrases only, and indicates whether the NP is dominated by a sentence phrase (typical for subject arguments with active voice predicates) or by a verb phrase (typical for object arguments).</Paragraph> <Paragraph position="9"> - PREDICATE WORD: In our implementation, this feature consists of two components: (1) VERB: the word itself, with case and morphological information preserved; and (2) LEMMA, which represents the verb normalized to lower case and infinitive form.</Paragraph>
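To illustrate how some of these features can be computed, the sketch below derives the phrase-type and parse-tree-path features from a toy constituent tree. The Node encoding, the path notation and the example sentence are assumptions made purely for illustration.

```python
# Sketch of phrase-type (pt) and parse-tree-path (path) feature extraction.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass(eq=False)
class Node:
    label: str
    word: Optional[str] = None
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None

def attach(parent: Node, child: Node) -> Node:
    child.parent = parent
    parent.children.append(child)
    return child

def path_to_root(node: Node) -> List[Node]:
    path = [node]
    while path[-1].parent is not None:
        path.append(path[-1].parent)
    return path

def parse_tree_path(pred: Node, arg: Node) -> str:
    """Labels from the predicate up to the lowest common ancestor ('^' steps),
    then down to the argument ('v' steps)."""
    up, down = path_to_root(pred), path_to_root(arg)
    common = next(n for n in up if n in down)          # lowest common ancestor
    upward = up[: up.index(common) + 1]
    downward = reversed(down[: down.index(common)])
    return "^".join(n.label for n in upward) + "".join("v" + n.label for n in downward)

# Toy parse of "Analysts expected a pact": (S (NP Analysts) (VP (VBD expected) (NP a pact)))
s = Node("S")
subj = attach(s, Node("NP", word="Analysts"))
vp = attach(s, Node("VP"))
pred = attach(vp, Node("VBD", word="expected"))
obj = attach(vp, Node("NP", word="a pact"))

print("pt   =", obj.label)                      # phrase type of the candidate argument: NP
print("path =", parse_tree_path(pred, obj))     # VBD^VPvNP
print("path =", parse_tree_path(pred, subj))    # VBD^VP^SvNP
```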
<Paragraph position="10"> Figure 6 illustrates the F1 measures for the overall argument extraction task (i.e. identification and classification) according to different polynomial degrees. Figure 6(a) illustrates the F1 performance of the single classifiers for the arguments Arg0, Arg1 and ArgM. Figure 6(b) illustrates the performance for all the arguments (i.e. the multi-classifier). In general, we were able to recognize predicate-argument structures with an F1 score of 80%.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.3 Using Predicate-Argument Structures in Question Answering </SectionTitle> <Paragraph position="0"> Predicate-argument structures are useful for identifying candidate answers. Because they capture long-distance dependencies between a predicate and one of its arguments, they enable (1) the identification of the exact boundaries of an answer; and (2) the unification of the predicate-argument relation sought by the question with the relations recognized in candidate passages.</Paragraph> <Paragraph position="1"> Moreover, they are very useful in situations where the expected answer type of the question cannot be recognized. There are two cases in which the expected answer type cannot be identified: Case 1: the answer class is a name that cannot be correctly classified by an available Named Entity Recognizer, because its class is not encoded.</Paragraph> <Paragraph position="2"> Case 2: the answer class cannot be found in the Answer Type hierarchy. The example from Figure 7 shows an instance of Case 1. In this figure, the TREC question Q2054 has a predicate that can be unified with predicates from the answer passage. The Arg1 of the predicate is the expected answer, which is identified as &quot;the Declaration of Independence&quot;. The Arg0 in the question is Button Gwinnett, whereas in the answer it is underspecified and is expressed by the relative pronoun who, which has Button Gwinnett as one of its antecedents.</Paragraph> <Paragraph position="3"> Figure 8 illustrates the second case. The question asks about the first argument of the predicate &quot;measure&quot;, when its Arg2 = &quot;a theodolite&quot;. In the answer, Predicate 2, in its infinitive form, has the same &quot;theodolite&quot; as Arg 2. However, the predicates are lexicalized by different verbs. In WordNet, the first sense of the verb &quot;measure&quot; has the verb &quot;determine&quot; as a hypernym; therefore Arg1 = &quot;wind speeds&quot; is the correct answer.</Paragraph> </Section> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 5 Intentional Structures </SectionTitle> <Paragraph position="0"> The correct interpretation of many questions requires the inference of implicit information that is not directly stated in the question, but merely implied. The mechanisms of recognizing the intentions of the questioner are helpful means of identifying the implied information. For example, in the question QI: &quot;Will Prime Minister Mori survive the crisis?&quot;, the user does not literally mean &quot;Will Prime Minister Mori still be alive when the political crisis is over?&quot;, but rather implies her/his belief that the current political crisis might cost the Japanese Prime Minister his job. It is very unlikely that any expert knowledge base covering Japanese politics will encode knowledge covering all situations of political crisis and the possible outcomes for the prime minister. However, this pragmatic knowledge is essential for the correct interpretation of the question.</Paragraph> <Paragraph position="1"> The design of advanced Question&Answering systems capable of grasping the intention of a professional analyst when (s)he poses a question depends both on knowledge of the domain referred to by the question and on a variety of rules and conventions that allow the communication of intentions and beliefs in addition to the literal meaning of the question. Access to domain knowledge is granted by a combination of retrieval mechanisms that bring forward relevant document passages from unstructured collections of documents, specialized knowledge bases and/or database access mechanisms.
The research proposed in this project focuses on the derivation and usage of pragmatic knowledge that supports the recognition of question implications, also known as implicatures (cf. Grice, 1975b).</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 5.1 Intentional Structures Derived from Lexico-Semantic Knowledge </SectionTitle> <Paragraph position="0"> The novel idea of this research is to link computational implicatures, similar to those defined by Grice (Grice, 1975b), to inferences that can be drawn from general lexico-semantic knowledge bases such as WordNet or FrameNet. Early work was described in (Harabagiu and Yukawa, 1996), where a method of using lexico-semantic paths for recognizing textual implicatures was presented. To our knowledge, this is the only computational model of implicatures that was developed and tested on a large lexico-semantic knowledge base (e.g. WordNet), enabling the successful recognition of implicatures.</Paragraph> <Paragraph position="1"> The model proposed in (Harabagiu and Yukawa, 1996) uncovered a relationship between (a) the coherence of a text segment; (b) its cohesion, expressed by lexical paths; and (c) the implicatures that can be drawn, mostly to account for pragmatic knowledge. This relationship can be extended across documents and across topics, to learn patterns of textual and Q&A implicatures and the methods of deriving knowledge that enables their recognition. The derivation of pragmatic knowledge combines information from three different sources: (1) lexical knowledge bases (e.g. WordNet); (2) expert knowledge bases that can be rapidly formatted for many domains (e.g. Japanese political knowledge); and (3) knowledge supported by the textual information available in documents. The methodology of combining these three sources of information is novel. For question QI, the starting point is the concept identified as a cue for the expected answer type through methods described in (Harabagiu et al., 2000). This concept is lexicalized by the verb-object pair survive-crisis. The verb survive has four distinct senses in the WordNet 1.6 database, whereas the noun crisis has two senses. The polysemy of the expected answer type increases the difficulty of deriving pragmatic knowledge, but it does not presuppose word sense disambiguation of the expression. The glosses defining the WordNet synsets provide helpful information for expanding the multi-word term defining the expected answer type. By measuring the similarity between the two senses of the noun crisis and the words encountered as objects or prepositional attachments in the glosses of the various senses of the verb survive, we distinguish the noun adversity and the example cancer as expressing the closest semantic orientation to the first sense of the noun crisis. The similarity is measured by counting the number of common hypernyms and gloss concepts of hypernyms of two synsets. Figure 9 illustrates the concepts related to the question QI, as derived from the WordNet lexico-semantic knowledge base.</Paragraph>
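A minimal sketch of this hypernym-overlap similarity is given below, using NLTK's WordNet interface (the paper used WordNet 1.6, so sense numbering and glosses may differ); the gloss-concept component of the measure is omitted for brevity.

```python
# Hypernym-overlap similarity between two WordNet synsets.
# Requires: pip install nltk; nltk.download('wordnet').
from nltk.corpus import wordnet as wn

def hypernym_closure(synset):
    """All hypernyms transitively reachable from a synset."""
    return set(synset.closure(lambda s: s.hypernyms()))

def shared_hypernym_count(s1, s2):
    return len(hypernym_closure(s1) & hypernym_closure(s2))

# First sense of the noun 'crisis' vs. 'adversity' (a gloss word of a sense of 'survive')
crisis = wn.synsets("crisis", pos=wn.NOUN)[0]
adversity = wn.synsets("adversity", pos=wn.NOUN)[0]
print(crisis.definition())
print(shared_hypernym_count(crisis, adversity))
```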
<Paragraph position="2"> The fact that surviving a political crisis has a dangerous component, indicated by the noun adversity, may also be supported by inferences drawn from an expert knowledge base, showing that a political crisis may be dangerous for political figures in power. At this point, however, the object of the dangerous situation is not specified, but several concepts indicating dangerous political situations can be inferred from the expert knowledge base and used in the query for textual evidence. Only when text passages involving Prime Minister Mori are retrieved are clarifications of the situation brought to attention: a vote of no confidence against the prime minister is being considered. This new information enables further inference from the expert knowledge base. The expert knowledge base modeling Japanese factional politics confirms that this is a dangerous situation for the Prime Minister and that, in fact, his position is in jeopardy. Due to this inference, the concept POSITION replaces the noun existence from the gloss of the second sense of the verb survive, and the pragmatic knowledge required for the interpretation of the implicature is assembled. The interaction between the three information sources derives the pragmatic knowledge on which the implication of the question relies. The user had an inherent belief that Prime Minister Mori might be replaced, and (s)he queries the Q&A system not only to find information but also to find support for his/her belief. The intentional structure is represented as a set of concepts and the relations that span them, as illustrated in Figure 9.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 5.2 Coercion of Intentions </SectionTitle> <Paragraph position="0"> A second method of deriving the intentional structure of a question is based on the predicate-argument structure that is derived from the question and the candidate answers.</Paragraph> <Paragraph position="1"> Figure 10 illustrates the intentional structure of one such question. The structure of the intentions is determined by the predicate-argument structure of the question and by its pattern. Generally, when asking whether X possesses Y, we want to (1) find evidence of this fact; (2) explore different means of finding the information; (3) identify the source of the information; and (4) identify the enablers or inhibitors of finding the information, as well as the consequences of knowing it. We assign a different index to each object from the predicate-argument structure, and do the same for each element of the intentional structure. For instance, in Figure 2, source(0) is interpreted as source(index=0) = source(evidence). Another feature of the intentional structure is determined by the coercions that are associated with both forms of indexed objects. For example, the coercion of evidence shows the most typical ways of finding evidence in the context of the topic of the question. Figure 2 lists such possibilities as (a) discovering, (b) stockpiling, (c) using and even (d) possessing. These possibilities are inserted in the context of the topic, since they make use of the indices for associating meaning with their representations. In fact, option (a) discover(1,2,3) reads as discover(index=1, index=2, index=3) = discover(possesses(Iraq, biological weapons)).</Paragraph> <Paragraph position="2"> Option (b) stockpile(2,3) can be similarly interpreted as stockpile(Iraq, biological weapons). Note that one of the indexed objects is the topic. The structure of the topic is defined along three semantic dimensions: (1) hyponyms or examples of other types of the same category as the topic; (2) the meronyms or components; and (3) the functionality or the usage.
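The indexing scheme can be sketched as a small data structure in which intentional and coercion elements refer to the objects of the predicate-argument structure by index; all names below are hypothetical and the nesting rule is a simplifying assumption.

```python
# Sketch of an indexed intentional structure with coercions.
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

@dataclass
class IntentionalStructure:
    objects: Dict[int, str]                              # index -> object (arguments, predicates, intentions)
    predicates: Set[int] = field(default_factory=set)    # indices that denote predicates
    coercions: List[Tuple[str, Tuple[int, ...]]] = field(default_factory=list)

    def expand(self, name: str, indices: Tuple[int, ...]) -> str:
        items = [self.objects[i] for i in indices]
        if indices and indices[0] in self.predicates and len(items) > 1:
            # e.g. discover(1,2,3) -> discover(possesses(Iraq, biological weapons))
            return f"{name}({items[0]}({', '.join(items[1:])}))"
        return f"{name}({', '.join(items)})"

qs = IntentionalStructure(
    objects={0: "evidence", 1: "possesses", 2: "Iraq", 3: "biological weapons"},
    predicates={1},
    coercions=[("discover", (1, 2, 3)), ("stockpile", (2, 3)), ("use", (2, 3))],
)
for name, idx in qs.coercions:
    print(qs.expand(name, idx))
# discover(possesses(Iraq, biological weapons))
# stockpile(Iraq, biological weapons)
# use(Iraq, biological weapons)
```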
The derivation of such a large set of intentional structures helped us learn how to coerce pragmatic knowledge. We have developed a probabilistic approach extending the metonymy work of (Lapata and Lascarides, 2003).</Paragraph> <Paragraph position="3"> Lapata and Lascarides report a model of the interpretation of verbal metonymy as the joint distribution P(i, o, v) over three variables: the metonymic verb v, its object o, and the sought-after interpretation i. For example, a verb → object relation that needs to be metonymically interpreted is enjoy → movie. In this case v = enjoy, o = movie and i ∈ {making, watching, directing}. The variables of the distribution are ordered as <i, v, o> to help factor P(i, v, o) = P(i) * P(v|i) * P(o|i, v). Each of the probabilities P(i), P(v|i) and P(o|i, v) can be estimated using maximum likelihood. As illustrated in Figure 10, we have extended this model to account for (1) the coercion of topic information; (2) the coercion of evidence of a fact; (3) the interpretation of predicates; and (4) the interpretation of arguments. Since the verb → object relation translates into one of the predicate-argument relations, we coerced the predicate interpretations in the same way as (Lapata and Lascarides, 2003), but allowed for any predicate-argument relation. Argument coercions were produced by searching for the most likely predicates that use the same arguments. The topic model also incorporated topic signatures, similar to those reported in (Hovy and Ravichandran, 2002).</Paragraph>
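A worked sketch of this factorization with maximum-likelihood estimates is shown below; the count table is fabricated for illustration and is not data from (Lapata and Lascarides, 2003).

```python
# P(i, v, o) = P(i) * P(v|i) * P(o|i, v), estimated by maximum likelihood
# from toy (interpretation, verb, object) observations.
from collections import Counter

triples = (
    [("watching", "enjoy", "movie")] * 6
    + [("making", "enjoy", "movie")] * 2
    + [("directing", "enjoy", "movie")] * 1
    + [("watching", "finish", "movie")] * 3
)

c_i = Counter(i for i, v, o in triples)
c_iv = Counter((i, v) for i, v, o in triples)
c_ivo = Counter(triples)
n = len(triples)

def p_interpretation(i, v, o):
    """P(i) * P(v|i) * P(o|i,v), each estimated by maximum likelihood."""
    return (c_i[i] / n) * (c_iv[(i, v)] / c_i[i]) * (c_ivo[(i, v, o)] / c_iv[(i, v)])

for i in ("watching", "making", "directing"):
    print(i, round(p_interpretation(i, "enjoy", "movie"), 3))
# watching 0.5, making 0.167, directing 0.083 -> 'watching' is the preferred reading
```
</Section> </Section> </Paper>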