<?xml version="1.0" standalone="yes"?> <Paper uid="H92-1047"> <Title>The Acquisition of Lexical Semantic Knowledge from Large Corpora</Title> <Section position="3" start_page="0" end_page="244" type="metho"> <SectionTitle> 2. Projecting Syntactic Behavior from </SectionTitle> <Paragraph position="0"/> <Section position="1" start_page="0" end_page="243" type="sub_section"> <SectionTitle> Deep Semantic Types </SectionTitle> <Paragraph position="0"> The purpose of the research is to experiment with automatic acquisition of semantic tags for words in a sublanguage, tags that go well beyond those available from the seeding of machine-readable dictionaries (MRDs). The identification of semantic tags for a word associated with particular lexical forms (i.e.</Paragraph> <Paragraph position="1"> semantic collocations) can be represented as that part of the lexical structure of a word called the projective conclusion space (Pustejovsky (1991)).</Paragraph> <Paragraph position="2"> For this work, we will need to define several semantic notions. These include: type coercion, where a lexical item requires a specific type specification for its argument, and the argument is able to change type accordingly, which explains the behavior of logical metonymy and the syntactic variation seen in complements to verbs and nominals; cospecification, a semantic tagging of what collocational patterns the lexical item may enter into; and contextual opacity/transparency, which characterizes just how a word is used in particular contexts. Formally, we will identify this property with specific cospecification values for the lexical item (cf. Pustejovsky (forthcoming)). Metonymy, in this view, can be seen as a case of the &quot;licensed violation&quot; of selectional restrictions. For example, while the verb announce selects for a human subject, sentences like The Dow Corporation announced third quarter losses are not only acceptable paraphrases of the selectionally correct form Mr. Dow Jr.
announced third quarter losses for Dow Corp., but they are the preferred form in the corpora being examined (i.e.</Paragraph> <Paragraph position="3"> the ACL-DCI WSJ and TIPSTER Corpora). This is an example of subject type coercion, where the semantics for Dow Corp. as a company must specify that there is a human typically associated with such official pronouncements (Bergler (forthcoming)).</Paragraph> </Section> <Section position="2" start_page="243" end_page="244" type="sub_section"> <SectionTitle> 2.1. Coercive Environments in Corpora </SectionTitle> <Paragraph position="0"> Another example of type coercion is that seen in the complements of verbs such as begin, enjoy, finish, etc. That is, in sentences such as &quot;John began the book&quot;, the complement normally expected is an action or event of some sort, most often expressed by a gerundive or infinitival phrase: &quot;John began reading the book&quot;, &quot;John began to read the book&quot;. In Pustejovsky (1991) it is argued that in such cases the verb need not have multiple subcategorizations, but only one deep semantic type, in this case an event. Thus, the verb 'coerces' its complement (e.g.</Paragraph> <Paragraph position="1"> &quot;the book&quot;) into an event related to that object. Such information can be represented by means of a schema called qualia structure, which, among other things, specifies the relations associated with objects. In related work being carried out with Mats Rooth of AT&amp;T, we are exploring what the range of coercion types is, and what environments they may appear in, as discovered in corpora. Some of our initial data suggest that the hypothesis of deep semantic selection may in fact be correct, and also indicate what the nature of the coercion rules may be.
The examples below, obtained using techniques described in Church and Hindle (1990), Church and Hanks (1990), and Hindle and Rooth (1991), are some of the most frequent V-O pairs from the AP corpus.</Paragraph> <Paragraph position="2"> Counts for &quot;objects&quot; of begin/V: Corpus studies confirm similar results for &quot;weakly intensional contexts&quot; (Pustejovsky (1991)), such as the complements of coercive verbs like veto. These are interesting because, regardless of the noun type appearing as complement, it is embedded within a semantic interpretation of &quot;the proposal to&quot;, thereby clothing the complement within an intensional context. The examples below with the verb veto indicate two things: first, that such coercions are regular and pervasive in corpora; second, that almost anything can be vetoed, but that the most frequently occurring objects are closest to the type selected by the verb.</Paragraph> <Paragraph position="3"> What these data show is that the highest-count complement types match the type required by the verb; namely, that one vetoes a bill or proposal to do something, not the thing itself. These nouns can therefore be used with some predictive certainty for inducing the semantic type in coercive environments such as &quot;veto the expedition.&quot; This work is still preliminary, however, and requires further examination (Pustejovsky and Rooth (in preparation)).</Paragraph> </Section> </Section> <Section position="4" start_page="244" end_page="246" type="metho"> <SectionTitle> 3. Implications for Natural Language </SectionTitle> <Paragraph position="0"> The framework proposed here is attractive for NLP for at least two reasons. First, it can be formalized, and can thus form the basis for a computational procedure for word interpretation in context. Second, it does not require exhaustive enumeration of all the different ways in which a word can behave, in particular in collocations with other words.
Consequently, the framework can naturally cope with the 'creative' use of language; that is, the open-ended nature of word combinations and their associated meanings.</Paragraph> <Paragraph position="1"> The method of fine-grained characterization of lexical entries, as proposed here, effectively allows us to conflate different word senses (in the traditional meaning of this term) into a single meta-entry, thereby offering great potential not only for systematically encoding regularities of word behavior dependent on context, but also for greatly reducing the size of the lexicon. Following Pustejovsky and Anick (1988), we call such meta-entries lexical conceptual paradigms (LCPs). The theoretical claim here is that such a characterization constrains what a possible word meaning can be, through the mechanism of logically well-formed semantic expressions. The expressive power of a KR formalism can then be viewed as simply a tool which gives substance to this claim.</Paragraph> <Paragraph position="2"> The notion of a meta-entry turns out to be very useful for capturing the systematic ambiguities which are so pervasive throughout language. Among the alternations captured by LCPs are the following (see Pustejovsky (forthcoming) and Levin (1989)): 1. Count/Mass alternations; e.g. sheep.</Paragraph> <Paragraph position="3"> 2. Container/Containee alternations; e.g. bottle.</Paragraph> <Paragraph position="4"> 3. Figure/Ground Reversals; e.g. door, window.</Paragraph> <Paragraph position="5"> 4. Product/Producer diathesis; e.g. newspaper, IBM, Ford.</Paragraph> <Paragraph position="6"> For example, an apparently unambiguous noun such as newspaper can appear in many semantically distinct contexts. 1. The coffee cup is on top of the newspaper.</Paragraph> <Paragraph position="7"> 2. The article is in the newspaper.</Paragraph> <Paragraph position="8"> 3. The newspaper attacked the senator from Massachusetts. 4.
The newspaper is hoping to fire its editor next month.</Paragraph> <Paragraph position="9"> This noun falls into a particular specialization of the Product/Producer paradigm, where the noun can logically denote either the organization or the product produced by the organization. This is another example of logical polysemy and is represented explicitly in the lexical structure for newspaper (Pustejovsky (1991)). Another class of logically polysemous nominals is a specialization of the process/result nominals such as merger, joint venture, consolidation, etc. Examples of how these nominals pattern syntactically in text are given below: 1. Trustcorp Inc. will become Society Bank &amp; Trust when its merger is completed with Society Corp. of Cleveland, the bank said.</Paragraph> <Paragraph position="10"> 2. Shareholders must approve the merger at general meetings of the two companies in late November.</Paragraph> <Paragraph position="11"> 3. But Mr. Rey brought about a merger in the next few years between the country's major producers.</Paragraph> <Paragraph position="12"> 4. A pharmaceutical joint venture of Johnson &amp; Johnson and Merck agreed in principle to buy the U.S. over-the-counter drug business of ICI Americas for over $450 million.</Paragraph> <Paragraph position="13"> 5. The four-year-old business is turning a small profit and the entrepreneurs are about to sign a joint venture agreement with a Moscow cooperative to export the yarn to the Soviet Union.</Paragraph> <Paragraph position="14"> Because of their semantic type, these nominals enter into an LCP which generates a set of structural templates predicted for that noun in the language. For example, the LCP in this case is the union concept, and has the following lexical structure associated with it: 5. Plant/Food alternations; e.g. fig, apple.</Paragraph> <Paragraph position="15"> 6. Process/Result diathesis; e.g. examination, combination. 7. Place/People diathesis; e.g.
city, New York.</Paragraph> <Paragraph position="16"> LCP: type: union [ Const: >2x:entity(x) ] [ Form: exists(y) [entity(y)] ] [ Agent: type:event * join(x) ] [ Telic: nil ] This states that a union is an event which brings about one entity from two or more, and comes about by a joining event. The lexical structure for the nominal merger is inherited from this paradigm.</Paragraph> <Paragraph position="17"> merger(*x*) [ Const: ({w}>2) [company(w) or firm(w)] ] [ Form: exists(y) [company(y)] ] [ Agent: event(*x*): join(*x*,{w}) ] [ Telic: contextual ] It is interesting to note that all synonyms for this word (or, alternatively, all words viewed as clustered under this concept) will share the same LCP behavior: e.g. merging, unification, coalition, combination, consolidation, etc. With this LCP there are associated syntactic realization patterns for how the word and its arguments are realized in text. Such a paradigm is a very generic, domain-independent set of schemas, which is a significant point for multi-domain and multi-task NLP applications. For the particular LCP of union, the syntactic schemas include the following:</Paragraph> <Paragraph position="19"> merger between x and y; merger of the two companies; merger between two companies. There are several things to note here. First, such paradigmatic behavior is extremely regular for nouns in a language, and as a result, the members of such paradigms can be found using knowledge acquisition techniques from large corpora (cf. Anick and Pustejovsky (1990) for one such algorithm). Second, because these are very common nominal patterns for nouns such as merger, it is significant when the noun appears without all arguments explicitly expressed.
For example, in (5) below, presuppositions from the lexical structure combine with discourse clues in the form of definite reference in the noun phrase (the merger) to suggest that the other partner in the merger was mentioned previously in the text.</Paragraph> <Paragraph position="20"> 5. Florida National said yesterday that it remains committed to the merger.</Paragraph> <Paragraph position="21"> Similarly powerful inferences can be made from an indefinite nominal when introduced into the discourse as in (6). Here, there is a strong presupposition that both partners in the merger are mentioned someplace in the immediately local context, e.g. as a coordinate subject, since the NP is a newly mentioned entity.</Paragraph> <Paragraph position="22"> 6. Orkem and Coates said last Wednesday that the two were considering a merger, through Orkem's British subsidiary, Orkem Coatings U.K. Ltd.</Paragraph> <Paragraph position="23"> Thus, the lexical structures provide a rich set of schemas for argument mapping and semantic inferencing, as well as directed presuppositions for discontinuous semantic relations.</Paragraph> <Paragraph position="24"> One final and important note concerns lexical structures and paradigmatic behavior. The seed information for these structures is largely derivable from machine-readable dictionaries. For example, a dictionary definition for merger (from the Longman Dictionary of Contemporary English) is &quot;the joining of 2 or more companies or firms&quot; with subject code FINANCE. This makes the task of automatic construction of a robust lexicon for NLP applications a realizable goal (cf. Boguraev (1991) and Wilks et al. (1991)).</Paragraph> <Paragraph position="25"> 4.
Induction of Semantic Relations from</Paragraph> <Section position="1" start_page="245" end_page="246" type="sub_section"> <SectionTitle> Syntactic Forms </SectionTitle> <Paragraph position="0"> From the discussion in the previous section, it should be clear that such paradigmatic information would be helpful if available. In this section, we present preliminary results indicating the feasibility of learning LCPs from corpora, both tagged and untagged. Imagine being able to take V-O pairs such as those given in section 2.1, and then applying semantic tags to the verbs which are appropriate to the role they play for that object (i.e. induction of the qualia roles for that noun). This is in fact the type of experiment reported on in Anick and Pustejovsky (1990). Here we apply a similar technique to a much larger corpus, in order to induce the agentive role for nouns, that is, the semantic predicate associated with bringing about the object.</Paragraph> <Paragraph position="1"> In this example we look at the behavior of noun phrases and the prepositional phrases that follow them. In particular, we look at the co-occurrence of nominals with between, with, and to. Table 1 shows results of conflating verb/noun plus preposition patterns. The percentage shown indicates the ratio of the particular collocation to the key word. Mutual information (MI) statistics for the two words in collocation are also shown. What these results indicate is that induction of semantic type from conflating syntactic patterns is possible. Based on the semantic types for these prepositions, the syntactic evidence suggests that there is a symmetric relation between the arguments in the following two patterns: a. Z with y = λx∃Rz, y[Rz(x, y) ∧ Rz(y, x)] b.
Z between x and y = ∃Rz, x, y[Rz(x, y) ∧ Rz(y, x)] We then take these results and, for those nouns where the association ratios for N with and N between are similar, we pair them with the set of verbs governing these &quot;NP PP&quot; combinations in the corpus, effectively partitioning the original V-O set into [+agentive] predicates and [-agentive] predicates. If our hypothesis is correct, we expect that verbs governing nominals collocated with a with-phrase will be mostly those predicates referring to the agentive quale of the nominal. This is because the with-phrase is unsaturated as a predicate, and acts to identify the agent of the verb as its argument. This is confirmed by our data, shown below.</Paragraph> <Paragraph position="2"> Conversely, verbs governing nominals collocating with a between-phrase will not refer to the agentive since the phrase is saturated already. Indeed, the only verb occurring in this position with any frequency is the copula be, namely with the following counts: 12 be/V venture/O.</Paragraph> <Paragraph position="3"> Thus, weak semantic types can be induced on the basis of syntactic behavior. In Pustejovsky et al. (1991), we discuss how this general technique compares to somewhat different but related approaches described in Smadja (1991) and Zernik and Jacobs (1991).</Paragraph> </Section> </Section> </Paper>