<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-2609">
  <Title>Models for the Semantic Classification of Noun Phrases</Title>
  <Section position="2" start_page="0" end_page="0" type="metho">
    <SectionTitle>
1 Problem description
</SectionTitle>
    <Paragraph position="0"> This paper is about the automatic labeling of semantic relations in noun phrases (NPs).</Paragraph>
    <Paragraph position="1"> The semantic relations are the underlying relations between two concepts expressed by words or phrases. We distinguish here between semantic relations and semantic roles. Semantic roles are always between verbs (or nouns derived from verbs) and other constituents (run quickly, went to the store, computer maker), whereas semantic relations can occur between any constituents, for example in complex nominals (malaria mosquito (CAUSE)), genitives (girl's mouth (PART-WHOLE)), prepositional phrases attached to nouns (man at the store (LOCATIVE)), or discourse level (The bus was late. As a result, I missed my appointment (CAUSE)). Thus, in a sense, semantic relations are more general than semantic roles and many semantic role types will appear on our list of semantic relations.</Paragraph>
    <Paragraph position="2"> The following NP level constructions are considered here (cf. the classifications provided by (Quirk et al.1985) and (Semmelmeyer and Bolander 1992)):  (1) Compound Nominals consisting of two consecutive  nouns (eg night club - a TEMPORAL relation - indicating that club functions at night), (2) Adjective Noun constructions where the adjectival modifier is derived from a noun (eg musical clock - a MAKE/PRODUCE relation), (3) Genitives (eg the door of the car - a PART-WHOLE relation), and (4) Adjective phrases (cf. (Semmelmeyer and Bolander 1992)) in which the modifier noun is expressed by a prepositional phrase which functions as an adjective (eg toy in the box - a LOCATION relation).</Paragraph>
    <Paragraph position="3"> Example: &amp;quot;Saturday's snowfall topped a one-day record in Hartford, Connecticut, with the total of 12.5 inches, the weather service said. The storm claimed its fatality Thursday, when a car which was driven by a college student skidded on an interstate overpass in the mountains of Virginia and hit a concrete barrier, police said&amp;quot;.</Paragraph>
    <Paragraph position="5"> winding down&amp;quot;, Sunday, December 7, 2003).</Paragraph>
    <Paragraph position="6"> There are several semantic relations at the noun phrase level: (1) Saturday's snowfall is a genitive encoding a TEMPORAL relation, (2) one-day record is a TOPIC noun compound indicating that record is about one-day snowing - an ellipsis here, (3) record in Hartford is an adjective phrase in a LOCATION relation, (4) total of 12.5 inches is an of-genitive that expresses MEASURE, (5) weather service is a noun compound in a TOPIC relation, (6) car which was driven by a college student encodes a THEME semantic role in an adjectival clause, (7) college student is a compound nominal in a PART-WHOLE/MEMBER-OF relation, (8) interstate overpass is a LOCATION noun compound, (9) mountains of Virginia is an of-genitive showing a PART-WHOLE/PLACE-AREA and LOCATION relation, (10) concrete barrier is a noun compound encoding PART-WHOLE/STUFF-OF.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
1.1 List of Semantic Relations
</SectionTitle>
      <Paragraph position="0"> After many iterations over a period of time we identified a set of semantic relations that cover a large majority of text semantics. Table 1 lists these relations, their definitions, examples, and some references. Most of the time, the semantic relations are encoded by lexico-syntactic patterns that are highly ambiguous. One pattern can express a number of semantic relations, its disambiguation being provided by the context or world knowledge. Often semantic relations are not disjoint or mutually exclusive, two or more appearing in the same lexical construct. This is called semantic blend (Quirk et al.1985). For example, the expression &amp;quot;Texas city&amp;quot; contains both a LOCATION as well as a PART-WHOLE relation.</Paragraph>
      <Paragraph position="1"> Other researchers have identified other sets of semantic relations (Levi 1979), (Vanderwende 1994), (Sowa 1994), (Baker, Fillmore, and Lowe 1998), (Rosario and Hearst 2001), (Kingsbury, et al. 2002), (Blaheta and Charniak 2000), (Gildea and Jurafsky 2002), (Gildea and Palmer 2002). Our list contains the most frequently used semantic relations we have observed on a large corpus. null Besides the work on semantic roles, considerable interest has been shown in the automatic interpretation of complex nominals, and especially of compound nominals. The focus here is to determine the semantic relations that hold between different concepts within the same phrase, and to analyze the meaning of these compounds. Several approaches have been proposed for empirical noun-compound interpretation, such as syntactic analysis based on statistical techniques (Lauer and Dras 1994), (Pustejovsky et al. 1993). Another popular approach focuses on the interpretation of the underlying semantics. Many researchers that followed this approach relied mostly on hand-coded rules (Finin 1980), (Vanderwende 1994). More recently, (Rosario and Hearst 2001), (Rosario, Hearst, and Fillmore 2002), (Lapata 2002) have proposed automatic methods that analyze and detect noun compounds relations from text. (Rosario and Hearst 2001) focused on the medical domain making use of a lexical ontology and standard machine learning techniques. null</Paragraph>
    </Section>
  </Section>
  <Section position="3" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2 Approach
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.1 Basic Approach
</SectionTitle>
      <Paragraph position="0"> We approach the problem top-down, namely identify and study first the characteristics or feature vectors of each noun phrase linguistic pattern, then develop models for their semantic classification. This is in contrast to our prior approach ( (Girju, Badulescu, and Moldovan 2003a)) when we studied one relation at a time, and learned constraints to identify only that relation. We study the distribution of the semantic relations across different NP patterns and analyze the similarities and differences among resulting semantic spaces. We define a semantic space as the set of semantic relations an NP construction can encode. We aim at uncovering the general aspects that govern the NP semantics, and thus delineate the semantic space within clusters of semantic relations.</Paragraph>
      <Paragraph position="1"> This process has the advantage of reducing the annotation effort, a time consuming activity. Instead of manually annotating a corpus for each semantic relation, we do it only for each syntactic pattern and get a clear view of its semantic space. This syntactico-semantic approach allows us to explore various NP semantic classification models in a unified way.</Paragraph>
      <Paragraph position="2"> This approach stemmed from our desire to answer questions such as:  1. What influences the semantic interpretation of various linguistic constructions? 2. Is there only one interpretation system/model that works best for all types of expressions at all syntactic levels? and 3. What parameters govern the models capable of semantic interpretation of various syntactic constructions?</Paragraph>
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2 Semantic Relations at NP level
</SectionTitle>
      <Paragraph position="0"> It is well understood and agreed in linguistics that concepts can be represented in many ways using various constructions at different syntactic levels. This is in part why we decided to take the syntactico-semantic approach that analyzes semantic relations at different syntactic levels of representation. In this paper we focus only on the behavior of semantic relations at NP level. A thorough understanding of the syntactic and semantic characteristics of NPs provides valuable insights into defining the most representative feature vectors that ultimately drive the discriminating learning models.</Paragraph>
      <Paragraph position="1"> Complex Nominals Levi (Levi 1979) defines complex nominals (CNs) as expressions that have a head noun preceded by one or more modifying nouns, or by adjectives derived from nouns (usually called denominal adjectives). Most importantly for us, each sequence of nouns, or possibly adjectives and nouns, has a particular meaning as a whole carrying an implicit semantic relation; for example, &amp;quot;spoon handle&amp;quot; (PART-WHOLE) or &amp;quot;musical clock&amp;quot; (MAKE/PRODUCE).</Paragraph>
      <Paragraph position="2"> CNs have been studied intensively in linguistics, psycho-linguistics, philosophy, and computational linguistics for a long time. The semantic interpretation of CNs proves to be very difficult for a number of reasons. (1) Sometimes the meaning changes with the head (eg &amp;quot;musical clock&amp;quot; MAKE/PRODUCE, &amp;quot;musical creation&amp;quot; THEME), other times with the modifier (eg &amp;quot;GM car&amp;quot; MAKE/PRODUCE, &amp;quot;family car&amp;quot; POSSESSION). (2) CNs' interpretation is knowledge intensive and can be idiosyncratic. For example, in order to interpret correctly &amp;quot;GM car&amp;quot; we have to know that GM is a car-producing company. (3) There can be many possible semantic relations between a given pair of word constituents. For example, &amp;quot;USA city&amp;quot; can be regarded as a LOCATION as well as a PART-WHOLE relation. (4) Interpretation of CNs can be highly context-dependent. For example, &amp;quot;apple juice seat&amp;quot; can be defined as &amp;quot;seat with apple juice on the table in front of it&amp;quot; (cf. (Downing 1977)).</Paragraph>
      <Paragraph position="3">  4 AGENT the doer or instigator of the action denoted by the predicate; (employee protest; parental approval; The king banished the general.); (Baker, Fillmore, and Lowe 1998) 5 TEMPORAL time associated with an event; (5-o'clock tea; winter training; the store opens at 9 am), includes DURATION (Navigli and Velardi 2003), 6 DEPICTION- an event/action/entity depicting another event/action/entity; (A picture of my niece.), DEPICTED 7 PART-WHOLE an entity/event/state is part of another entity/event/state (door knob; door of the car), (MERONYMY) (Levi 1979), (Dolan et al. 1993), 8 HYPERNYMY an entity/event/state is a subclass of another; (daisy flower; Virginia state; large company, such as Microsoft) (IS-A) (Levi 1979), (Dolan et al. 1993) 9 ENTAIL an event/state is a logical consequence of another; (snoring entails sleeping) 10 CAUSE an event/state makes another event/state to take place; (malaria mosquitoes; to die of hunger; The earthquake generated a Tsunami), (Levi 1979) 11 MAKE/PRODUCE an animated entity creates or manufactures another entity; (honey bees; nuclear power plant; GM makes cars) (Levi 1979) 12 INSTRUMENT an entity used in an event/action as instrument; (pump drainage; the hammer broke the box) (Levi 1979) 13 LOCATION/SPACE spatial relation between two entities or between an event and an entity; includes DIRECTION; (field mouse; street show; I left the keys in the car), (Levi 1979), (Dolan et al. 1993) 14 PURPOSE a state/action intended to result from a another state/event; (migraine drug; wine glass; rescue mission; He was quiet in order not to disturb her.) (Navigli and Velardi 2003) 15 SOURCE/FROM place where an entity comes from; (olive oil; I got it from China) (Levi 1979) 16 TOPIC an object is a topic of another object; (weather report; construction plan; article about terrorism); (Rosario and Hearst 2001) 17 MANNER a way in which an event is performed or takes place; (hard-working immigrants; enjoy immensely; he died of cancer); (Blaheta and Charniak 2000) 18 MEANS the means by which an event is performed or takes place; (bus service; I go to school by bus.) (Quirk et al.1985) 19 ACCOMPANIMENT one/more entities accompanying another entity involved in an event; (meeting with friends; She came with us) (Quirk et al.1985) 20 EXPERIENCER an animated entity experiencing a state/feeling; (Mary was in a state of panic.); (Sowa 1994) 21 RECIPIENT an animated entity for which an event is performed; (The eggs are for you) ; includes BENEFICIARY; (Sowa 1994) 22 FREQUENCY number of occurrences of an event; (bi-annual meeting; I take the bus every day); (Sowa 1994) 23 INFLUENCE an entity/event that affects other entity/event; (drug-affected families; The war has an impact on the economy.); 24 ASSOCIATED WITH an entity/event/state that is in an (undefined) relation with another entity/event/state; (Jazz-associated company;) 25 MEASURE an entity expressing quantity of another entity/event; (cup of sugar; 70-km distance; centennial rite; The jacket cost $60.) 
26 SYNONYMY a word/concept that means the same or nearly the same as another word/concept; (NAME) (Marry is called Minnie); (Sowa 1994) 27 ANTONYMY a word/concept that is the opposite of another word/concept; (empty is the opposite of full); (Sowa 1994) 28 PROBABILITY OF the quality/state of being probable; likelihood EXISTENCE (There is little chance of rain tonight); (Sowa 1994) 29 POSSIBILITY the state/condition of being possible; (I might go to Opera tonight); (Sowa 1994) 30 CERTAINTY the state/condition of being certain or without doubt; (He definitely left the house this morning); 31 THEME an entity that is changed/involved by the action/event denoted by the predicate; (music lover; John opened the door.); (Sowa 1994) 32 RESULT the inanimate result of the action/event denoted by the predicate; includes EFFECT and PRODUCT. (combustion gases; I finished the task completely.); (Sowa 1994) 33 STIMULUS stimulus of the action or event denoted by the predicate (We saw [the painting]. I sensed [the eagerness] in him. I can see [that you are feeling great].) (Baker, Fillmore, and Lowe 1998) 34 EXTENT the change of status on a scale (by a percentage or by a value) of some entity; (The price of oil increased [ten percent]. Oil's price increased by [ten percent]. ); (Blaheta and Charniak 2000) 35 PREDICATE expresses the property associated with the subject or the object through the verb; (He feels [sleepy]. They elected him [treasurer]. ) (Blaheta and Charniak 2000)  and references.</Paragraph>
      <Paragraph position="4"> is considered problematic by linguists because they involve an implicit relation that seems to allow for a large variety of relational interpretations; for example: &amp;quot;John's car&amp;quot;-POSSESSOR-POSSESSEE, &amp;quot;Mary's brother&amp;quot;-KINSHIP, &amp;quot;last year's exhibition&amp;quot;-TEMPORAL, &amp;quot;a picture of my nice&amp;quot;-DEPICTION-DEPICTED, and &amp;quot;the desert's oasis&amp;quot;-PART-WHOLE/PLACE-AREA. A characteristic of these constructions is that they are very productive, as the construction can be given various interpretations depending on the context. One such example is &amp;quot;Kate's book&amp;quot; that can mean the book Kate owns, the book Kate wrote, or the book Kate is very fond of.</Paragraph>
      <Paragraph position="5"> Thus, the features that contribute to the semantic interpretation of genitives are: the nouns' semantic classes, the type of genitives, discourse and pragmatic information. null Adjective Phrases are prepositional phrases attached to nouns acting as adjectives (cf. (Semmelmeyer and Bolander 1992)). Prepositions play an important role both syntactically and semantically. Semantically speaking, prepositional constructions can encode various semantic relations, their interpretations being provided most of the time by the underlying context. For instance, the preposition &amp;quot;with&amp;quot; can encode different semantic re- null lations: (1) It was the girl with blue eyes (MERONYMY), (2) The baby with the red ribbon is cute (POSSESSION), (3) The woman with triplets received a lot of attention  (KINSHIP).</Paragraph>
      <Paragraph position="6"> The conclusion for us is that in addition to the nouns semantic classes, the preposition and the context play important roles here.</Paragraph>
      <Paragraph position="7"> In order to focus our research, we will concentrate for now only on noun - noun or adjective - noun compositional constructions at NP level, ie those whose meaning can be derived from the meaning of the constituent nouns (&amp;quot;door knob&amp;quot;, &amp;quot;cup of wine&amp;quot;). We don't consider metaphorical names (eg, &amp;quot;ladyfinger&amp;quot;), metonymies (eg, &amp;quot;Vietnam veteran&amp;quot;), proper names (eg, &amp;quot;John Doe&amp;quot;), and NPs with coordinate structures in which neither noun is the head (eg, &amp;quot;player-coach&amp;quot;). However, we check if the constructions are non-compositional (lexicalized) (the meaning is a matter of convention; e.g., &amp;quot;soap opera&amp;quot;, &amp;quot;sea lion&amp;quot;), but only for statistical purposes. Fortunately, some of these can be identified with the help of lexicons.</Paragraph>
    </Section>
    <Section position="3" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.3 Corpus Analysis at NP level
</SectionTitle>
      <Paragraph position="0"> In order to provide a unified approach for the detection of semantic relations at different NP levels, we analyzed the syntactic and semantic behavior of these constructions on a large open-domain corpora of examples. Our intention is to answer questions like: (1) What are the semantic relations encoded by the NP-level constructions?, (2) What is their distribution on a large corpus?, (3) Is there a common subset of semantic relations that can be fully paraphrased by all types of NP constructions?, (4) How many NPs are lexicalized? The data We have assembled a corpus from two sources: Wall Street Journal articles from TREC-9, and eXtended WordNet glosses (XWN) (http://xwn.hlt.utdallas.edu).</Paragraph>
      <Paragraph position="1"> We used XWN 2.0 since all its glosses are syntactically parsed and their words semantically disambiguated which saved us considerable amount of time. Table 2 shows for each syntactic category the number of randomly selected sentences from each corpus, the number of instances found in these sentences, and finally the number of instances that our group managed to annotate by hand. The annotation of each example consisted of specifying its feature vector and the most appropriate semantic relation from those listed in Table 1.</Paragraph>
    </Section>
    <Section position="4" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Inter-annotator Agreement
</SectionTitle>
      <Paragraph position="0"> The annotators, four PhD students in Computational Semantics worked in groups of two, each group focusing on one half of the corpora to annotate. Noun - noun (adjective - noun, respectively) sequences of words were extracted using the Lauer heuristic (Lauer 1995) which looks for consecutive pairs of nouns that are neither preceded nor succeeded by a noun after each sentence was syntactically parsed with Charniak parser (Charniak 2001) (for XWN we used the gold parse trees). Moreover, they were provided with the sentence in which the pairs occurred along with their corresponding WordNet senses. Whenever the annotators found an example encoding a semantic relation other than those provided or they didn't know what interpretation to give, they had to tag it as &amp;quot;OTHERS&amp;quot;. Besides the type of relation, the annotators were asked to provide information about the order of the modifier and the head nouns in the syntactic constructions if applicable. For instance, in &amp;quot;owner of car&amp;quot;-POSSESSION the possessor owner is followed by the possessee car, while in &amp;quot;car of John&amp;quot;-POSSESSION/R the order is reversed. On average, 30% of the training examples had the nouns in reverse order.</Paragraph>
      <Paragraph position="1"> Most of the time, one instance was tagged with one semantic relation, but there were also situations in which an example could belong to more than one relation in the same context. For example, the genitive &amp;quot;city of USA&amp;quot; was tagged as a PART-WHOLE/PLACE-AREA relation and as a LOCATION relation. Overall, there were 608 such cases in the training corpora. Moreover, the annotators were asked to indicate if the instance was lexicalized or not. Also, the judges tagged the NP nouns in the training corpus with their corresponding WordNet senses.</Paragraph>
      <Paragraph position="2"> The annotators' agreement was measured using the Kappa statistics, one of the most frequently used measure of inter-annotator agreement for classification tasks:</Paragraph>
      <Paragraph position="4"> , where a23a25a24a27a26a29a28a31a30 is the proportion of times the raters agree and a23a25a24a27a26a29a32a33a30 is the probability of agreement by chance. The K coefficient is 1 if there is a total agreement among the annotators, and 0 if there is no agreement other than that expected to occur by chance.</Paragraph>
      <Paragraph position="5"> Table 3 shows the semantic relations inter-annotator agreement on both training and test corpora for each NP construction. For each construction, the corpus was splint into 80/20 training/testing ratio after agreement.</Paragraph>
      <Paragraph position="6"> We computed the K coefficient only for those instances tagged with one of the 35 semantic relations. For each pattern, we also computed the number of pairs that were tagged with OTHERS by both annotators, over the number of examples classified in this category by at least one of the judges, averaged by the number of patterns considered. null The K coefficient shows a fair to good level of agreement for the training and testing data on the set of 35 relations, taking into consideration the task difficulty. This can be explained by the instructions the annotators received prior to annotation and by their expertise in lexical semantics. There were many heated discussions as well.</Paragraph>
      <Paragraph position="7">  and test corpora. For the semantic blend examples, the agreement was done on one of the relations only.</Paragraph>
    </Section>
    <Section position="5" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.4 Distribution of Semantic Relations
</SectionTitle>
      <Paragraph position="0"> Even noun phrase constructions are very productive allowing for a large number of possible interpretations, Table 4 shows that a relatively small set of 35 semantic relations covers a significant part of the semantic distribution of these constructions on a large open-domain corpus. Moreover, the distribution of these relations is dependent on the type of NP construction, each type encoding a particular subset. For example, in the case of of-genitives, there were 21 relations found from the total of 35 relations considered. The most frequently occurring relations were PART-WHOLE, ATTRIBUTE-HOLDER, POSSESSION, LOCATION, SOURCE, TOPIC, and THEME.</Paragraph>
      <Paragraph position="1"> By comparing the subsets of semantic relations in each column we can notice that these semantic spaces are not identical, proving our initial intuition that the NP constructions cannot be alternative ways of packing the same information. Table 4 also shows that there is a subset of semantic relations that can be fully encoded by all types of NP constructions. The statistics about the lexicalized examples are as follows: N-N (30.01%), Adj-N (0%), s-genitive (0%), of-genitive (0%), adjective phrase (1%). From the 30.01% lexicalized noun compounds , 18% were proper names.</Paragraph>
      <Paragraph position="2"> This simple analysis leads to the important conclusion that the NP constructions must be treated separately as their semantic content is different. This observation is also partially consistent with other recent work in linguistics and computational linguistics on the grammatical variation of the English genitives, noun compounds, and adjective phrases.</Paragraph>
      <Paragraph position="3"> We can draw from here the following conclusions:  1. Not all semantic relations can be encoded by all NP syntactic constructions.</Paragraph>
      <Paragraph position="4"> 2. There are semantic relations that have preferences over particular syntactic constructions.</Paragraph>
    </Section>
    <Section position="6" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.5 Models
</SectionTitle>
      <Paragraph position="0"> Given each NP syntactic construction considered, the goal is to develop a procedure for the automatic labeling of the semantic relations they encode. The semantic relation derives from the lexical, syntactic, semantic and contextual features of each NP construction.</Paragraph>
      <Paragraph position="1"> Semantic classification of syntactic patterns in general can be formulated as a learning problem, and thus benefit from the theoretical foundation and experience gained with various learning paradigms. This is a multi-class classification problem since the output can be one of the semantic relations in the set. We cast this as a supervised learning problem where input/ output pairs are available as training data.</Paragraph>
      <Paragraph position="2"> An important first step is to map the characteristics of each NP construction (usually not numerical) into feature vectors. Let's define witha0a2a1 the feature vector of an instancea3 and leta4 be the space of all instances; iea0a1a6a5 a4 . The multi-class classification is performed by a function that maps the feature spacea4 into a semantic space</Paragraph>
      <Paragraph position="4"> from Table 1, ie a24a14a13 a5 a7 .</Paragraph>
      <Paragraph position="5"> Let a15 be the training set of examples or instances  number of examplesa0 each accompanied by its semantic relation label a24 . The problem is to decide which semantic relation a24 to assign to a new, unseen examplea0a26a35a34 a20 . In order to classify a given set of examples (members ofa4 ), one needs some kind of measure of the similarity (or the difference) between any two given members ofa4 . Most of the times it is difficult to explicitly define this function, sincea4 can contain features with numerical as well as non-numerical values.</Paragraph>
      <Paragraph position="6"> Note that the features, thus spacea4 , vary from an NP pattern to another and the classification function will be pattern dependent. The novelty of this learning problem is the feature spacea4 and the nature of the discriminating  An essential aspect of our approach below is the word sense disambiguation (WSD) of the content words (nouns, verbs, adjectives and adverbs). Using a state-of-the-art open-text WSD system, each word is mapped into its corresponding WordNet 2.0 sense. When disambiguating each word, the WSD algorithm takes into account the surrounding words, and this is one important way through which context gets to play a role in the semantic classification of NPs.</Paragraph>
      <Paragraph position="7"> So far, we have identified and experimented with the following NP features:  1. Semantic class of head noun specifies the WordNet sense (synset) of the head noun and implicitly points to all its hypernyms. It is extracted automatically via a word sense disambiguation module. The NP semantics is influenced heavily by the meaning of the noun constituents. Example: &amp;quot;car manufacturer&amp;quot; is a kind of manufacturer that MAKES/PRODUCES cars.</Paragraph>
      <Paragraph position="8"> 2. Semantic class of modifier noun  specifies the WordNet synset of the modifier noun. In case the modifier is a denominal adjective, we take the synset of the noun from which the adjective is derived. Example: &amp;quot;musical clock&amp;quot; - MAKE/PRODUCE, and &amp;quot;electric clock&amp;quot;- INSTRUMENT.</Paragraph>
      <Paragraph position="9">  Several learning models can be used to provide the discriminating function  . So far we have experimented with three models: (1) semantic scattering, (2) decision trees, and (3) naive Bayes. The first is described below, the other two are fairly well known from the machine learning literature.</Paragraph>
      <Paragraph position="10"> Semantic Scattering. This is a new model developed by us particularly useful for the classification of compound nominals a2 a20 a2 a19 without nominalization. The semantic relation in this case derives from the semantics of the two noun concepts participating in these constructions as well as the surrounding context.</Paragraph>
      <Paragraph position="11"> Model Formulation. Let us define witha7a4a3a6a5 a1a8a7  a1a13 a30 . One way of approximating the feature vectora8 a1a13 is to perform a semantic generalization, by replacing the synsets with their most general hypernyms, followed by a series of specializations for the purpose of eliminating ambiguities in the training data. There are 9 noun hierarchies, thus only 81 possible combinations at the most general level. Table 5 shows a row of the probability matrix a23 a26 a24a1a0  which there is more than one relation, is scattered into other subclasses through an iterative process till there is only one semantic relation per line. This can be achieved by specializing the feature pair's semantic classes with their immediate WordNet hyponyms. The iterative process stops when new training data does not bring any improvements (see Table 6).</Paragraph>
      <Paragraph position="12"> 2.5.4 Overview of the Preliminary Results The f-measure results obtained so far are summarized in Table 7. Overall, these results are very encouraging given the complexity of the problem.</Paragraph>
      <Paragraph position="13">  An important way of improving the performance of a system is to do a detailed error analysis of the results. We have analyzed the sources of errors in each case and found out that most of them are due to (in decreasing order of importance): (1) errors in automatic sense disambiguation, (2) missing combinations of features that occur in testing but not in the training data, (3) levels of specialization are too high, (4) errors caused by metonymy, (6) errors in the modifier-head order, and others. These errors could be substantially decreased with more research effort.</Paragraph>
      <Paragraph position="14"> A further analysis of the data led us to consider a different criterion of classification that splits the examples into nominalizations and non-nominalizations. The reason is that nominalization noun phrases seem to call for a different set of learning features than the non-nominalization noun phrases, taking advantage of the underlying verb-argument structure. Details about this approach are provided in (Girju et al. 2004)).</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Applications
</SectionTitle>
    <Paragraph position="0"> Semantic relations occur with high frequency in open text, and thus, their discovery is paramount for many applications. One important application is Question Answering. A powerful method of answering more difficult questions is to associate to each question the semantic relation that reflects the meaning of that question and then search for that semantic relation over the candidates of semantically tagged paragraphs. Here is an example.</Paragraph>
    <Paragraph position="1"> Q. Where have nuclear incidents occurred? From the question stem word where, we know the question asks for a LOCATION which is found in the complex nominal &amp;quot;Three Mile Island&amp;quot;-LOCATION of the sentence &amp;quot;The Three Mile Island nuclear incident caused a DOE policy crisis&amp;quot;, leading to the correct answer &amp;quot;Three Mile Island&amp;quot;. Q. What did the factory in Howell Michigan make? The verb make tells us to look for a MAKE/PRODUCE relation which is found in the complex nominal &amp;quot;car factory&amp;quot;-MAKE/PRODUCE of the text: &amp;quot;The car factory in Howell Michigan closed on Dec 22, 1991&amp;quot; which leads to answer car.</Paragraph>
    <Paragraph position="2"> Another important application is building semantically rich ontologies. Last but not least, the discovery of text semantic relations can improve syntactic parsing and even WSD which in turn affects directly the accuracy of other NLP modules and applications. We consider these applications for future work.</Paragraph>
  </Section>
class="xml-element"></Paper>