<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-2610">
<Title>Support Vector Machines Applied to the Classification of Semantic Relations in Nominalized Noun Phrases</Title>
<Section position="2" start_page="0" end_page="0" type="intro">
<SectionTitle> 1 Introduction </SectionTitle>
<Paragraph position="0"/>
<Section position="1" start_page="0" end_page="0" type="sub_section">
<SectionTitle> 1.1 Problem description </SectionTitle>
<Paragraph position="0"> The automatic identification of semantic relations in text has become increasingly important in Information Extraction, Question Answering, Summarization, Text Understanding, and other NLP applications. This paper discusses the automatic labeling of semantic relations in nominalized noun phrases (NPs) using a support vector machine (SVM) learning algorithm.</Paragraph>
<Paragraph position="1"> Based on the classification provided by the New Webster's Grammar Guide (Semmelmeyer and Bolander 1992) and our observations of noun phrase patterns in large text collections, the most frequently occurring NP-level constructions are: (1) compound nominals consisting of two consecutive nouns (e.g., pump drainage - an INSTRUMENT relation), (2) adjective-noun constructions in which the adjectival modifier is derived from a noun (e.g., parental refusal - an AGENT relation), (3) genitives (e.g., tone of conversation - a PROPERTY relation), (4) adjective phrases in which the noun modifier is expressed by a prepositional phrase functioning as an adjective (e.g., amusement in the park - a LOCATION relation), and (5) adjective clauses in which the head noun is modified by a relative clause (e.g., the man who was driving the car - an AGENT relation between man and driving).</Paragraph>
</Section>
<Section position="2" start_page="0" end_page="0" type="sub_section">
<SectionTitle> 1.2 Previous work on the discovery of semantic relations </SectionTitle>
<Paragraph position="0"> The development of large semantically annotated corpora, such as Penn Treebank 2 and, more recently, PropBank (Kingsbury et al. 2002), as well as semantic knowledge bases, such as FrameNet (Baker, Fillmore, and Lowe 1998), has stimulated great interest in the automatic acquisition of semantic relations, and especially of semantic roles. In the last few years, many researchers (Blaheta and Charniak 2000), (Gildea and Jurafsky 2002), (Gildea and Palmer 2002), (Pradhan et al. 2003) have focused on the automatic prediction of semantic roles using statistical techniques. These statistical techniques operate on the output of probabilistic parsers and take advantage of characteristic features of the semantic roles, which are then employed in a learning algorithm.
While these systems focus on verb-argument semantic relations, called semantic roles, in this paper we investigate predicate-argument semantic relations in nominalized noun phrases and present a method for their automatic detection in open text.</Paragraph>
</Section>
<Section position="3" start_page="0" end_page="0" type="sub_section">
<SectionTitle> 1.3 Approach </SectionTitle>
<Paragraph position="0"> We approach the problem top-down: we first identify and study the characteristics, or feature vectors, of each noun phrase linguistic pattern, and then develop models for their semantic classification. The distribution of the semantic relations is studied across the different NP patterns, and the similarities and differences among the resulting semantic spaces are analyzed. A thorough understanding of the syntactic and semantic characteristics of NPs provides valuable insights into defining the most representative feature vectors that ultimately drive the discriminating learning models.</Paragraph>
<Paragraph position="1"> An important characteristic of this work is that it relies heavily on state-of-the-art natural language processing and machine learning methods. Prior to the discovery of semantic relations, the text is syntactically parsed with Charniak's parser (Charniak 2001), and words are semantically disambiguated and mapped to their appropriate WordNet senses. The word sense disambiguation is done manually for training and automatically for testing with a state-of-the-art WSD module, an improved version of a system with which we participated successfully in Senseval-2 and which achieves an accuracy of 81% when disambiguating nouns in open-domain text. The discovery of semantic relations is based on learning lexical, syntactic, semantic, and contextual constraints that effectively identify the most probable relation for each NP construction considered.</Paragraph>
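For illustration only, the following minimal sketch (not the authors' implementation) shows how such a classifier might be set up: symbolic features for a few toy NP instances, namely the NP pattern type, coarse WordNet senses of the head and modifier, and any linking preposition, are one-hot encoded and passed to a linear SVM. The feature names, the choice of scikit-learn, and the example instances are assumptions made purely for exposition.

# Hypothetical sketch of SVM-based classification of NP semantic relations
# (illustrative assumptions only; requires scikit-learn).
from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Toy training instances: one feature dictionary per NP, with the NP pattern
# type, coarse WordNet supersenses of head and modifier, and any preposition.
train_X = [
    {"pattern": "compound_nominal", "head_ss": "noun.act",       "mod_ss": "noun.artifact",      "prep": "none"},  # pump drainage
    {"pattern": "adj_noun",         "head_ss": "noun.act",       "mod_ss": "noun.person",        "prep": "none"},  # parental refusal
    {"pattern": "genitive",         "head_ss": "noun.attribute", "mod_ss": "noun.communication", "prep": "of"},    # tone of conversation
    {"pattern": "pp_modifier",      "head_ss": "noun.feeling",   "mod_ss": "noun.location",      "prep": "in"},    # amusement in the park
]
train_y = ["INSTRUMENT", "AGENT", "PROPERTY", "LOCATION"]

# DictVectorizer one-hot encodes the symbolic features; a linear SVM then
# learns to separate the relation classes.
clf = make_pipeline(DictVectorizer(sparse=True), SVC(kernel="linear"))
clf.fit(train_X, train_y)

# Classify a new noun phrase; in practice the features would come from the
# syntactic parser and the WSD module described above.
test_np = {"pattern": "genitive", "head_ss": "noun.attribute",
           "mod_ss": "noun.communication", "prep": "of"}
print(clf.predict([test_np])[0])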
</Section>
</Section>
</Paper>