<?xml version="1.0" standalone="yes"?> <Paper uid="W06-2407"> <Title>Extending corpus-based identification of light verb constructions using a supervised learning framework</Title> <Section position="2" start_page="0" end_page="49" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> Many applications in natural language processing rely on the relationships between words in a document. Verbs play a central role in many such tasks; for example, the assignment of semantic roles to noun phrases in a sentence depends heavily on the verb that links the noun phrases together (as in &quot;Pierre Vinken/SUBJ, will join/PRED, the board/OBJ&quot;).</Paragraph> <Paragraph position="1"> However, verb processing is difficult because of many phenomena, such as nominalization of actions, verb particle constructions and light verb constructions. Applications that process verbs must handle these cases effectively. We focus on the identification of light verb constructions (also known as support verb constructions) in English. Such constructions also play a prominent and productive role in many other languages (Butt and Geuder, 2001; Miyamoto, 2000). Although the exact definition of an LVC varies in the literature, we use the following operational definition: A light verb construction (LVC) is a verb-complement pair in which the verb has little lexical meaning (is &quot;light&quot;) and much of the semantic content of the construction is obtained from the complement. Examples of LVCs in English include &quot;give a speech&quot;, &quot;make good (on)&quot; and &quot;take (NP) into account&quot;. 
When the complement is a noun, it is often a deverbal noun and, as such, can usually be paraphrased using the object's root verb form without (much) loss in its meaning (e.g., take a walk - walk, make a decision - decide, give a speech - speak).</Paragraph> <Paragraph position="2"> We propose a corpus-based approach to determine whether a verb-object pair is an LVC.</Paragraph> <Paragraph position="3"> Note that we limit the scope of LVC detection to LVCs consisting of verbs with noun complements.</Paragraph> <Paragraph position="4"> Specifically, we extend previous work by examining how the local context of the candidate construction and the corpus-wide frequency of words related to the construction influence the lightness of the verb.</Paragraph> <Paragraph position="5"> A second contribution is to integrate our new features with previously reported ones under a machine learning framework. This framework uses supervised learning to optimize the weights for these measures automatically against a training corpus, and confirms the significant modeling improvements that our features yield on our corpus. Our corpus-based evaluation shows that the combination of previous work and our new features improves LVC detection significantly over previous work.</Paragraph> <Paragraph position="6"> An advantage gained by adopting a machine learning framework is that it can be easily adapted to other languages that also exhibit light verbs. While we perform evaluations on English, light verbs exist in most other languages. In some of these languages, such as Persian, most actions are expressed as LVCs rather than single-word verbs (Butt, 2003). As such, there is currently an unmet demand for an adaptable framework for LVC detection that applies across languages. 
We believe the features proposed in this paper would also be effective in identifying light verbs in other languages.</Paragraph> <Paragraph position="7"> We first review previous corpus-based approaches to LVC detection in Section 2. In Section 3, we show how we extend the use of mutual information and employ context modeling as features for improved LVC detection. We next describe our corpus processing and how we compiled our gold standard judgments used for supervised machine learning. In Section 4, we evaluate several feature combinations before concluding the paper.</Paragraph> </Section> </Paper>