<?xml version="1.0" standalone="yes"?> <Paper uid="W06-1207"> <Title>Classifying Particle Semantics in English Verb-Particle Constructions</Title> <Section position="4" start_page="45" end_page="46" type="metho"> <SectionTitle> 2 Features Used in Classification </SectionTitle> <Paragraph position="0"> The following subsections describe the two sets of features we investigated. The linguistic features are motivated by specific semantic and syntactic properties of verbs and VPCs, while the word co-occurrence features are more general.</Paragraph> <Section position="1" start_page="45" end_page="45" type="sub_section"> <SectionTitle> 2.1 Linguistically Motivated Features </SectionTitle> <Paragraph position="0"> We hypothesize that the semantic contribution of a particle when combined with a given verb is related to the semantics of that verb. That is, the particle contributes the same meaning when combining with any member of a semantic class of verbs. (Villavicencio (2005) observes that verbs from a semantic class will form VPCs with similar sets of particles; here we hypothesize further that VPCs formed from verbs of a semantic class draw on the same meaning of the given particle.) For example, the VPCs drink up, eat up and gobble up all draw on the completion sense of up; the VPCs puff out, spread out and stretch out all draw on the extension sense of out. The prevalence of these patterns suggests that features which have been shown to be effective for the semantic classification of verbs may be useful for our task.</Paragraph> <Paragraph position="1"> We adopt simple syntactic &quot;slot&quot; features which have been successfully used in automatic semantic classification of verbs (Joanis and Stevenson, 2003). The features are motivated by the fact that semantic properties of a verb are reflected in the syntactic expression of the participants in the event the verb describes. The slot features encode the relative frequencies of the syntactic slots (subject, direct and indirect object, object of a preposition) that the arguments and adjuncts of a verb appear in. We calculate the slot features over three contexts: all uses of a verb; all uses of the verb in a VPC with the target particle (up in our experiments); and all uses of the verb in a VPC with any of a set of high-frequency particles (to capture its semantics when used in VPCs in general).</Paragraph>
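As an illustration, here is a minimal sketch of assembling the slot features from precomputed slot counts. The counts are hypothetical and the representation is simplified; the paper's own extraction uses the ExtractVerb software described in Section 4.2.

```python
from collections import Counter

# The four syntactic slots named above.
SLOTS = ["subject", "direct_object", "indirect_object", "object_of_prep"]

def slot_features(slot_counts: Counter) -> list[float]:
    """Relative frequency of each slot within one context."""
    total = sum(slot_counts[s] for s in SLOTS)
    return [slot_counts[s] / total if total else 0.0 for s in SLOTS]

# Hypothetical counts for one verb in the three contexts described above:
# the verb anywhere, the verb in a VPC with "up", and the verb in a VPC
# with any high-frequency particle.
contexts = [
    Counter(subject=420, direct_object=310, object_of_prep=90),
    Counter(subject=35, direct_object=40, object_of_prep=5),
    Counter(subject=60, direct_object=55, object_of_prep=12),
]

# Concatenating the three contexts gives 4 slots x 3 contexts = 12 features.
feature_vector = [x for ctx in contexts for x in slot_features(ctx)]
```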
<Paragraph position="3"> Two types of features are motivated by properties specific to the semantics and syntax of particles and VPCs. First, Wurmbrand (2000) notes that compositional particle verbs in German (a phenomenon closely related to English VPCs) allow the replacement of their particle with semantically similar particles. We extend this idea, hypothesizing that when a verb combines with a particle such as up in a particular sense, the pattern of usage of that verb in VPCs with all other particles may be indicative of the sense of the target particle (in this case up) when combined with that verb. To reflect this observation, we count the relative frequency of any occurrence of the verb in a VPC with each of a set of high-frequency particles.</Paragraph> <Paragraph position="4"> Second, one of the striking syntactic properties of VPCs is that they can often occur in either the joined configuration (2a) or the split configuration (2b): (2a) Drink up your milk! He walked out quickly. (2b) Drink your milk up! He walked quickly out.</Paragraph> <Paragraph position="5"> Bolinger (1971) notes that the joined construction may be more favoured when the sense of the particle is not literal. To encode this, we calculate the relative frequency of the verb co-occurring with the particle up with each possible number of intervening words between the verb and up, from zero up to a fixed maximum, reflecting varying degrees of verb-particle separation.</Paragraph> </Section>
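A sketch of how the separation features could be computed over tokenized text. The input format is assumed, and the maximum separation of five words is an illustrative choice, not the paper's value.

```python
def separation_features(sentences, verb, particle="up", max_sep=5):
    """Relative frequency of verb..particle co-occurrence at each separation.

    sentences: lists of lowercased tokens; max_sep is an illustrative bound.
    """
    counts = [0] * (max_sep + 1)
    total = 0
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok != verb:
                continue
            # Find the particle within max_sep words to the right of the verb.
            for j in range(i + 1, min(i + 2 + max_sep, len(tokens))):
                if tokens[j] == particle:
                    counts[j - i - 1] += 1
                    total += 1
                    break
    return [c / total if total else 0.0 for c in counts]

sents = [["drink", "up", "your", "milk"], ["drink", "your", "milk", "up"]]
# One joined use (separation 0) and one split use (separation 2):
print(separation_features(sents, "drink"))  # [0.5, 0.0, 0.5, 0.0, 0.0, 0.0]
```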
<Section position="2" start_page="45" end_page="46" type="sub_section"> <SectionTitle> 2.2 Word Co-occurrence Features </SectionTitle> <Paragraph position="0"> We also explore the use of general context features, in the form of word co-occurrence frequency vectors, which have been used in numerous approaches to determining the semantics of a target word. Note, however, that unlike the task of word sense disambiguation, which examines the context of a target word token to be disambiguated, here we look at aggregate contexts across all instances of a target VPC, in order to perform type classification.</Paragraph> <Paragraph position="1"> We adopt very simple word co-occurrence features (WCFs), calculated as the frequency of any (non-stoplist) word within a certain window to the left and right of the target. We noted above that the semantics of the target particle is related both to the semantics of the verb it co-occurs with, and to the occurrence of that verb across VPCs with different particles. Thus we calculate not only the WCFs of the target VPC (a given verb used with the particle up), but also the WCFs of the verb itself, and of the verb used in a VPC with any of the high-frequency particles. These WCFs give us a very general means for determining semantics, whose performance we can contrast with that of our linguistic features.</Paragraph> </Section> </Section>
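A sketch of the aggregate, type-level WCF computation, assuming tokenized sentences; is_target is a hypothetical predicate, not part of the paper, used to switch among the three targets described above (the VPC, the verb alone, and the verb in a VPC with any high-frequency particle).

```python
from collections import Counter

def wcf_vector(sentences, is_target, feature_words, window=5):
    """Aggregate co-occurrence frequencies over all instances of a target.

    is_target(tokens, i) -> True if position i is an instance of the target.
    """
    counts = Counter()
    total = 0
    for tokens in sentences:
        for i in range(len(tokens)):
            if not is_target(tokens, i):
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tokens[j]] += 1
                    total += 1
    return [counts[w] / total if total else 0.0 for w in feature_words]
```

The same routine serves all three variants: the caller passes a predicate that matches the verb followed by up, the verb alone, or the verb in any VPC.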
<Section position="5" start_page="46" end_page="47" type="metho"> <SectionTitle> 3 Particle Semantics and Sense Classes </SectionTitle> <Paragraph position="0"> We give some brief background on cognitive grammar and its relation to particle semantics, and then turn to the semantic analysis of up that we draw on as the basis for the sense classes in our experiments.</Paragraph> <Section position="1" start_page="46" end_page="46" type="sub_section"> <SectionTitle> 3.1 Cognitive Grammar and Schemas </SectionTitle> <Paragraph position="0"> Some linguistic studies consider many VPCs to be idiomatic, but do not give a detailed account of the semantic similarities between them (Bolinger, 1971; Fraser, 1976; Jackendoff, 2002). In contrast, work in cognitive linguistics has claimed that many so-called idiomatic expressions draw on the compositional contribution of (at least some of) their components (Lindner, 1981; Morgan, 1997; Hampe, 2000). In cognitive grammar (Langacker, 1987), non-spatial concepts are represented as spatial relations. Key terms from this framework are: Trajector (TR) The object which is conceptually foregrounded.</Paragraph> <Paragraph position="1"> Landmark (LM) The object against which the TR is foregrounded.</Paragraph> <Paragraph position="2"> Schema An abstract conceptualization of an experience. Here we focus on schemas depicting a TR, a LM and their relationship in both the initial configuration and the final configuration communicated by some expression.</Paragraph> <Paragraph position="3"> The semantic contribution of a particle in a VPC corresponds to a schema. For example, in sentence (3), the TR is the balloon and the LM is the ground the balloon is moving away from.</Paragraph> <Paragraph position="4"> (3) The balloon floated up.</Paragraph> <Paragraph position="5"> The schema describing the semantic contribution of the particle in the above sentence is shown in Figure 1, which illustrates the relationship between the TR and LM in the initial and final configurations.</Paragraph> </Section> <Section position="2" start_page="46" end_page="47" type="sub_section"> <SectionTitle> 3.2 The Senses of up </SectionTitle> <Paragraph position="0"> Lindner (1981) identifies a set of schemas for each of the particles up and out, and groups VPCs according to which schema is contributed by their particle. Here we describe the four senses of up identified by Lindner.</Paragraph> <Paragraph position="1"> 3.2.1 Vertical up (Vert-up) In this schema (shown above in Figure 1), the TR moves away from the LM in the direction of increase along a vertically oriented axis. This includes prototypical spatial upward movement such as that in sentence (3), as well as upward movement along an abstract vertical axis as in sentence (4).</Paragraph> <Paragraph position="2"> (4) The price of gas jumped up.</Paragraph> <Paragraph position="3"> In Lindner's analysis, this sense also includes extensions of upward movement where a vertical path or posture is still salient. Note that in some of these senses, the notion of verticality is metaphorical; the contribution of such senses to a VPC may not be considered compositional in a traditional analysis. Some of the most common sense extensions are given below, with a brief justification as to why verticality is still salient.</Paragraph> <Paragraph position="4"> Up as a path into perceptual field. Spatially high objects are generally easier to perceive. Examples: show up, spring up, whip up.</Paragraph> <Paragraph position="5"> Up as a path into mental field. Here up encodes a path for mental as opposed to physical objects. Examples: dream up, dredge up, think up.</Paragraph> <Paragraph position="6"> Up as a path into a state of activity. Activity is prototypically associated with an erect posture. Examples: get up, set up, start up.</Paragraph> <Paragraph position="7"> 3.2.2 Goal-oriented up (Goal-up) Here the TR approaches a goal LM; movement is not necessarily vertical (see Figure 2). Prototypical examples are walk up and march up. This category also includes extensions into the social domain (kiss up and suck up), as well as extensions into the domain of time (come up and move up), as in: (5a) The intern kissed up to his boss.</Paragraph> <Paragraph position="8"> (5b) The deadline is coming up quickly.</Paragraph> <Paragraph position="9"> 3.2.3 Completive up (Cmpl-up) Cmpl-up is a sub-sense of Goal-up in which the goal represents an action being done to completion. This sense shares its schema with Goal-up (Figure 2), but it is considered a separate sense since it corresponds to uses of up as an aspectual marker. Examples of Cmpl-up are: clean up, drink up, eat up, finish up and study up.</Paragraph> <Paragraph position="10"> 3.2.4 Reflexive up (Refl-up) Reflexive up is a sub-sense of Goal-up in which the sub-parts of the TR approach each other. The schema for Refl-up is shown in Figure 3; it is unique in that the TR and LM are the same object. Examples of Refl-up are: bottle up, connect up, couple up, curl up and roll up.</Paragraph> </Section> <Section position="3" start_page="47" end_page="47" type="sub_section"> <SectionTitle> 3.3 The Sense Classes for Our Study </SectionTitle> <Paragraph position="0"> Adopting a cognitive linguistic perspective, we assume that all uses of a particle make some compositional contribution of meaning to a VPC. In this work, we classify target VPCs according to which of the above senses of up is contributed to the expression. For example, the expressions jump up and pick up are designated as being in the class Vert-up, since up in these VPCs has the vertical sense, while clean up and drink up are designated as being in the class Cmpl-up, since up here has the completive sense. The relations among the senses of up can be shown in a &quot;schematic network&quot; (Langacker, 1987). Figure 4 shows a simplification of such a network in which more similar senses are connected by shorter edges. This type of analysis allows us to alter the granularity of our classification in a linguistically motivated fashion by combining closely related senses, and thus to explore the effect of different sense granularities on classification.</Paragraph> </Section> </Section> <Section position="6" start_page="47" end_page="49" type="metho"> <SectionTitle> 4 Materials and Methods </SectionTitle> <Section position="1" start_page="47" end_page="48" type="sub_section"> <SectionTitle> 4.1 Experimental Expressions </SectionTitle> <Paragraph position="0"> We created a list of English VPCs using up, based on a list of VPCs made available by McIntyre (2001) and a list of VPCs compiled by two human judges. The judges then filtered this list to include only VPCs which they both agreed were valid, resulting in a final list of 389 VPCs. From this list, training, verification and test sets of sixty VPCs each are randomly selected. Note that the expense of manually annotating the data (as described below) prevents us from using larger datasets in this initial investigation. The experimental sets are chosen such that each includes the same proportion of verbs across three frequency bands, so that the sets do not differ in the frequency distribution of their verbs. (We use the frequency of the verbs, rather than of the VPCs, since many of our features are based on the verb of the expression; moreover, VPC frequencies are approximate.) The verification data is used in exploring the feature space and selecting the final features to use in testing; the test set is held out for final testing of the classifiers. Each VPC in each dataset is annotated by the two human judges according to which of the four senses of up identified in Section 3.2 is contributed to the VPC. As noted in Section 1, VPCs may be ambiguous with respect to their particle sense.</Paragraph> <Paragraph position="1"> Since our task here is type classification, the judges identify the particle sense of a VPC in what they assess to be its predominant usage. The observed inter-annotator agreement is the same for each dataset, and unweighted observed kappa scores are computed for the training, verification and test sets respectively.</Paragraph> </Section> <Section position="2" start_page="48" end_page="48" type="sub_section"> <SectionTitle> 4.2 Calculation of the Features </SectionTitle> <Paragraph position="0"> We extract our features from the 100M word British National Corpus (BNC, Burnard, 2000).</Paragraph> <Paragraph position="1"> VPCs are identified using a simple heuristic based on part-of-speech tags, similar to one technique used by Baldwin (2005). A use of a verb is considered a VPC if it occurs with a particle (tagged AVP) within a six-word window to its right. Over a random sample of 113 VPCs thus extracted, we found 88% to be true VPCs, somewhat below the performance of Baldwin's (2005) best extraction method, indicating potential room for improvement.</Paragraph>
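The extraction heuristic is simple enough to sketch directly. Here the input is assumed to be a list of (word, tag) pairs with BNC-style CLAWS tags, where verb tags begin with V and AVP marks adverbial particles; the example sentence and its tags are illustrative.

```python
def find_vpcs(tagged_sentence, window=6):
    """Collect (verb, particle) pairs using the six-word-window heuristic.

    tagged_sentence: list of (word, tag) pairs with BNC-style tags.
    """
    vpcs = []
    for i, (word, tag) in enumerate(tagged_sentence):
        if not tag.startswith("V"):  # verb tags (VVB, VVD, ...) begin with V
            continue
        for j in range(i + 1, min(i + 1 + window, len(tagged_sentence))):
            if tagged_sentence[j][1] == "AVP":  # adverbial particle
                vpcs.append((word, tagged_sentence[j][0]))
                break
    return vpcs

# Illustrative tagged sentence: "Drink your milk up"
sent = [("drink", "VVB"), ("your", "DPS"), ("milk", "NN1"), ("up", "AVP")]
print(find_vpcs(sent))  # [('drink', 'up')]
```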
<Paragraph position="2"> The slot and particle features are calculated using a modified version of the ExtractVerb software provided by Joanis and Stevenson (2003), which runs over the BNC pre-processed using Abney's (1991) Cass chunker.</Paragraph> <Paragraph position="3"> To compute the word co-occurrence features (WCFs), we first determine the relative frequency of all words which occur within a five-word window to the left and right of any of the target expressions in the training data. From this list we eliminate the most frequent 1% of words as a stoplist, and then use the next N most frequent words as &quot;feature words&quot;. For each feature word, we then calculate its relative frequency of occurrence within the same five-word window of the target expressions. We use two values of N, creating a smaller and a larger WCF feature set.</Paragraph> </Section>
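A sketch of the feature-word selection, assuming the window counts over the training data have already been gathered; the value of N is an illustrative placeholder.

```python
from collections import Counter

def select_feature_words(window_counts: Counter, n_features=500):
    """Top 1% of words form the stoplist; the next N become feature words."""
    ranked = [w for w, _ in window_counts.most_common()]
    stop_n = max(1, len(ranked) // 100)  # most frequent 1% of words
    return ranked[stop_n:stop_n + n_features]
```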
<Section position="3" start_page="48" end_page="49" type="sub_section"> <SectionTitle> 4.3 Experimental Classes </SectionTitle> <Paragraph position="0"> Table 1 shows the distribution of senses in each dataset. Each of the training and verification sets has only one VPC corresponding to Goal-up. Recall that Goal-up shares a schema with Cmpl-up, and is therefore very close to it in meaning, as indicated spatially in Figure 4. We therefore merge Goal-up and Cmpl-up into a single sense, to provide more balanced classes.</Paragraph> <Paragraph position="1"> Since we want to see how our features perform on differing granularities of sense classes, we run each experiment as both a 3-way and a 2-way classification task. In the 3-way task, the sense classes correspond to the meanings Vert-up, Goal-up merged with Cmpl-up (as noted above), and Refl-up, as shown in Table 2. In the 2-way task, we further merge the class corresponding to Goal-up/Cmpl-up with that of Refl-up, as shown in Table 3. We choose to merge these classes because (as illustrated in Figure 4) Refl-up is a sub-sense of Goal-up, and moreover, all three of these senses contrast with Vert-up, in which increase along a vertical axis is the salient property. It is worth emphasizing that the 2-way task is not simply a classification between literal and non-literal up: Vert-up includes extensions of up in which the increase along a vertical axis is metaphorical.</Paragraph> </Section> <Section position="4" start_page="49" end_page="49" type="sub_section"> <SectionTitle> 4.4 Evaluation Metrics and Classifier Software </SectionTitle> <Paragraph position="0"> The variation in the frequency of the sense classes of up across the datasets makes the true distribution of the classes difficult to estimate. Furthermore, there is no obvious informed baseline for this task. We therefore assume that the true distribution of the classes is uniform, and use the chance accuracy 1/n as the baseline (where n is the number of classes; in our experiments, either 3 or 2). Accordingly, our measure of classification accuracy should weight each class evenly, so we report the average per-class accuracy, which gives equal weight to each class.</Paragraph> <Paragraph position="1"> For classification we use LIBSVM (Chang and Lin, 2001), an implementation of a support-vector machine. We set the input parameters, cost and gamma, using 10-fold cross-validation on the training data. In addition, we assign each class a weight inversely proportional to its size, to eliminate the effects of the variation in class size on the classifier. Note that our choice of accuracy measure and weighting of classes in the classifier is necessary given our assumption of a uniform random baseline. Since the accuracy values we report incorporate this weighting, these results cannot be compared to a baseline of always choosing the most frequent class.</Paragraph>
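Finally, a minimal sketch of the evaluation conventions just described: the 1/n chance baseline, average per-class accuracy, and class weights. Weighting each class inversely to its size is an assumption consistent with the stated goal of eliminating class-size effects, not necessarily the paper's exact formula, and the class labels below are illustrative.

```python
from collections import Counter

def average_per_class_accuracy(y_true, y_pred):
    """Mean of the per-class accuracies, weighting each class equally."""
    correct, total = Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += (t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)

def class_weights(y_train):
    """Per-class weights inversely proportional to class size (assumed form)."""
    counts = Counter(y_train)
    return {c: 1.0 / k for c, k in counts.items()}

def chance_baseline(n_classes):
    """Uniform-chance accuracy: 1/3 for the 3-way task, 1/2 for the 2-way."""
    return 1.0 / n_classes

y_true = ["Vert", "Vert", "Cmpl", "Refl"]
y_pred = ["Vert", "Cmpl", "Cmpl", "Refl"]
print(average_per_class_accuracy(y_true, y_pred))  # (1/2 + 1 + 1) / 3
```
</Section> </Section> </Paper>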