<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-0824">
  <Title>Multi-Component Word Sense Disambiguation</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
2 Features
</SectionTitle>
    <Paragraph position="0"> We used a set of features similar to the one described and evaluated extensively in (Yoong and Hwee, 2002). The POS-annotated sentence &quot;A-DT newspaper-NN and-CC now-RB a-DT bank-NN have-AUX since-RB taken-VBN over-RB&quot; serves as an example to illustrate them. The word to disambiguate is bank (or activate for feature (7)).</Paragraph>
    <Paragraph position="1">  1. part of speech of neighboring words</Paragraph>
    <Paragraph position="3"/>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
Association for Computational Linguistics
</SectionTitle>
      <Paragraph position="0"> SENSEVAL-3: Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text, Barcelona, Spain, July 2004</Paragraph>
      <Paragraph position="2"> 4. syntactically governing elements under a phrase
 5. syntactically governed elements under a phrase
 6. coordinates
 7. features for verbs, e.g., &quot;... activate the pressure&quot;: number of arguments</Paragraph>
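      <Paragraph position="3"> As a rough illustration of feature (1), the annotated example sentence can be split into (word, tag) pairs and the POS tags around the target word read off by relative position. This is a minimal sketch, not the system's actual code: the window of three words on each side, the feature names of the form P-1 and P+1, and both function names are assumptions made here for illustration.

```python
# Illustrative sketch only: the window size and the feature names are
# assumptions, not taken from the paper.

SENTENCE = ("A-DT newspaper-NN and-CC now-RB a-DT bank-NN "
            "have-AUX since-RB taken-VBN over-RB")


def parse_tagged(sentence):
    """Split 'word-TAG' tokens into (word, tag) pairs."""
    pairs = []
    for token in sentence.split():
        # rpartition splits on the last hyphen, so hyphenated words survive.
        word, _, tag = token.rpartition("-")
        pairs.append((word, tag))
    return pairs


def neighbor_pos_features(pairs, target_index, window=3):
    """Map each relative offset inside the window to the POS tag found there."""
    features = {}
    for offset in range(-window, window + 1):
        position = target_index + offset
        # Skip offsets that fall outside the sentence.
        if position in range(len(pairs)):
            features["P%+d" % offset] = pairs[position][1]
    return features


pairs = parse_tagged(SENTENCE)
target = [word for word, _ in pairs].index("bank")
print(neighbor_pos_features(pairs, target))
```

 For the target word bank, the resulting dictionary holds one entry per position in the window, including the tag of the target itself at offset zero.</Paragraph>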
      <Paragraph position="4"> The same features were extracted from the given test and training data, and the additional dataset.</Paragraph>
      <Paragraph position="5"> POS and other syntactic features were extracted from parse trees. The training and test data, as well as the WordNet glosses, were parsed with Charniak's parser (Charniak, 2000). Open-class words were morphologically simplified with the &quot;morph&quot; function from the WordNet library &quot;wn.h&quot;. When it was not possible to identify the noun or verb in a gloss, we extracted only a limited set of features: WS, WC, and morphological features. Each gloss provides one training instance per synset. Overall, we found approximately 200,000 features.</Paragraph>
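      <Paragraph position="6"> The bookkeeping described above, with one extra training instance per synset gloss and a single feature inventory shared by all data sources, might be sketched as follows. The instance layout, the sense labels, the feature strings, and both function names are hypothetical; the paper does not specify its data structures, and the toy data stands in for features that would really come from parsed text.

```python
# Illustrative sketch only: the instance layout, sense labels, and feature
# strings are hypothetical; the paper does not specify its data structures.


def build_feature_index(instances):
    """Assign a stable integer id to each distinct feature string."""
    index = {}
    for features in instances:
        for feature in features:
            # A new feature gets the next free id; a known one is left alone.
            index.setdefault(feature, len(index))
    return index


def gloss_instances(glosses):
    """Turn each (synset id, gloss features) entry into one labeled instance."""
    return [(synset_id, list(features))
            for synset_id, features in glosses.items()]


# Toy data; real features would come from parsed training, test, and gloss text.
training = [("bank%1", ["P-1=DT", "P+1=AUX", "WS=newspaper"])]
glosses = {"bank%1": ["WS=financial", "WS=institution"],
           "bank%2": ["WS=sloping", "WS=land"]}

extra = gloss_instances(glosses)
index = build_feature_index(features for _, features in training + extra)
print(len(index), "distinct features")
```

 Counting the entries of such an index over all instances is one way a total like the roughly 200,000 features reported here could be obtained.</Paragraph>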
    </Section>
  </Section>
</Paper>