<?xml version="1.0" standalone="yes"?>
<Paper uid="C92-2070">
  <Title>Walker, Donald (1987), &quot;Knowledge Resource Tools for Accessing</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
2. Proposed Method
</SectionTitle>
    <Paragraph position="0"> The strategy proposed here is based on the following three observations: 1) Different conceptual classes of words, such as ANIMALS or MACHINES, tend to appear in recognizably different contexts. 2) Different word senses tend to belong to different conceptual classes (crane can be an ANIMAL or a MACHINE). 3) If one can build a context discriminator for the conceptual classes, one has effectively built a context discriminator for the word senses that are members of those classes. Furthermore, the context indicators for a Roget category (e.g. gear, piston and engine for the category TOOLS/MACHINERY) will also tend to be context indicators for the members of that category (such as the machinery sense of crane).</Paragraph>
    <Paragraph position="1"> ACTES DE COLING-92, NANTES, 23-28 AOÛT 1992 454 PROC. OF COLING-92, NANTES, AUG. 23-28, 1992 We attempt to identify, weight and utilize these indicative words as follows. For each of the 1042 Roget Categories: 1. Collect contexts which are representative of the Roget category; 2. Identify salient words in the collective context and determine weights for each word; and 3. Use the resulting weights to predict the appropriate category for a polysemous word occurring in novel text.</Paragraph>
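    <Paragraph><![CDATA[The three steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the method as implemented in the study: this section does not state how the weights are computed, so the log-ratio weight, the additive smoothing, the function names (salient_weights, predict_category) and the toy contexts are all assumptions made for the example.

```python
import math
from collections import Counter


def salient_weights(category_contexts, corpus_counts, corpus_total, smoothing=1.0):
    """Weight each word seen in a category's contexts.

    ASSUMPTION: weight = log(P(word | category contexts) / P(word | corpus)),
    with additive smoothing. The formula is illustrative, not taken from the text.
    """
    cat_counts = Counter(word for ctx in category_contexts for word in ctx)
    cat_total = sum(cat_counts.values())
    vocab_size = len(corpus_counts)
    weights = {}
    for word, count in cat_counts.items():
        p_cat = (count + smoothing) / (cat_total + smoothing * vocab_size)
        p_all = (corpus_counts.get(word, 0) + smoothing) / (corpus_total + smoothing * vocab_size)
        weights[word] = math.log(p_cat / p_all)
    return weights


def predict_category(context, weights_by_category):
    """Return the category whose word weights sum highest over the context."""
    scores = {
        category: sum(weights.get(token, 0.0) for token in context)
        for category, weights in weights_by_category.items()
    }
    return max(scores, key=scores.get)


# Hypothetical toy contexts standing in for Step 1 output.
tool_contexts = [["engine", "piston", "gear", "lifts"], ["gear", "engine", "steel"]]
animal_contexts = [["feathers", "nest", "water", "bird"], ["bird", "water", "wings"]]

corpus_counts = Counter(w for ctx in tool_contexts + animal_contexts for w in ctx)
corpus_total = sum(corpus_counts.values())

weights = {
    "TOOLS/MACHINERY": salient_weights(tool_contexts, corpus_counts, corpus_total),
    "ANIMAL": salient_weights(animal_contexts, corpus_counts, corpus_total),
}

# Step 3: classify an occurrence of "crane" by its surrounding words.
machine_sense = predict_category(["the", "crane", "engine", "gear"], weights)
animal_sense = predict_category(["crane", "nest", "water"], weights)
```

In this toy setup, a context containing engine and gear favours TOOLS/MACHINERY, while one containing nest and water favours ANIMAL, mirroring the crane example in the text.]]></Paragraph>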
    <Paragraph position="2"> 2.1 Step 1: Collect Contexts which are Representative of the Roget Category. The goal of this step is to collect a set of words that are typically found in the context of a Roget category. To do this, we extract concordances of 100 surrounding words for each occurrence of each member of the category in the corpus. Below is a sample set of partial concordances for words in the category TOOLS/MACHINERY (348). The complete set contains 30,924 lines, selected from the particular training corpus used in this study, the 10 million word, June 1991 electronic version of Grolier's Encyclopedia.</Paragraph>
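    <Paragraph><![CDATA[The concordance extraction described above might look roughly like the following Python sketch. The function name collect_contexts, the even split of the 100 surrounding words into 50 on each side, the exclusion of the target word itself, and the toy sentence are all assumptions made for illustration; the text specifies only the window size.

```python
def collect_contexts(tokens, category_members, window=100):
    """Return one context (a list of tokens) per occurrence of a category member.

    ASSUMPTION: the window is split evenly (window // 2 tokens on each side)
    and the target word is left out of its own context.
    """
    half = window // 2
    members = {member.lower() for member in category_members}
    contexts = []
    for i, token in enumerate(tokens):
        if token.lower() in members:
            left = tokens[max(0, i - half):i]
            right = tokens[i + 1:i + 1 + half]
            contexts.append(left + right)
    return contexts


# Hypothetical toy corpus and category members; a window of 4 keeps the output readable.
tokens = "the crane lifted the steel beam while the drill cut the rock".split()
contexts = collect_contexts(tokens, {"crane", "drill"}, window=4)
```

Each occurrence of a category member yields its own context, so a frequent member contributes more lines to the collective context than a rare one.]]></Paragraph>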
  </Section>
</Paper>