<?xml version="1.0" standalone="yes"?>
<Paper uid="C90-2067">
  <Title>Word Sense Disambiguation with Very Large Neural Networks Extracted from Machine Readable Dictionaries</Title>
  <Section position="4" start_page="390" end_page="393" type="metho">
    <SectionTitle>
3.2. Results
</SectionTitle>
    <Paragraph position="0"> The network finds the correct sense in cases where Lesk's strategy succeeds. For example, if the input consists of pen and sheep, pen 2.1 and sheep 1 are correctly activated. More interestingly, the network selects &amp;quot; the appropriate senses in cases where Lesk's strategy fails.</Paragraph>
    <Paragraph position="1"> Figures 3 and 4 show the state of the network after being run with pen and goat, and pen and page, respectively. The figures represent only the most activated part of each network after 100 cycles. Over the course of the run, the network reinforces only a small cluster of the most semantically relevant words and senses, and filters out tile rest of the thousands of nodes. The correct sense for each word in each context (pen 2.1 with goat 1, and pen 1.1 withpage 1.1) is the only one activated at the end of the run.</Paragraph>
    <Paragraph position="2"> This model solves the context-setting problem mentioned above without any use of microfeatures. Sense 1.1 of pen would also be activated if it appeared in the context of a large number of other words--e.g., book, ink, inkwell, pencil, paper, write, draw, sketch, etc.--which have a similar semantic relationship to pen. For example, figure 5 shows the state of the network after being run with pen and book. It is apparent that the subset of nodes activated is similar to those which were activated by page.</Paragraph>
    <Paragraph position="3">  The examples given here utilize only two words as input, in order to show clearly the behavior of the network. In fact, the performance of the network improves with additional input, since additional context can only contribute more to the disambiguation process. For example, given the sentence The young page put the sheep in the pen, the network correctly chooses the correct senses of page (2.3: &amp;quot;a youth in personal service&amp;quot;), sheep (1), and pen (2.1). This example is particularly difficult, because page and sheep compete against each other to activate different senses of pen, as demonstrated in the examples above. However, the word young reinforces sense 2.3 of page, which enables sheep to win the struggle. Inter-sentential context could be used as well, by retaining the most activated nodes within the network during subsequent runs.</Paragraph>
    <Paragraph position="4"> By running various experiments on VLNNs, we have discovered that when the simple models proposed so far are scaled up, several improvements are necessary. We have, for instance, discovered that &amp;quot;gang effects&amp;quot; appear due to extreme imbalance among words having few senses and hence few connections, and words containing up to 80 senses and several hundred connections, and that therefore dampening is required. tn addition, we have found that is is necessary to treat a word node and its sense nodes as a complex, ecological unit rather than as separate entities. In our model, word nodes corttrol the behavior of sense nodes by means of a differential neuron that prevents, for example, a sense node from becoming more activated than its master word node. Our experimentation with VLNNs has also shed light on the role of and need for various other parameters, such as thresholds, decay, etc.</Paragraph>
  </Section>
class="xml-element"></Paper>