<?xml version="1.0" standalone="yes"?>
<Paper uid="W00-1209">
<Title>The Research of Word Sense Disambiguation Method Based on Co-occurrence Frequency of Hownet*</Title>
<Section position="6" start_page="63" end_page="63" type="concl">
<SectionTitle> 4. Experiment And Analysis </SectionTitle>
<Paragraph position="0"> We ran the experiment on a corpus of 10,000 characters from People's Daily.</Paragraph>
<Paragraph position="1"> First, the corpus is segmented, and then the sememe co-occurrence frequency database and the mutual-information database are created. The mutual-information database contains 709,496 data items, each corresponding to a different sememe pair. To speed up processing, the mutual-information database was sorted and indexed according to the first two bytes of each sememe pair. Finally, some polysemous words were disambiguated. Here are two examples:

Table 3: Two examples disambiguated using the sememe co-occurrence frequency database
Definition of the word &quot;[illegible]&quot; | Score of sense items and the context of the word in example 1 | Score of sense items and the context of the word in example 2
[illegible] | 14.459068 | 8.659968
[illegible] | 9.817648 | 10.817648
[illegible] | 7.415986 | 12.415986
[illegible] | -0.134779 | -0.134779
[illegible] | -0.818518 | -0.818518
[illegible] | 14.459068 | 12.415986

We use the following equation to assess the accuracy ratio of disambiguation:

\[ \text{accuracy ratio} = \frac{\text{number of correctly tagged examples}}{\text{total number of examples in the testing set}} \qquad (11) \]

The experimental result is shown in Table 4.

Table 4: The experimental result
Test | Total number of testing examples | Number of correctly tagged examples | Accuracy ratio
Close test | 100 | 75 | 75%
Open test | 100 | 71 | 71%

The disambiguation method introduced above has the following characteristics: (1) The problem of data sparseness is solved to a large degree.</Paragraph>
<Paragraph position="2"> (2) This disambiguation method avoids the laborious hand tagging of a training corpus. (3) This method can easily be applied to other kinds of corpora.</Paragraph>
</Section>
</Paper>