<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-0605">
  <Title>AUTOMATIC LEXICON ENHANCEMENT BY MEANS OF CORPUS TAGGING</Title>
  <Section position="8" start_page="31" end_page="31" type="evalu">
    <SectionTitle>
5.3 Results
</SectionTitle>
    <Paragraph position="0"> We manually verified the 1000 most frequent OOV words of each filtered lexicon. The results are presented as follows: in Table 1, the column &quot;Correct&quot; gives the percentage of OOV words for which all labels were correct; the column &quot;Wrong&quot; gives the percentage of words labelled with at least one incorrect tag.</Paragraph>
    <Paragraph position="1"> Table 2 details, for the common-words lexicon, the results obtained on the correct words. The column &quot;All classes&quot; gives the percentage of correct words that had all of their possible syntactic categories in the lexicon. The column &quot;Missing classes&quot; gives the percentage of correct words that could have received more syntactic categories than those stored in the lexicon.</Paragraph>
    <Paragraph position="2"> Table 2
                 | All classes | Missing classes
    Common-words | 79%         | 21%
    These results show that the criteria used to filter the OOV lexicons allow us to produce reliable lexicons (only 4% of the OOV common-words contained label errors). By keeping the 1000 most frequent words of each lexicon, we reduced the lack of coverage of our general lexicon over the whole test corpus by 20%.</Paragraph>
  </Section>
</Paper>