<?xml version="1.0" standalone="yes"?>
<Paper uid="H92-1022"> <Title>A SIMPLE RULE-BASED PART OF SPEECH TAGGER</Title>
<Section position="4" start_page="113" end_page="114" type="evalu"> <SectionTitle> 3. RESULTS </SectionTitle>
<Paragraph position="0"> The tagger was tested on 5% of the Brown Corpus, including sections from every genre. First, the test corpus was tagged by the simple lexical tagger. Next, each of the patches was applied to the corpus in turn. Below is a graph showing the improvement in accuracy from applying patches. It is significant that with only 71 patches, an error rate of 5.1% was obtained. (We ran the experiment three times, each time dividing the corpus into training, patch, and test sets in a different way; all three runs gave an error rate of 5%.) Of the 71 patches, 66 resulted in a reduction in the number of errors in the test corpus, 3 resulted in no net change, and 2 resulted in a higher number of errors. Almost all patches which were effective on the training corpus were also effective on the test corpus.</Paragraph>
<Paragraph position="1"> Unfortunately, it is difficult to compare our results with other published results. In [12], an error rate of 3-4% is quoted on one domain, Wall Street Journal articles, and 5.6% on another domain, texts on terrorism in Latin American countries. However, both the domains and the tag set differ from ours. [1] reports an accuracy of "95-99% correct, depending on the definition of correct". We implemented a version of the algorithm described in [1] which did not make use of a dictionary to extend its lexical knowledge. When trained and tested on the same samples used in our experiment, we found the error rate to be about 4.5%. [3] quotes a 4% error rate when testing and training on the same text. [6] reports an accuracy of 96-97%. Their probabilistic tagger has been augmented with a handcrafted procedure to pretag problematic "idioms". This procedure, which requires that a list of idioms be laboriously created by hand, contributes 3% toward the accuracy of their tagger, according to [3]. The idiom list would have to be rewritten if one wished to use this tagger with a different tag set or a different corpus. It is interesting to note that the information contained in the idiom list can be acquired automatically by the rule-based tagger.</Paragraph>
<Paragraph position="2"> For example, their tagger had difficulty tagging "as old as". An explicit rule was written to pretag "as old as" with the proper tags. According to the tagging scheme of the Brown Corpus, the first "as" should be tagged as a qualifier and the second as a subordinating conjunction. In the rule-based tagger, the most common tag for "as" is subordinating conjunction, so initially the second "as" is tagged correctly and the first "as" is tagged incorrectly. To remedy this, the system acquires the patch: if the current word is tagged as a subordinating conjunction, and so is the word two positions ahead, then change the tag of the current word to qualifier. (This was one of the 71 patches acquired by the rule-based tagger.) The rule-based tagger has automatically learned how to properly tag this "idiom." Regardless of the precise rankings of the various taggers, we have demonstrated that a simple rule-based tagger with very few rules performs on par with stochastic taggers. It should be mentioned that our results were obtained without the use of a dictionary.</Paragraph>
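To make the acquired patch concrete, the following is a minimal sketch of how such a contextual rule could be applied to a tagged sentence. It is illustrative only: the function and variable names are hypothetical, not the tagger's actual implementation; the Brown tags CS (subordinating conjunction) and QL (qualifier) are assumed; and the permissible-tag check anticipates the training-corpus constraint described in the next paragraph.

# Illustrative sketch only (hypothetical names, not the authors' code).
# Brown tags assumed: CS = subordinating conjunction, QL = qualifier.

def apply_qualifier_patch(tagged, seen_tags):
    """Patch: if the current word is tagged CS and the word two
    positions ahead is also tagged CS, retag the current word QL.

    tagged    -- list of (word, tag) pairs
    seen_tags -- word -> set of tags the word had in the training
                 corpus; a tag is only switched if the new tag was
                 seen for that word (the constraint described below)
    """
    patched = list(tagged)
    for i in range(len(patched) - 2):
        word, tag = patched[i]
        _, tag_two_ahead = patched[i + 2]
        if tag == 'CS' and tag_two_ahead == 'CS' and 'QL' in seen_tags.get(word, set()):
            patched[i] = (word, 'QL')
    return patched

# "as old as": both occurrences of "as" start out as CS, the most
# common tag for "as"; the patch then retags the first one as QL.
seen = {'as': {'CS', 'QL'}, 'old': {'JJ'}}
print(apply_qualifier_patch([('as', 'CS'), ('old', 'JJ'), ('as', 'CS')], seen))
# -> [('as', 'QL'), ('old', 'JJ'), ('as', 'CS')]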
<Paragraph position="3"> Incorporating a large dictionary into the system would improve performance in two ways. First, it would increase tagging accuracy for words not seen in the training corpus, since part of speech information for some of these words can be obtained from the dictionary. Second, it would increase the error reduction resulting from applying patches. When a patch indicates that a word should be tagged with tag_b instead of tag_a, the tag is only switched if the word was tagged with tag_b somewhere in the training corpus. Using a dictionary would provide more accurate knowledge about the set of permissible part of speech tags for a particular word.</Paragraph>
<Paragraph position="4"> We plan to incorporate a dictionary into the tagger in the future.</Paragraph>
<Paragraph position="5"> As an estimate of the improvement possible by using a dictionary, we ran two experiments in which all words were known to the system. First, the Brown Corpus was divided into a training corpus of about one million words, a patch corpus of about 65,000 words, and a test corpus of about 65,000 words. Patches were acquired as described above. When tested on the test corpus, with lexical information derived solely from the training corpus, the error rate was 5%. Next, the same patches were used, but lexical information was gathered from the entire Brown Corpus. This reduced the error rate to 4.1%. Finally, the same experiment was run with lexical information gathered solely from the test corpus. This resulted in a 3.5% error rate. Note that the patches used in the two experiments with no unknown words were not the optimal patches for these tests, since they were derived from a corpus that contained unknown words.</Paragraph> </Section> </Paper>
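As a closing illustration of the permissible-tag check discussed in this section, here is a minimal sketch, with hypothetical names rather than the authors' code, of how merging dictionary entries into a training-corpus lexicon widens the set of tags a patch is allowed to switch to.

# Illustrative sketch only (hypothetical names, not the authors' code).
from collections import defaultdict

def build_lexicon(tagged_corpus):
    """Map each word to the set of tags it was seen with."""
    lexicon = defaultdict(set)
    for word, tag in tagged_corpus:
        lexicon[word].add(tag)
    return lexicon

def switch_tag(word, tag_a, tag_b, lexicon):
    """Carry out a patch's change from tag_a to tag_b only if tag_b
    is a permissible tag for the word according to the lexicon."""
    return tag_b if tag_b in lexicon.get(word, set()) else tag_a

# With a lexicon built from the training corpus alone, a word never
# seen as a verb cannot be retagged as one; merging in dictionary
# entries makes the switch possible.
lexicon = build_lexicon([('fly', 'NN')])
print(switch_tag('fly', 'NN', 'VB', lexicon))   # -> 'NN'
dictionary = {'fly': {'NN', 'VB'}}              # hypothetical entries
for word, tags in dictionary.items():
    lexicon[word] |= tags
print(switch_tag('fly', 'NN', 'VB', lexicon))   # -> 'VB'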