<?xml version="1.0" standalone="yes"?>
<Paper uid="E06-1034">
  <Title>From detecting errors to automatically correcting them</Title>
  <Section position="8" start_page="270" end_page="271" type="concl">
    <SectionTitle>
6 Summary and Outlook
</SectionTitle>
    <Paragraph position="0"> We have demonstrated the effectiveness of using POS tagging technology to correct a corpus, once an error detection method has identified potentially erroneous corpus positions. We first showed that using a tagger as is provides moderate results, but adapting a tagger to account for problematic tag distinctions in the data (i.e., using complex ambiguity tags) performs much better and reduces the true error rate of a corpus. The distinctions in the tagging model have more of an impact on the precision of correction than the underlying tagging algorithm.</Paragraph>
    <Paragraph position="1"> Despite the gain in accuracy, we pointed out that there are still several residual problems which are difficult for any tagging system. Future work will go into automatically sorting the tags so that the difficult disambiguation decisions can be dealt with differently from the easily disambiguated corpus positions. Additionally, we will want to test the method on a variety of corpora and tagging schemes and gauge the impact of correction on POS tagger training and evaluation. We hypothesize that this method will work for any tagset with potentially confusing distinctions between tags, but this is yet to be tested.</Paragraph>
    <Paragraph position="2"> The method of adapting a tagging model by using complex ambiguity tags originated from an understanding that the POS tagging process is crucially dependent upon the tagset distinctions.</Paragraph>
    <Paragraph position="3"> Based on this, the correction work described in this paper can be extended to the general task of POS tagging, as a tagger using complex ambiguity classes is attempting to tackle the difficult distinctions in a corpus. To pursue this line of research, work has to go into defining ambiguity classes for all words in the corpus, instead of focusing on words involved in variations.</Paragraph>
    <Paragraph position="4"> Acknowledgments: I would like to thank Detmar Meurers for helpful discussion, Stephanie Dickinson for her statistical assistance, and the three anonymous reviewers for their comments.</Paragraph>
  </Section>
</Paper>