<?xml version="1.0" standalone="yes"?> <Paper uid="A97-1025"> <Title>Contextual Spelling Correction Using Latent Semantic Analysis</Title> <Section position="4" start_page="0" end_page="166" type="intro"> <SectionTitle> 2 Related Work </SectionTitle> <Paragraph position="0"> Latent Semantic Analysis has been applied to the problem of spelling correction previously (Kukich, 1992b). However, this work focused on detecting misspelled words, not contextual spelling errors. The approach taken used letter n-grams to build the semantic space. In this work, we use the words directly. Yarowsky (1994) notes that conceptual spelling correction is part of a closely related class of problems which include word sense disambiguation, word choice selection in machine translation, and accent and capitalization restoration. This class of problems has been attacked by many others. A number of feature-based methods have been tried, including Bayesian classifiers (Gale, Church, and Yarowsky, 1992; Golding, 1995), decision lists (Yarowsky, 1994), and knowledge-based approaches (McRoy, 1992). Recently, Golding and Schabes (1996) described a system, Tribayes, that combines a trigram model of the words' parts of speech with a Bayesian classifier. The trigram component of the system is used to make decisions for those confusion sets that contain words with different parts of speech. The Bayesian component is used to predict the correct word from among same part-of-speech words.</Paragraph> <Paragraph position="1"> Golding and Schabes selected 18 confusion sets from a list of commonly confused words plus a few that represent typographical errors. They trained their system using a random 80% of the Brown corpus (Kučera and Francis, 1967). The remaining 20% of the corpus was used to test how well the system performed. We have chosen to use the same 18 confusion sets and the Brown corpus in order to compare</Paragraph> </Section> </Paper>