<?xml version="1.0" standalone="yes"?>
<Paper uid="P92-1021">
<Title>LATTICE-BASED WORD IDENTIFICATION IN CLARE</Title>
<Section position="7" start_page="162" end_page="162" type="evalu">
<SectionTitle> 5 AN EVALUATION </SectionTitle>
<Paragraph position="0"> To assess the usefulness of syntactico-semantic constraints in CLARE's spelling correction, the following experiment, intended to simulate performance (typographic) errors, was carried out. Five hundred sentences, of up to ten words in length, falling within CLARE's current core lexical (1600 root forms) and grammatical coverage, were taken at random from the LOB corpus. These sentences were passed, character by character, through a channel which transmitted a character without alteration with probability 0.99, and with probability 0.01 introduced a simple error. The relative probabilities of the four different kinds of error were deduced from Table X of Pollock and Zamora, 1984; where a new character had to be inserted or substituted, it was selected at random from the original sentence set. This process produced a total of 102 sentences that differed from their originals. The average length was 6.46 words, and there were 123 corrupted tokens in all, some containing more than one simple error.</Paragraph>
<Paragraph position="1"> Because longer sentences were more likely to be changed, the average length of a changed sentence was some 15% more than that of an original one.</Paragraph>
<Paragraph position="2"> The corrupted sentence set was then processed by CLARE with only the spelling correction recovery method in force and with no user intervention. Up to two simple errors were considered per token. No domain-specific or context-dependent knowledge was used.</Paragraph>
<Paragraph position="3"> Of the 123 corrupted tokens, ten were corrupted into other known words, and so no correction was attempted. Parsing failed in nine of these cases; in the tenth, the corrupted word made as much sense as the original out of discourse context. In three further cases, the original token was not suggested as a correction; one was a special form, and for the other two, alternative corrections involved fewer simple errors. The corrections for two other tokens were not used because a corruption into a known word elsewhere in the same sentence caused parsing to fail.</Paragraph>
<Paragraph position="4"> Only one correction (the right one) was suggested for 59 of the remaining 108 tokens.</Paragraph>
<Paragraph position="5"> Multiple-token correction, involving the manipulation of space characters, took place in 24 of these cases.</Paragraph>
<Paragraph position="6"> This left 49 tokens for which more than one correction was suggested, requiring syntactic and semantic processing for further disambiguation. The average number of corrections suggested for these 49 was 4.57. However, only an average of 1.69 candidates (including, because of the way the corpus was selected, all the right ones) appeared in QLFs satisfying selectional restrictions; thus only 19% of the wrong candidates found their way into any QLF. If, in the absence of frequency information, we take all candidates as equally likely, then syntactic and semantic processing reduced the average entropy from 1.92 to 0.54, removing 72% of the uncertainty (see Carter, 1987, for a discussion of why entropy is the best measure to use in contexts like this).</Paragraph>
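For concreteness, a worked version of these entropy figures, under the stated assumption that all suggested candidates are equally likely: a token with $n$ candidate corrections then carries $H = \log_2 n$ bits of uncertainty, and averaging the per-token values over the 49 multiple-candidate tokens gives

\[
\bar{H}_{\mathrm{before}} \approx 1.92 \ \text{bits}, \qquad
\bar{H}_{\mathrm{after}} \approx 0.54 \ \text{bits}, \qquad
1 - \frac{0.54}{1.92} \approx 0.72,
\]

i.e. roughly 72% of the uncertainty is removed. Because per-token entropies are averaged and $\log$ is concave, both figures lie below the logarithms of the mean candidate counts ($\log_2 4.57 \approx 2.19$ and $\log_2 1.69 \approx 0.76$). The 19% figure is consistent with the same averages: on average $(1.69 - 1)/(4.57 - 1) \approx 0.19$ of the wrong candidates per token survive into some QLF.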
<Paragraph position="7"> When many QLFs are produced for a sentence, CLARE orders them according to a set of scoring functions encoding syntactic and semantic preferences. For the 49 multiple-candidate tokens, removing all but the best-scoring QLF(s) eliminated 7 (21%) of the 34 wrong candidates surviving to the QLF stage; however, it also eliminated 5 (10%) of the right candidates. It is expected that future development of the scoring functions will further improve these figures, which are summarized in Table 1.</Paragraph>
<Paragraph position="8"> The times taken to parse lattices containing multiple spelling candidates reflect the characteristics of CLARE's parser, which uses a backtracking, left-corner algorithm and stores well-formed constituents so as to avoid repeating work where possible. In general, when a problem token appears late in the sentence and/or when several candidate corrections are syntactically plausible, the lattice approach is several times faster than processing the alternative strings separately (which tends to be very time-consuming). When the problem token occurs early and has only one plausible correction, the two methods are about the same speed.</Paragraph>
<Paragraph position="9"> For example, in one case, a corrupted token with 13 candidate corrections occurred in sixth position in an eight-word sentence.</Paragraph>
<Paragraph position="10"> Parsing the resulting lattice was three times faster than parsing each alternative full string separately. The lattice representation avoided repetition of work on the first six words. However, in another case, where the corrupted token occurred second in an eight-word sentence and had six candidates, only one of which was syntactically plausible, the lattice representation was no faster, as the incorrect candidates in five of the strings led to the parse being abandoned early.</Paragraph>
<Paragraph position="11"> An analogous experiment was carried out with 500 sentences from the same corpus which CLARE could not parse. 131 of the sentences, with average length 7.39 words, suffered the introduction of errors. Of these, only seven (5%) received a parse. Four of the seven received no sortally valid QLFs, leaving only three (2%) &quot;false positives&quot;. This low figure is consistent with the results from the originally parseable sentence set; nine out of the ten corruptions into known words in that experiment led to parse failure, and only 19% of wrong suggested candidates led to a sortally valid QLF. If, as those figures suggest, the replacement of one word by another only rarely maps one sentence inside coverage to another, then a corresponding replacement on a sentence outside coverage should yield something within coverage even more rarely, and this does appear to be the case.</Paragraph>
</Section>
</Paper>