<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-3604">
  <Title>All-word prediction as the ultimate confusable disambiguation</Title>
  <Section position="8" start_page="31" end_page="31" type="concl">
    <SectionTitle>
6 Discussion
</SectionTitle>
    <Paragraph position="0"> In this article we explored the scaling abilities of IGTREE, a simple decision-tree algorithm with favorable asymptotic complexities with respect to multi-label classification tasks. IGTREE is applied to word prediction, a task for which virtually unlimited amounts of training examples are available, with very large amounts of predictable class labels; and confusable disambiguation, a specialization of word prediction focusing on small sets of confusable words. Best results are 42.2% correctly predicted tokens (words and punctuation markers) when training and testing on data from the Reuters newswire corpus; and confusable disambiguation accuracies of well above 90%. Memory requirements and speeds were shown to be realistic.</Paragraph>
    <Paragraph position="1"> Analysing the results of the learning curve experiments with increasing amounts of training examples, we observe that better word prediction accuracy can be attained simply by adding more training examples, and that the progress in accuracy proceeds at a log-linear rate. The best rate we observed was an 8% increase in performance every tenfold multiplication of the number of training examples, when training and testing on the same data.</Paragraph>
    <Paragraph position="2"> Despite the fact that all-words prediction lags behind in disambiguating confusibles, in comparison with classifiers that are focused on disambiguating single sets of confusibles, we see that this lag is only relative to the amount of training material available.</Paragraph>
  </Section>
class="xml-element"></Paper>