<?xml version="1.0" standalone="yes"?>
<Paper uid="E99-1018">
<Title>POS Disambiguation and Unknown Word Guessing with Decision Trees</Title>
<Section position="8" start_page="2638" end_page="2638" type="concl">
<SectionTitle>6 Discussion and Future Goals</SectionTitle>
<Paragraph position="0">We have shown a uniform approach to the dual problem of POS disambiguation and unknown word guessing as it appears in M. Greek, reinforcing the argument that &quot;machine-learning researchers should become more interested in NLP as an application area&quot; (Daelemans et al., 1997). As a general remark, we argue that the linguistic approach performs well when the knowledge or the behavior of a language can be defined explicitly (by means of lexicons, syntactic grammars, etc.), whereas empirical (corpus-based statistical) learning should apply when exceptions, complex interactions, or ambiguity arise. In addition, there is always the opportunity to bias empirical learning with linguistically motivated parameters, so as to meet the needs of the specific language problem.</Paragraph>
<Paragraph position="1">(Footnote 7: In this method, a dataset is partitioned 10 times into 90% training material and 10% testing material. The average accuracy provides a reliable estimate of the generalization accuracy.)</Paragraph>
<Paragraph position="2">Based on these statements, we combined a high-coverage lexicon and a set of empirically induced decision trees into a POS tagger achieving a ~5.5% error rate for POS disambiguation and a ~16% error rate for unknown word guessing.</Paragraph>
<Paragraph position="3">The decision-tree approach outperforms both the naive approach of assigning the most frequent POS and the n-gram tagger for M. Greek presented in (Dermatas and Kokkinakis, 1995), which obtained a ~20% error rate.</Paragraph>
<Paragraph position="4">Comparing our tree-induction algorithm with IGTREE, the algorithm used in MBT (Daelemans et al., 1996), the main difference is that IGTREE produces oblivious decision trees from an a priori ordered list of best features, instead of re-computing the best feature at each branching as our algorithm does. After applying IGTREE to the datasets described in Section 3, we measured similar performance (~7% error rate for disambiguation and ~17% for guessing). Intuitively, the global search for best features performed by IGTREE yields results similar to the local searches over the fragmented datasets performed by our algorithm.</Paragraph>
<Paragraph position="5">Our future goals cover the following: * Improve the POS tagging results by: a) finding the optimal feature set for each ambiguity scheme and b) increasing the lexicon coverage.</Paragraph>
<Paragraph position="6">* Analyze why IGTREE remains so robust even though it is built on less information.</Paragraph>
<Paragraph position="7">* Apply the same approach to resolving ambiguity in Gender, Case, Number, etc., and to guessing such attributes for unknown words.</Paragraph>
</Section>
</Paper>
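The 10-fold cross-validation protocol described in footnote 7 above can be written as a short Python sketch. This is a minimal illustration only: the `train` and `evaluate` callables are hypothetical stand-ins for whatever learner is being estimated, not the authors' code.

    # Minimal sketch of the protocol in footnote 7: partition the
    # dataset 10 times into 90% training / 10% testing material and
    # report the average accuracy as the generalization estimate.
    import random

    def cross_validate(dataset, train, evaluate, k=10, seed=0):
        data = list(dataset)
        random.Random(seed).shuffle(data)
        fold_size = len(data) // k  # any remainder stays in training
        accuracies = []
        for i in range(k):
            test = data[i * fold_size:(i + 1) * fold_size]
            training = data[:i * fold_size] + data[(i + 1) * fold_size:]
            model = train(training)
            accuracies.append(evaluate(model, test))
        return sum(accuracies) / len(accuracies)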
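The tagger architecture summarized above (a high-coverage lexicon combined with empirically induced decision trees, one tree per ambiguity scheme, plus trees for unknown-word guessing) admits a simple control-flow sketch. All names here are illustrative assumptions, not the paper's implementation; `classify` is a hypothetical decision-tree interface.

    # Hypothetical sketch of the lexicon + decision-tree tagger: the
    # lexicon proposes candidate tags, a per-ambiguity-scheme tree
    # disambiguates known words, and a guessing tree handles unknowns.
    def tag(word, context, lexicon, disambiguation_trees, guessing_tree):
        tags = lexicon.get(word)
        if tags is None:                   # unknown word: guess its POS
            return guessing_tree.classify(word, context)
        if len(tags) == 1:                 # unambiguous lexicon entry
            return tags[0]
        scheme = frozenset(tags)           # ambiguity scheme, e.g. {noun, verb}
        tree = disambiguation_trees.get(scheme)
        if tree is None:                   # unseen scheme: back off to the
            return tags[0]                 # first (e.g. most frequent) tag
        return tree.classify(word, context)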
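The contrast drawn with IGTREE can also be made concrete. The sketch below assumes an entropy-based information-gain criterion and a dict-per-instance representation (feature -> value, with a 'pos' label); both are assumptions for illustration, since the section does not specify the selection measure. The point is the difference in where feature selection happens: once globally (oblivious, IGTREE-style) versus locally on each data fragment (the paper's algorithm).

    # Two feature-selection strategies for decision-tree induction.
    from collections import Counter
    from math import log2

    def entropy(instances):
        counts = Counter(inst['pos'] for inst in instances)
        total = sum(counts.values())
        return -sum(c / total * log2(c / total) for c in counts.values())

    def information_gain(instances, feature):
        parts = {}
        for inst in instances:
            parts.setdefault(inst[feature], []).append(inst)
        remainder = sum(len(p) / len(instances) * entropy(p)
                        for p in parts.values())
        return entropy(instances) - remainder

    def best_feature(instances, features):
        # Local search: re-compute the best feature on the current
        # data fragment at every branching (the paper's strategy).
        return max(features, key=lambda f: information_gain(instances, f))

    def ordered_features(instances, features):
        # Global search: rank features once on the full dataset and
        # test them in that fixed order at every level, producing an
        # oblivious tree (IGTREE's strategy).
        return sorted(features,
                      key=lambda f: information_gain(instances, f),
                      reverse=True)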