<?xml version="1.0" standalone="yes"?> <Paper uid="P03-1004"> <Title>Fast Methods for Kernel-based Text Analysis</Title> <Section position="2" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle>
<Paragraph position="0"> Kernel methods (e.g., Support Vector Machines (Vapnik, 1995)) have recently attracted a great deal of attention. In the field of Natural Language Processing, many successes have been reported. Examples include Part-of-Speech tagging (Nakagawa et al., 2002), Text Chunking (Kudo and Matsumoto, 2001), Named Entity Recognition (Isozaki and Kazawa, 2002), and Japanese Dependency Parsing (Kudo and Matsumoto, 2000; Kudo and Matsumoto, 2002).</Paragraph>
<Paragraph position="1"> It is well known in NLP that combinations of features contribute to a significant improvement in accuracy.</Paragraph>
<Paragraph position="2"> For instance, in the task of dependency parsing, it would be hard to confirm a correct dependency relation using only a single set of features from either a head or its modifier. Rather, dependency relations should be determined using, at the least, information from both phrases. In previous research, feature combinations were selected manually, and performance depended significantly on these selections. This is not the case with kernel-based methods. For instance, if we use a polynomial kernel, all feature combinations are implicitly expanded, without loss of generality and without increasing the computational cost. Although the mapped feature space is quite large, the maximal margin strategy (Vapnik, 1995) of SVMs gives good generalization performance compared with previous manual feature selection. This is the main reason why kernel-based learning has delivered excellent results to the field of NLP.</Paragraph>
<Paragraph position="3"> Kernel-based text analysis shows excellent performance in terms of accuracy; however, its inefficiency at analysis time limits practical application.
For example, an SVM-based NE-chunker runs at a rate of only 85 bytes/sec, while previous rule-based systems can process several kilobytes of text per second (Isozaki and Kazawa, 2002). Such slow execution is inadequate for Information Retrieval, Question Answering, or Text Mining, where fast analysis of large quantities of text is indispensable.</Paragraph>
<Paragraph position="4"> This paper presents two novel methods that make kernel-based text analyzers substantially faster.</Paragraph>
<Paragraph position="5"> These methods are applicable not only to NLP tasks but also to general machine learning tasks in which training and test examples are represented as binary vectors.</Paragraph>
<Paragraph position="6"> More specifically, we first focus on a Polynomial Kernel of degree d, which can capture the feature combinations that are crucial to improving the performance of NLP tasks. Second, we introduce two fast classification algorithms for this kernel. One is PKI (Polynomial Kernel Inverted), an extension of the Inverted Index used in Information Retrieval. The other is PKE (Polynomial Kernel Expanded), in which all feature combinations are explicitly expanded. By applying PKE, we can convert a kernel-based classifier into a simple and fast linear classifier. To build PKE, we extend PrefixSpan (Pei et al., 2001), an efficient Basket Mining algorithm, to enumerate the effective feature combinations from a set of support examples.</Paragraph>
<Paragraph position="7"> Experiments on English BaseNP Chunking, Japanese Word Segmentation, and Japanese Dependency Parsing show that PKI and PKE run 2 to 13 times and 30 to 300 times faster, respectively, than standard kernel-based systems, without a discernible change in accuracy.</Paragraph> </Section></Paper>
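The explicit expansion behind PKE can be illustrated with a minimal sketch. This is not code from the paper; `poly_kernel` and `expand` are hypothetical helper names, and the weighting scheme shown is one standard degree-2 construction. For binary feature vectors, the polynomial kernel (1 + x.w)**2 equals the inner product of explicitly expanded vectors containing a constant term, the single features, and all pairwise feature combinations:

```python
import itertools
import math

def poly_kernel(x, w, d=2):
    # Implicit kernel computation: (1 + <x, w>) ** d
    s = sum(a * b for a, b in zip(x, w))
    return (1 + s) ** d

def expand(x):
    # Explicit degree-2 feature map phi(x) such that
    # phi(x) . phi(w) == (1 + <x, w>) ** 2:
    #   1 (constant), sqrt(2)*x_i (linear), x_i**2 (squares,
    #   equal to x_i for binary input), sqrt(2)*x_i*x_j (pairs).
    n = len(x)
    phi = [1.0]
    phi += [math.sqrt(2) * xi for xi in x]
    phi += [float(xi * xi) for xi in x]
    phi += [math.sqrt(2) * x[i] * x[j]
            for i, j in itertools.combinations(range(n), 2)]
    return phi

x = [1, 0, 1, 1]
w = [1, 1, 1, 0]
implicit = poly_kernel(x, w)  # (1 + 2) ** 2 = 9
explicit = sum(a * b for a, b in zip(expand(x), expand(w)))
print(implicit, round(explicit, 6))  # 9 9.0
```

Once the expansion is precomputed over the support examples and folded into a single weight vector, classification reduces to one sparse dot product, which is the "simple and fast linear classifier" form the paragraph describes; PKE's contribution is pruning this expansion to the effective combinations via Basket Mining rather than enumerating all of them.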