<?xml version="1.0" standalone="yes"?>
<Paper uid="P03-1005">
<Title>Hierarchical Directed Acyclic Graph Kernel: Methods for Structured Natural Language Data</Title>
<Section position="2" start_page="0" end_page="0" type="intro">
<SectionTitle> 1 Introduction </SectionTitle>
<Paragraph position="0"> As structured corpora such as annotated texts have become easy to obtain, many researchers have applied statistical and machine learning techniques to NLP tasks. As a result, the accuracy of basic NLP tools, such as POS taggers, NP chunkers, named entity taggers, and dependency analyzers, has improved to the point that they can be used in practical NLP applications.</Paragraph>
<Paragraph position="1"> The motivation of this paper is to identify and use richer information within texts to improve the performance of NLP applications; this stands in contrast to using feature vectors constructed from a bag-of-words representation (Salton et al., 1975).</Paragraph>
<Paragraph position="2"> We focus here on methods that use numerical feature vectors to represent the features of natural language data. Since the original natural language data is symbolic, researchers must convert the symbolic data into numeric data. This process, feature extraction, is ad hoc in nature and differs with each NLP task; there has been no clean formulation for generating feature vectors from the semantic and grammatical structures inside texts.</Paragraph>
<Paragraph position="3"> Kernel methods (Vapnik, 1995; Cristianini and Shawe-Taylor, 2000) suitable for NLP have recently been devised. Convolution Kernels (Haussler, 1999) demonstrate how to build kernels over discrete structures such as strings, trees, and graphs. One of the most remarkable properties of this kernel methodology is that it retains the original representation of objects: algorithms manipulate the objects solely by computing kernel functions, which correspond to inner products between pairs of objects. This means that we do not have to map texts to feature vectors explicitly, as long as an efficient calculation of the inner product between a pair of texts is defined. The kernel method is widely adopted in machine learning methods such as the Support Vector Machine (SVM) (Vapnik, 1995). In addition, a kernel function K(x, y) can be interpreted as a similarity function that satisfies certain properties, namely symmetry and positive semi-definiteness (Cristianini and Shawe-Taylor, 2000). The similarity measure between texts is one of the most important factors for tasks in NLP application areas such as Machine Translation, Text Categorization, Information Retrieval, and Question Answering.</Paragraph>
<Paragraph position="4"> This paper proposes the Hierarchical Directed Acyclic Graph (HDAG) Kernel. It can handle several of the structures found within texts and can calculate the similarity with respect to these structures at practical computational cost. The HDAG Kernel can be widely applied to learning, clustering, and similarity measures in NLP tasks.</Paragraph>
<Paragraph position="5"> The following sections define the HDAG Kernel and introduce an algorithm that implements it. The results of applying the HDAG Kernel to the tasks of question classification and sentence alignment are then discussed.</Paragraph>
</Section>
</Paper>