<?xml version="1.0" standalone="yes"?>
<Paper uid="E99-1025">
  <Title>New Models for Improving Supertag Disambiguation</Title>
  <Section position="3" start_page="0" end_page="188" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Many natural language applications are beginning to exploit some underlying structure of the language. Roukos (1996) and Jurafsky et al. (1995) use structure-based language models in the context of speech applications. Grishman (1995) and Hobbs et al. (1995) use phrasal information in information extraction. Alshawi (1996) uses dependency information in a machine translation system. The need to impose structure leads to the need to have robust parsers. There have been two main robust parsing paradigms: Finite State Grammar-based approaches (such as Abney (1990), Grishman (1995), and Hobbs et al. (1997)) and Statistical Parsing (such as Charniak (1996), Magerman (1995), and Collins (1996)).</Paragraph>
    <Paragraph position="1"> Srinivas (1997a) has presented a different approach called supertagging that integrates linguistically motivated lexical descriptions with the robustness of statistical techniques. The idea underlying the approach is that the computation of linguistic structure can be localized if lexical items are associated with rich descriptions (Supertags) that impose complex constraints in a local context. Supertag disambiguation is resolved by using statistical distributions of supertag co-occurrences collected from a corpus of parses. It results in a representation that is effectively a parse (an almost parse). (Supported by NSF grants SBR-9710411 and GER-9354869.)</Paragraph>
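Disambiguation by supertag co-occurrence statistics can be sketched as a Viterbi search over each word's candidate supertags. Below is a minimal Python illustration with a hypothetical toy lexicon and probabilities; note that the paper's models are trigram-based, while this sketch uses bigram transitions for brevity, and all names and numbers here are invented for illustration.

```python
import math

# Hypothetical toy model: a real supertagger would estimate these
# distributions from a corpus of parses.
LEXICON = {"the": ["Det"], "saw": ["N", "V"]}          # word -> candidate supertags
TRANS = {("BOS", "Det"): 1.0, ("Det", "N"): 0.9, ("Det", "V"): 0.1}
EMIT = {("Det", "the"): 1.0, ("N", "saw"): 0.5, ("V", "saw"): 0.5}

def supertag(words, lexicon, trans, emit, start="BOS"):
    """Viterbi search for the most probable supertag sequence.

    Uses bigram transitions for brevity; a trigram model would
    condition on the two previous supertags instead of one.
    """
    # best[tag] = (log-probability, path) of the best sequence ending in tag
    best = {start: (0.0, [])}
    for w in words:
        new_best = {}
        for tag in lexicon[w]:                 # only this word's candidates
            new_best[tag] = max(
                (score + math.log(trans.get((prev, tag), 1e-6))   # crude smoothing floor
                       + math.log(emit.get((tag, w), 1e-6)),
                 path + [tag])
                for prev, (score, path) in best.items()
            )
        best = new_best
    return max(best.values())[1]               # path of the highest-scoring tag

print(supertag(["the", "saw"], LEXICON, TRANS, EMIT))  # ['Det', 'N']
```

Because each word contributes only its own candidate supertags, the search space stays local, which is the point of associating rich descriptions with lexical items.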
    <Paragraph position="2"> Supertagging has been found useful for a number of applications. For instance, it can be used to speed up conventional chart parsers because it reduces the ambiguity which a parser must face, as described in Srinivas (1997a).</Paragraph>
    <Paragraph position="3"> Chandrasekhar and Srinivas (1997) have shown that supertagging may be employed in information retrieval. Furthermore, given a sentence-aligned parallel corpus of two languages and almost parse information for the sentences of one of the languages, one can rapidly develop a grammar for the other language using supertagging, as suggested by Bangalore (1998).</Paragraph>
    <Paragraph position="4"> In contrast to the aforementioned work on supertag disambiguation, whose objective was to provide a direct comparison between trigram models for part-of-speech tagging and for supertagging, our goal in this paper is to improve the performance of supertagging using local techniques which avoid full parsing. These supertag disambiguation models can be grouped into contextual models and class-based models. Contextual models use different features in frameworks that exploit the information those features provide in order to achieve higher supertagging accuracy. In class-based models, supertags are first grouped into clusters and words are tagged with clusters of supertags; we develop several automated clustering techniques. We then demonstrate that, with a slight increase in supertag ambiguity, supertagging accuracy can be substantially improved.</Paragraph>
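The class-based idea of trading a little ambiguity for accuracy can be illustrated with a simple greedy clustering. The paper develops its own automated clustering techniques; the merging criterion below (Jaccard overlap of the word sets that license each supertag) is purely a hypothetical stand-in, and the toy lexicon and tag names are invented.

```python
from collections import defaultdict
from itertools import combinations

def cluster_supertags(lexicon, threshold=0.5):
    """Greedily merge supertags whose licensing-word sets overlap heavily.

    lexicon: word -> list of candidate supertags.
    Returns a sorted list of supertag clusters.
    """
    # Invert the lexicon: supertag -> set of words that can take it.
    words_for = defaultdict(set)
    for word, tags in lexicon.items():
        for t in tags:
            words_for[t].add(word)

    # Union-find over supertags.
    parent = {t: t for t in words_for}
    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]      # path halving
            t = parent[t]
        return t

    # Merge pairs whose Jaccard similarity clears the threshold.
    for a, b in combinations(sorted(words_for), 2):
        wa, wb = words_for[a], words_for[b]
        if len(wa.intersection(wb)) / len(wa.union(wb)) >= threshold:
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for t in words_for:
        clusters[find(t)].add(t)
    return sorted(sorted(c) for c in clusters.values())

# Toy lexicon: N and V_trans are frequently confusable alternatives.
LEX = {"saw": ["N", "V_trans"], "cut": ["N", "V_trans"], "book": ["N"], "the": ["Det"]}
print(cluster_supertags(LEX))  # [['Det'], ['N', 'V_trans']]
```

A word like "saw" would then be tagged with the single cluster containing N and V_trans rather than forced to choose between them, which is the slight increase in ambiguity traded for accuracy.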
    <Paragraph position="5"> The layout of the paper is as follows. In Section 2, we briefly review the task of supertagging and the results from previous work. In Section 3, we explore contextual models. In Section 4, we outline various class-based approaches. Ideas for future work are presented in Section 5. Lastly, we present our conclusions in Section 6.</Paragraph>
  </Section>
</Paper>