<?xml version="1.0" standalone="yes"?>
<Paper uid="H05-1064">
  <Title>Hidden-Variable Models for Discriminative Reranking</Title>
  <Section position="3" start_page="507" end_page="507" type="relat">
    <SectionTitle>
2 Related Work
</SectionTitle>
    <Paragraph position="0"> Various machine-learning methods have been used within reranking tasks, including conditional log-linear models (Ratnaparkhi et al., 1994; Johnson et al., 1999), boosting methods (Collins, 2000), variants of the perceptron algorithm (Collins, 2002; Shen et al., 2004), and generalizations of support-vector machines (Shen and Joshi, 2003). There have been several previous approaches to parsing using log-linear models and hidden variables. Riezler et al. (2002) describe a discriminative LFG parsing model that is trained on standard (syntax only) treebank annotations by treating each tree as a full LFG analysis with an observed c-structure and hidden f-structure. Clark and Curran (2004) present an alternative CCG parsing approach that divides each CCG parse into a dependency structure (observed) and a derivation (hidden). More recently, Matsuzaki et al. (2005) introduce a probabilistic CFG augmented with hidden information at each nonterminal, which gives their model the ability to tailor itself to the task at hand. The form of our model is closely related to that of Quattoni et al. (2005), who describe a hidden-variable model for object recognition in computer vision.</Paragraph>
    <Paragraph position="1"> The approaches of Riezler et al., Clark and Curran, and Matsuzaki et al. are similar to our own work in that the hidden variables are exponential in number and must be handled with dynamic-programming techniques. However, they differ from our approach in the definition of the hidden variables (the Matsuzaki et al. model is the most similar). In addition, these three approaches don't use reranking, so their features must be restricted to local scope in order to allow dynamic-programming approaches to training. Finally, these approaches use Viterbi or other approximations during decoding, something our model can avoid (see section 6.2).</Paragraph>
    <Paragraph position="2"> In some instantiations, our model effectively clusters words into categories. Our approach differs from standard word clustering in that the clustering criteria is directly linked to the reranking objective, whereas previous word-clustering approaches (e.g. Brown et al. (1992) or Pereira et al. (1993)) have typically leveraged distributional similarity. In other instantiations, our model establishes word-sense distinctions. Bikel (2000) has done previous work on incorporating the WordNet hierarchy into a generative parsing model; however, this approach requires data with word-sense annotations whereas our model deals with word-sense ambiguity through unsupervised discriminative training.</Paragraph>
  </Section>
class="xml-element"></Paper>