<?xml version="1.0" standalone="yes"?>
<Paper uid="W05-1516">
  <Title>Strictly Lexical Dependency Parsing</Title>
  <Section position="7" start_page="157" end_page="157" type="evalu">
    <SectionTitle>
5 Related Work
</SectionTitle>
    <Paragraph position="0"> Previous parsing models (e.g., Collins, 1997; Charniak, 2000) maximize the joint probability P(S, T) of a sentence S and its parse tree T. We maximize the conditional probability P(T  |S). Although they are theoretically equivalent, the use of conditional model allows us to take advantage of similarity-based smoothing.</Paragraph>
    <Paragraph position="1"> Clark et al. (2002) also computes a conditional probability of dependency structures. While the probability space in our model consists of all possible non-projective dependency trees, their probability space is constrained to all the dependency structures that are allowed by a Combinatorial Category Grammar (CCG) and a category dictionary (lexicon). They therefore do not need the STOP markers in their model. Another major difference between our model and (Clark et al., 2002) is that the parameters in our model consist exclusively of conditional probabilities of binary variables.</Paragraph>
    <Paragraph position="2"> Ratnaparkhi's maximum entropy model (Ratnaparkhi, 1999) is also a conditional model. However, his model maximizes the probability of the action during each step of the parsing process, instead of overall quality of the parse tree.</Paragraph>
    <Paragraph position="3"> Yamada and Matsumoto (2002) presented a dependency parsing model using support vector machines. Their model is a discriminative model that maximizes the differences between scores of the correct parse and the scores of the top competing incorrect parses.</Paragraph>
    <Paragraph position="4"> In many dependency parsing models such as (Eisner, 1996) and (MacDonald et al., 2005), the score of a dependency tree is the sum of the scores of the dependency links, which are computed independently of other links. An undesirable consequence of this is that the parser often creates multiple dependency links that are separately likely but jointly improbable (or even impossible).</Paragraph>
    <Paragraph position="5"> For example, there is nothing in such models to prevent the parser from assigning two subjects to a verb. In the DMV model (Klein and Manning, 2004), the probability of a dependency link is partly conditioned on whether or not there is a head word of the link already has a modifier. Our model is quite similar to the DMV model, except that we compute the conditional probability of the parse tree given the sentence, instead of the joint probability of the parse tree and the sentence.</Paragraph>
    <Paragraph position="6"> There have been several previous approaches to parsing Chinese with the Penn Chinese Treebank (e.g., Bikel and Chiang, 2000; Levy and Manning, 2003). Both of these approaches employed phrase-structure joint models and used part-of-speech tags in back-off smoothing. Their results were evaluated with the precision and recall of the bracketings implied in the phrase structure parse trees. In contrast, the accuracy of our model is measured in terms of the dependency relationships. A dependency tree may correspond to more than one constituency trees. Our results are therefore not directly comparable with the precision and recall values in previous research. Moreover, it was argued in (Lin 1995) that dependency based evaluation is much more meaningful for the applications that use parse trees, since the semantic relationships are generally embedded in the dependency relationships.</Paragraph>
  </Section>
class="xml-element"></Paper>