<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-2925">
  <Title>Projective Dependency Parsing with Perceptron</Title>
  <Section position="5" start_page="181" end_page="182" type="metho">
    <SectionTitle>
3 Features
</SectionTitle>
    <Paragraph position="0"> The feature extraction function, φ(h,m,x,y), represents as a feature vector a dependency from word position m to word position h, in the context of a sentence x and a dependency tree y. As is usual in discriminative learning, we work with binary indicator features: if a certain feature is observed in an instance, its value is 1; otherwise, its value is 0. For convenience, we describe φ as a composition of several base feature extraction functions, each of which extracts a number of disjoint features. The feature extraction function φ(h,m,x,y) is calculated as:</Paragraph>
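A minimal sketch of this composition (not the authors' code): the sparse binary vector is represented as a set of active feature strings, and φ is the union of several base extractors over disjoint feature name spaces. The sub-extractors and feature names below are illustrative.

```python
# Sparse binary feature vectors as sets of feature strings: a feature's
# value is 1 if its string is in the set, 0 otherwise.

def phi_token(h, m, x, y):
    # Hypothetical base extractor: context-independent token features.
    return {"head_w=" + x[h], "mod_w=" + x[m]}

def phi_dep(h, m, x, y):
    # Hypothetical base extractor: head-modifier word-pair feature.
    return {"dep=" + x[h] + "*" + x[m]}

def phi(h, m, x, y, extractors=(phi_token, phi_dep)):
    """Compose base extractors; their disjoint name spaces keep the
    union equivalent to concatenating the sub-vectors."""
    feats = set()
    for f in extractors:
        feats |= f(h, m, x, y)
    return feats

# Example: dependency from modifier 1 ("cat") to head 2 ("sleeps").
feats = phi(2, 1, ["the", "cat", "sleeps"], None)
```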
    <Paragraph position="2"> where φ_token extracts context-independent token features, φ_tctx computes context-based token features, φ_dep computes context-independent dependency features, φ_dctx extracts contextual dependency features, φ_dist calculates surface-distance features between the two tokens, and, finally, φ_runtime computes dynamic features at runtime, based on the dependencies previously built for the given interval during bottom-up parsing. mmd_{h,m} is shorthand for a triple of numbers: min(h,m), max(h,m), and d_{h,m} (a sign indicating the direction: +1 if m &lt; h, and -1 otherwise).</Paragraph>
    <Paragraph position="4"> [Table 1 caption] Token features, context-independent (φ_token) and context-based (φ_tctx). type - token type, i.e., "head" or "mod"; w - token word; l - token lemma; cp - token coarse part-of-speech (POS) tag; fp - token fine-grained POS tag; ms - token morpho-syntactic feature. The * operator stands for string concatenation.</Paragraph>
    <Paragraph position="6"> [Table 2 caption] Dependency features, context-independent (φ_dep) and context-based (φ_dctx), between two points i and j, i &lt; j. dir - dependency direction: left to right or right to left.</Paragraph>
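For concreteness, the mmd_{h,m} shorthand and a dir*cp-style conjunction built with the string-concatenation operator can be sketched as follows (an illustration under assumed representations, not the paper's code):

```python
def mmd(h, m):
    """Triple (min(h,m), max(h,m), d_{h,m}): d = +1 if m < h, else -1."""
    return (min(h, m), max(h, m), 1 if m < h else -1)

def dep_cp_feature(cp, h, m):
    # A dir*cp(x_i)*cp(x_j)-style conjunction feature; cp is assumed to
    # be a list of coarse POS tags indexed by token position.
    i, j, d = mmd(h, m)
    return str(d) + "*" + cp[i] + "*" + cp[j]

# mmd(5, 2) == (2, 5, 1): the modifier precedes its head
# mmd(2, 5) == (2, 5, -1): the modifier follows its head
```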
    <Paragraph position="8"> We detail the token features in Table 1, the dependency features in Table 2, and the surface-distance features in Table 3. Most of these features are inspired by previous work in dependency parsing (McDonald et al., 2005; Collins, 1999). [Table 3] Surface-distance features, φ_dist(x,i,j,dir): foreach k in (i,j): dir*cp(x_i)*cp(x_k)*cp(x_j); the number of tokens between i and j; the number of verbs between i and j; the number of coordinations between i and j; the number of punctuation signs between i and j. Numeric features are discretized using "binning" into a small number of intervals.</Paragraph>
    <Paragraph position="9"> [Table 4] Runtime features, φ_runtime(x,y,h,m,dir): let l1,...,lS be the labels of the dependencies in y that attach to h and are found from m to h.</Paragraph>
    <Paragraph position="11"> What is important for the work presented here is that we construct explicit feature combinations (see the tables above), because we configured our linear predictors in primal form in order to keep training times reasonable.</Paragraph>
    <Paragraph position="12"> While the features presented in Tables 1, 2, and 3 are straightforward exploitations of the training data, the runtime features (φ_runtime) take a different approach that is, to our knowledge, novel in the proposed framework: for a dependency from m to h, they represent the dependencies found between m and h that also attach to h. They are described in detail in Table 4. As noted above, these features are possible because of the parsing scheme, which scores a dependency only after all dependencies spanned by it have been scored.</Paragraph>
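A sketch of the runtime-feature idea under an assumed representation: y is a set of already-built (head, mod, label) dependencies, which the bottom-up order guarantees are available for every span inside the candidate arc.

```python
def phi_runtime(y, h, m):
    """Collect the labels l1..lS of dependencies in y that attach to
    head h and whose modifiers lie strictly between m and h, emitted
    here as a single illustrative conjunction feature."""
    lo, hi = min(h, m), max(h, m)
    labels = sorted(lbl for (head, mod, lbl) in y
                    if head == h and lo < mod < hi)
    return {"runtime_labels=" + "+".join(labels)}

# Candidate arc 5 -> 0: dependencies (5,3,"OBJ") and (5,4,"ADV") attach
# to head 5 and lie inside the span, so their labels are collected.
y = {(5, 3, "OBJ"), (5, 4, "ADV"), (2, 1, "DET")}
feats = phi_runtime(y, 5, 0)
```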
  </Section>
</Paper>