<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-1518">
  <Title>Using LTAG-Based Features for Semantic Role Labeling</Title>
  <Section position="5" start_page="127" end_page="130" type="metho">
    <SectionTitle>
3 LTAG Based Feature Extraction
</SectionTitle>
    <Paragraph position="0"> In this section, we introduce the main components of our system. First, we do a pruning on the given parse trees with certain constraints. Then we decompose the pruned parse trees into a set of LTAG elementary trees. For each constituent in question, we extract features from its corresponding derivation tree. We train using these features in a decision list model.</Paragraph>
    <Section position="1" start_page="127" end_page="127" type="sub_section">
      <SectionTitle>
3.1 Pruning the Parse Trees
</SectionTitle>
      <Paragraph position="0"> Given a parse tree, the pruning component identifies the predicate in the tree and then only admits those nodes that are sisters to the path from the predicate to the root. It is commonly used in the SRL community (cf. (Xue and Palmer, 2004)) and our experiments show that 91% of the SRL targets can be recovered despite this aggressive pruning.</Paragraph>
      <Paragraph position="1"> There are two advantages to this pruning: the machine learning method used for prediction of SRLs is not overwhelmed with a large number of non-SRL nodes; and the process is far more efficient as 80% of the target nodes in a full parse tree are pruned away in this step. We make two enhancements to the pruned Propbank tree: we enrich the sister nodes with their head information, which is a part-of-speech tag and word pair: &lt;t,w&gt; and PP nodes are expanded to include the NP complement of the PP (including the head information). Note that the target SRL node is still the PP. Figure 1 shows the pruned parse tree for a sentence from PropBank section 24.</Paragraph>
    </Section>
    <Section position="2" start_page="127" end_page="127" type="sub_section">
      <SectionTitle>
3.2 LTAG-based Decomposition
</SectionTitle>
      <Paragraph position="0"> As next step, we decompose the pruned tree around the predicate using standard head-percolation based heuristic rules1 to convert a Treebank tree into a LTAG derivation tree. We do not use any sophistical adjunct/argument or other extraction heuristics using empty elements (as we don't have access to them in the CoNLL 2005 data). Also, we do not use any substitution nodes in our elementary trees: instead we exclusively use adjunction or sister adjunction for the attachment of sub-derivations. As a result the  root node in an LTAG derivation tree is a spinal elementary tree and the derivation tree provides the path from the predicate to the constituent in question. Figure 2 shows the resulting elementary tree after decomposition of the pruned tree. For each of the elementary trees we consider their labeling in the derivation tree to be their semantic role labels from the training data. Figure 3 is the derivation tree for the entire pruned tree.</Paragraph>
      <Paragraph position="1"> Note that the LTAG-based decomposition of the parse tree allows us to use features that are distinct from the usual parse tree path features used for SRL. For example, the typical parse tree feature from Figure 2 used to identify constituent (NP (NN terminal)) as A0 would be the parse tree fragment:</Paragraph>
      <Paragraph position="3"> VBG cover (the arrows signify the path through the parse tree). Using the LTAG-based decomposition means that our SRL model can use any features from the derivation tree such as in Figure 2, including the elementary tree shapes.</Paragraph>
    </Section>
    <Section position="3" start_page="127" end_page="130" type="sub_section">
      <SectionTitle>
3.3 Decision List Model for SRL
</SectionTitle>
      <Paragraph position="0"> Before we train or test our model, we convert the training, development and test data into LTAG derivation trees as described in the previous section. In our model we make an independence assumption that each semantic role is assigned to each constituent independently, conditional only on the path from the predicate elementary tree to the constituent elementary tree in the derivation tree. Different elementary tree siblings in the LTAG derivation tree do not influence each other in our current models. Figure 4 shows the different derivation trees for the target constituent (NP (NN terminal)): each providing a distinct semantic role labeling for a particular constituent. We use a decision list learner for identifying SRLs based on LTAG-based features. In this model, LTAG elementary trees are combined with some distance information as features to do the semantic role labeling. The rationale for using a simple DL learner is given in (Gildea and Jurafsky, 2002) where essentially it based on their experience with the setting of backoff weights for smoothing, it is stated that the most specific single feature matching the training data is enough to predict the SRL on test data. For simplicity, we only consider one intermediate elementary tree (if any) at one time instead of multiple intermediate trees along the path from the predicate to the argument.</Paragraph>
      <Paragraph position="2"> The input to the learning algorithm is labeled examples of the form (xi,yi). yi is the label (either NULL for no SRL, or the SRL) of the ith example.</Paragraph>
      <Paragraph position="3"> xi is a feature vector &lt;P,A,Dist,Position,Rtype,ti [?] tI,Distti&gt; , where P is the predicate elementary tree, A is the tree for the constituent being labeled with a SRL, tI is a set of intermediate elementary trees between the predicate tree and the argument tree. Each P,A,I tree consists of the elementary tree template plus the tag, word pair: &lt;t,w&gt; .</Paragraph>
      <Paragraph position="4"> All possible combinations of fullylexicalized/postag/un-lexicalized elementary trees are used for each example. Dist and Distti denote the distance to the predicate from the argument tree and the intermediate elementary tree respectively. Position is interpreted as the position that the target is relative to the predicate. R-type denotes the relation type of the predicate and the target constituent. 3 types are defined: if the predicate dominates (directly or undirectly) the argument in the derivation tree, we have the relation of type-1; if the other way around, the argument dominates (directly or undirectly) the predicate then we have the relation of type-2; and finally type-3 means that neither the predicate or the argument dominate each other in the derivation tree and instead are dominated (again, directly or indirectly) by another elementary tree.</Paragraph>
      <Paragraph position="5"> The output of the learning algorithm is a function h(x,y) which is an estimate of the conditional probability p(y  |x) of seeing SRL y given pattern x. h is interpreted as a decision list of rules x = y ranked by the score h(x,y). In testing, we simply pick the first rule that matches the particular test example x. We trained different models using the same learning algorithm. In addition to the LTAG-based method, we also implemented a pattern matching based method on the derived (parse) tree using the same model. In this method, instead of considering each intermediate elementary tree between the predicate and the argument, we extract the whole path from the predicate to the argument. So the input is more like a tree than a discrete feature vector. Figure 5 shows the patterns that are extracted from the same pruned tree.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>