<?xml version="1.0" standalone="yes"?>
<Paper uid="P06-1112">
  <Title>Exploring Correlation of Dependency Relation Paths for Answer Extraction</Title>
  <Section position="5" start_page="890" end_page="892" type="metho">
    <SectionTitle>
3 Dependency Relation Path Correlation
</SectionTitle>
    <Paragraph position="0"> In this section, we discuss how the method performs in detail.</Paragraph>
    <Section position="1" start_page="890" end_page="891" type="sub_section">
      <SectionTitle>
3.1 Dependency Relation Path Extraction
</SectionTitle>
      <Paragraph position="0"> We parse questions and candidate sentences with MiniPar (Lin, 1994), a fast and robust parser for grammatical dependency relations. Then, we extract relation paths from dependency trees.</Paragraph>
      <Paragraph position="1"> Dependency relation path is defined as a structure P =&lt; N1,R,N2 &gt; where, N1, N2 are two phrases and R is a relation sequence R =&lt; r1,...,ri &gt; in which ri is one of the predefined dependency relations. Totally, there are 42 relations defined in MiniPar. A relation sequence R between two phrases N1, N2 is extracted by traversing from the N1 node to the N2 node in a dependency tree.</Paragraph>
      <Paragraph position="2"> Q: What book did Rachel Carson write in 1962?Paths for Answer Ranking  sentence. EAP indicates expected answer position; CA indicates candidate answer For each question, we extract relation paths among noun phrases, main verb and question word. The question word is further replaced with &amp;quot;EAP&amp;quot;, which indicates the expected answer position. For each candidate sentence, we firstly extract relation paths between answer candidates and mapped question phrases. These paths will be used for answer ranking (Section 4). Secondly, we extract relation paths among mapped question phrases. These paths will be used for answer re-ranking (Section 5). Question phrase mapping will be discussed in Section 3.4. Figure 1 shows some relation paths extracted for an example question and candidate sentence.</Paragraph>
      <Paragraph position="3"> Next, the relation paths in a question and each of its candidate sentences are paired according to their phrase similarity. For any two relation path Pi and Pj which are extracted from the question and the candidate sentence respectively, if Sim(Ni1,Nj1) &gt; 0 and Sim(Ni2,Nj2) &gt; 0, Pi and Pj are paired as &lt; Pi,Pj &gt;. The question phrase &amp;quot;EAP&amp;quot; is mapped to candidate answer phrase in the sentence. The similarity between two  phrases will be discussed in Section 3.4. Figure 2 further shows the paired relation paths which are presented in Figure 1.</Paragraph>
    </Section>
    <Section position="2" start_page="891" end_page="891" type="sub_section">
      <SectionTitle>
3.2 Dependency Relation Path Correlation
</SectionTitle>
      <Paragraph position="0"> Comparing a proper answer and other wrong candidate answers in each sentence, we assume that relation paths between the proper answer and question phrases in the sentence are more correlated to the corresponding paths in question. So, for each path pair &lt; P1,P2 &gt;, we measure the correlation between its two paths P1 and P2.</Paragraph>
      <Paragraph position="1"> We derive the correlations between paths by adapting dynamic time warping (DTW) algorithm (Rabiner et al., 1978). DTW is to find an optimal alignment between two sequences which maximizes the accumulated correlation between two sequences. A sketch of the adapted algorithm is as follows.</Paragraph>
      <Paragraph position="2"> Let R1 =&lt; r11,...,r1n &gt;,(n = 1,...,N) and R2 =&lt; r21,...,r2m &gt;,(m = 1,...,M) denote two relation sequences. R1 and R2 consist of N and M relations respectively. R1(n) = r1n and R2(m) = r2m. Cor(r1,r2) denotes the correlation between two individual relations r1, r2, which is estimated by a statistical model during training (Section 3.3). Given the correlations Cor(r1n,r2m) for each pair of relations (r1n,r2m) within R1 and R2, the goal of DTW is to find a path, m = map(n), which map n onto the corresponding m such that the accumulated correlation Cor[?] along the path is maximized.</Paragraph>
      <Paragraph position="3">  termine the optimum path map(n). The accumulated correlation CorA to any grid point (n,m) can be recursively calculated as</Paragraph>
      <Paragraph position="5"> The overall correlation measure has to be normalized as longer sequences normally give higher correlation value. So, the correlation between two sequences R1 and R2 is calculated as</Paragraph>
      <Paragraph position="7"> Finally, we define the correlation between two relation paths P1 and P2 as</Paragraph>
      <Paragraph position="9"> are the phrase mapping score when pairing two paths, which will be described in Section 3.4. If two phrases are absolutely different</Paragraph>
      <Paragraph position="11"> paths may not be paired since Cor(P1,P2) = 0.</Paragraph>
    </Section>
    <Section position="3" start_page="891" end_page="892" type="sub_section">
      <SectionTitle>
3.3 Relation Correlation Estimation
</SectionTitle>
      <Paragraph position="0"> In the above section, we have described how to measure path correlations. The measure requires relation correlations Cor(r1,r2) as inputs. We apply a statistical method to estimate the relation correlations from a set of training path pairs. The training data collecting will be described in Section 6.1.</Paragraph>
      <Paragraph position="1"> For each question and its answer sentences in training data, we extract relation paths between &amp;quot;EAP&amp;quot; and other phrases in the question and paths between proper answer and mapped question phrases in the sentences. After pairing the question paths and the corresponding sentence paths, correlation of two relations is measured by their bipartite co-occurrence in all training path pairs. Mutual information-based measure (Cui et al., 2004) is employed to calculate the relation correlations. null</Paragraph>
      <Paragraph position="3"> where, rQi and rSj are two relations in question paths and sentence paths respectively. fQ(rQi ) and fS(rSj ) are the numbers of occurrences of rQi in question paths and rSj in sentence paths respectively. d(rQi ,rSj ) is 1 when rQi and rSj co-occur in a path pair, and 0 otherwise. a is a factor to discount the co-occurrence value for long paths. It is set to the inverse proportion of the sum of path lengths of the path pair.</Paragraph>
    </Section>
    <Section position="4" start_page="892" end_page="892" type="sub_section">
      <SectionTitle>
3.4 Approximate Question Phrase Mapping
</SectionTitle>
      <Paragraph position="0"> Basic noun phrases (BNP) and verbs in questions are mapped to their candidate sentences. A BNP is defined as the smallest noun phrase in which there are no noun phrases embedded. To address lexical and format variations between phrases, we propose an approximate phrase mapping strategy.</Paragraph>
      <Paragraph position="1"> A BNP is separated into a set of heads H = {h1,...,hi} and a set of modifiers M = {m1,...mj}. Some heuristic rules are applied to judge heads and modifiers: 1. If BNP is a named entity, all words are heads. 2. The last word of BNP is head. 3. Rest words are modifiers.</Paragraph>
      <Paragraph position="2"> The similarity between two BNPs  These items consider morphological, format and semantic variations respectively. 1. The morphological variations match words after stemming, such as &amp;quot;Rhodes scholars&amp;quot; and &amp;quot;Rhodes scholarships&amp;quot;. 2. The format alternations cope with special characters, such as &amp;quot;-&amp;quot; for &amp;quot;Ice-T&amp;quot; and &amp;quot;Ice T&amp;quot;, &amp;quot;&amp;&amp;quot; for &amp;quot;Abercrombie and Fitch&amp;quot; and &amp;quot;Abercrombie &amp; Fitch&amp;quot;. 3. The semantic similarity SemSim(hi,hj) is measured using Word-Net and eXtended WordNet. We use the same semantic path finding algorithm, relation weights and semantic similarity measure as (Moldovan and Novischi, 2002). For efficiency, only hypernym, hyponym and entailment relations are considered and search depth is set to 2 in our experiments.</Paragraph>
      <Paragraph position="3"> Particularly, the semantic variations are not considered for NE heads and modifiers. Modifier similarity Sim(mi,mj) only consider the morphological and format variations. Moreover, verb similarity measure Sim(v1,v2) is the same as head similarity measure Sim(hi,hj).</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="892" end_page="893" type="metho">
    <SectionTitle>
4 Candidate Answer Ranking
</SectionTitle>
    <Paragraph position="0"> According to path correlations of candidate answers, a Maximum Entropy (ME)-based model is applied to rank candidate answers. Unlike (Cui et al., 2004), who rank candidate answers with the sum of the path correlations, ME model may estimate the optimal weights of the paths based on a training data set. (Berger et al., 1996) gave a good description of ME model. The model we use is similar to (Shen et al., 2005; Ravichandran et al., 2003), which regard answer extraction as a ranking problem instead of a classification problem. We apply Generalized Iterative Scaling for model parameter estimation and Gaussian Prior for smoothing.</Paragraph>
    <Paragraph position="1"> If expected answer type is unknown during question processing or corresponding type of named entities isn't recognized in candidate sentences, we regard all basic noun phrases as candidate answers. Since a MUC-based NER loses many types of named entities, we have to handle larger candidate answer sets. Orthographic features, similar to (Shen et al., 2005), are extracted to capture word format information of candidate answers, such as capitalizations, digits and lengths, etc. We expect they may help to judge what proper answers look like since most NER systems work on these features.</Paragraph>
    <Paragraph position="2"> Next, we will discuss how to incorporate path correlations. Two facts are considered to affect path weights: question phrase type and path length. For each question, we divide question phrases into four types: target, topic, constraint and verb. Target is a kind of word which indicates the expected answer type of the question, such as &amp;quot;party&amp;quot; in &amp;quot;What party led Australia from 1983 to 1996?&amp;quot;. Topic is the event/person that the question talks about, such as &amp;quot;Australia&amp;quot;. Intuitively, it is the most important phrase of the question. Constraint are the other phrases of the question except topic, such as &amp;quot;1983&amp;quot; and &amp;quot;1996&amp;quot;. Verb is the main verb of the question, such as &amp;quot;lead&amp;quot;. Furthermore, since shorter path indicates closer relation between two phrases, we discount path correlation in long question path by dividing the correlation by the length of the question path. Lastly, we sum the discounted path correlations for each type of question phrases and fire it as a feature, such as &amp;quot;Target Cor=c, where c is the correlation value for question target. ME-based ranking model incorporate the orthographic and path  correlation features to rank candidate answers for each of candidate sentences.</Paragraph>
  </Section>
  <Section position="7" start_page="893" end_page="893" type="metho">
    <SectionTitle>
5 Candidate Answer Re-ranking
</SectionTitle>
    <Paragraph position="0"> After ranking candidate answers, we select the highest ranked one from each candidate sentence.</Paragraph>
    <Paragraph position="1"> In this section, we are to re-rank them according to sentence supportive degree. We assume that a candidate sentence supports an answer if relations between mapped question phrases in the candidate sentence are similar to the corresponding ones in question. Relation paths between any two question phrases are extracted and paired. Then, correlation of each pair is calculated. Re-rank formula is defined as follows:</Paragraph>
    <Paragraph position="3"> where, a is answer ranking score. It is the normalized prediction value of the ME-based ranking model described in Section 4. summationtext</Paragraph>
    <Paragraph position="5"> the sum of correlations of all path pairs. Finally, the answer with the highest score is returned.</Paragraph>
  </Section>
  <Section position="8" start_page="893" end_page="894" type="metho">
    <SectionTitle>
6 Experiments
</SectionTitle>
    <Paragraph position="0"> In this section, we set up experiments on TREC factoid questions and report evaluation results.</Paragraph>
    <Section position="1" start_page="893" end_page="894" type="sub_section">
      <SectionTitle>
6.1 Experiment Setup
</SectionTitle>
      <Paragraph position="0"> The goal of answer extraction is to identify exact answers from given candidate sentence collections for questions. The candidate sentences are regarded as the most relevant sentences to the questions and retrieved by IR techniques. Qualities of the candidate sentences have a strong impact on answer extraction. It is meaningless to evaluate the questions of which none candidate sentences contain proper answer in answer extraction experiment. To our knowledge, most of current QA systems lose about half of questions in sentence retrieval stage. To make more questions evaluated in our experiments, for each of questions, we automatically build a candidate sentence set from TREC judgements rather than use sentence retrieval output.</Paragraph>
      <Paragraph position="1"> We use TREC99-03 questions for training and TREC04 questions for testing. As to build training data, we retrieve all of the sentences which contain proper answers from relevant documents according to TREC judgements and answer patterns.</Paragraph>
      <Paragraph position="2"> Then, We manually check the sentences and remove those in which answers cannot be supported.</Paragraph>
      <Paragraph position="3"> As to build candidate sentence sets for testing, we retrieve all of the sentences from relevant documents in judgements and keep those which contain at least one question key word. Therefore, each question has at least one proper candidate sentence which contains proper answer in its candidate sentence set.</Paragraph>
      <Paragraph position="4"> There are 230 factoid questions (27 NIL questions) in TREC04. NIL questions are excluded from our test set because TREC doesn't supply relevant documents and answer patterns for them.</Paragraph>
      <Paragraph position="5"> Therefore, we will evaluate 203 TREC04 questions. Five answer extraction methods are evaluated for comparison: * Density: Density-based method is used as baseline, in which we choose candidate answer with the shortest surface distance to question phrases.</Paragraph>
      <Paragraph position="6"> * SynPattern: Syntactic relation patterns (Shen et al., 2005) are automatically extracted from training set and are partially matched using tree kernel.</Paragraph>
      <Paragraph position="7"> * StrictMatch: Strict relation matching follows the assumption in (Tanev et al., 2004; Wu et al., 2005). We implement it by adapting relation correlation score. In stead of learning relation correlations during training, we predefine them as: Cor(r1,r2) = 1 if</Paragraph>
      <Paragraph position="9"> * ApprMatch: Approximate relation matching (Cui et al., 2004) aligns two relation paths using fuzzy matching and ranks candidates according to the sum of all path similarities.</Paragraph>
      <Paragraph position="10"> * CorME: It is the method proposed in this paper. Different from ApprMatch, ME-based ranking model is implemented to incorporate path correlations which assigns different weights for different paths respectively. Furthermore, phrase mapping score is incorporated into the path correlation measure.</Paragraph>
      <Paragraph position="11"> These methods are briefly described in Section 2. Performance is evaluated with Mean Reciprocal Rank (MRR). Furthermore, we list percentages of questions correctly answered in terms of top 5 answers and top 1 answer returned respectively. No answer validations are used to adjust answers.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>