<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-2012">
  <Title>Answer Validation by Keyword Association</Title>
  <Section position="4" start_page="1" end_page="1" type="metho">
    <SectionTitle>
FA(X,Y) = hits(X ∪ {Y}) / hits(X)
BA(X,Y) = hits(X ∪ {Y}) / hits({Y})
</SectionTitle>
    <Paragraph position="0"> Note that when X is fixed, FA(X,Y) is proportional to hits(X ∪ {Y}).</Paragraph>
    <Paragraph position="1"> Let us return to Q3. In this case, the choice with the maximum BA is correct. Some questions can be solved by referring to FA, while others can be solved only by referring to BA.</Paragraph>
    <Paragraph position="2"> Therefore, a mechanism that switches between FA and BA is needed.</Paragraph>
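    <Paragraph> The two definitions can be illustrated with a minimal Python sketch. The hit counts and choice labels are hypothetical values chosen for illustration, not figures from this paper; the helper names and the 0.0 fallback for a zero denominator are likewise assumptions.
```python
def forward_association(joint_hits, keyword_hits):
    """FA(X,Y) = hits(X ∪ {Y}) / hits(X); returns 0.0 when hits(X) is 0 (assumption)."""
    return joint_hits / keyword_hits if keyword_hits else 0.0


def backward_association(joint_hits, choice_hits):
    """BA(X,Y) = hits(X ∪ {Y}) / hits({Y}); returns 0.0 when hits({Y}) is 0 (assumption)."""
    return joint_hits / choice_hits if choice_hits else 0.0


# Hypothetical counts: hits(X) for one keyword set X, and for each choice Y
# the pair (hits(X ∪ {Y}), hits({Y})).
hits_x = 1000
counts = {"A": (50, 200), "B": (40, 50), "C": (10, 1000)}

fa = {y: forward_association(joint, hits_x) for y, (joint, _) in counts.items()}
ba = {y: backward_association(joint, own) for y, (joint, own) in counts.items()}

best_by_fa = max(fa, key=fa.get)  # choice with the maximum FA
best_by_ba = max(ba, key=ba.get)  # choice with the maximum BA
print(best_by_fa, best_by_ba)
```
With these toy counts FA and BA prefer different choices, which is the situation that calls for a switching mechanism.</Paragraph>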
    <Section position="1" start_page="1" end_page="1" type="sub_section">
      <SectionTitle>
2.4 Summary
</SectionTitle>
      <Paragraph position="0"> Based on the observations in Sections 2.1 to 2.3, answer validation based on keyword association must address the following three questions.
 * How to select appropriate keywords from a question sentence.</Paragraph>
      <Paragraph position="1"> * How to identify the correct answer considering forward and/or backward association.
 * How many questions can be solved by this strategy based on keyword association.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="1" end_page="2" type="metho">
    <SectionTitle>
3 Keyword Selection
</SectionTitle>
    <Paragraph position="0"> This section describes two methods for selecting appropriate keywords from a question sentence: one is based on the features of each word, and the other is based on the hits of a search engine.</Paragraph>
    <Paragraph position="1"> First, all the nouns are extracted from the question sentence using the Japanese morphological analyzer JUMAN (Kurohashi and Nagao, 1999) and the Japanese parser KNP (Kurohashi, 1998). When a sequence of nouns constitutes a compound, only the longest compound is extracted, and its constituent nouns are not extracted separately. Let N denote the set of extracted nouns and compounds, from which keywords are selected. In the following, the search engine &quot;goo&quot; is used to obtain the number of hits.</Paragraph>
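    <Paragraph> The longest-compound rule can be sketched as follows. This is only an illustration under stated assumptions: the input is a list of (surface, part-of-speech) pairs with a hypothetical "noun" tag, which is not the actual output format of JUMAN or KNP, and a compound is approximated as a maximal run of adjacent nouns.
```python
from itertools import groupby


def extract_noun_candidates(tokens):
    """Merge each maximal run of consecutive nouns into one candidate string.

    tokens: list of (surface, pos) pairs; the "noun" tag is a hypothetical label.
    Constituent nouns of a run are not emitted separately.
    """
    candidates = []
    for is_noun, run in groupby(tokens, key=lambda token: token[1] == "noun"):
        if is_noun:
            candidates.append("".join(surface for surface, _ in run))
    return candidates


# Toy example: two adjacent nouns form one compound, followed by a particle and a noun.
tokens = [("東京", "noun"), ("大学", "noun"), ("の", "particle"), ("学長", "noun")]
print(extract_noun_candidates(tokens))
```
The resulting list plays the role of the candidate set N in this sketch.</Paragraph>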
    <Section position="1" start_page="2" end_page="2" type="sub_section">
      <SectionTitle>
3.1 Keyword Selection Based on Word Features
</SectionTitle>
      <Paragraph position="0"> In this method, keywords are selected by the following procedure:  1. If the question sentence contains n quotations enclosed in quotation marks, those n quoted strings are selected as keywords.</Paragraph>
    </Section>
  </Section>
  <Section position="6" start_page="2" end_page="3" type="metho">
    <SectionTitle>
2. Otherwise:
</SectionTitle>
    <Paragraph position="0"> 2-1. According to the rules for word weights in Table 4, weights are assigned to each element of the keyword candidate set N.</Paragraph>
    <Paragraph position="1"> 2-2. Select the keyword candidate with the maximum weight and that with the second maximum weight.</Paragraph>
    <Paragraph position="2"> 2-3. i. If the AND search of those two keyword candidates returns 15 or more hits, both are selected as keywords. ii. Otherwise, select the one with the maximum weight.</Paragraph>
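    <Paragraph> The selection procedure (steps 1 and 2-1 to 2-3) can be sketched as below. The function and parameter names are hypothetical; the weights are assumed to have been computed already from the rules of Table 4, and and_hits stands in for an AND query to the search engine.
```python
def select_keywords(quoted, weights, and_hits, min_hits=15):
    """Select keywords for one question.

    quoted:   list of quoted strings found in the question (may be empty)
    weights:  dict mapping each candidate in N to its weight (assumed precomputed)
    and_hits: callable (a, b) -> number of hits of the AND search (hypothetical stand-in)
    """
    if quoted:  # step 1: quoted strings are used directly
        return list(quoted)
    # step 2-2: rank the candidates by weight, highest first
    ranked = sorted(weights, key=weights.get, reverse=True)
    # step 2-3 i: keep both top candidates if their AND search has enough hits
    if len(ranked) >= 2 and and_hits(ranked[0], ranked[1]) >= min_hits:
        return ranked[:2]
    # step 2-3 ii: otherwise keep only the candidate with the maximum weight
    return ranked[:1]


weights = {"a": 1.2, "b": 0.5, "c": 0.9}
print(select_keywords([], weights, lambda x, y: 20))
print(select_keywords([], weights, lambda x, y: 3))
```
Passing the hit function as a parameter keeps the sketch independent of any particular search engine interface.</Paragraph>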
    <Paragraph position="3"> Let k denote the set of selected keywords (k ⊆ N). We examine the correctness of k as follows. Let c denote a choice, c  Against the development set, which is introduced in Section 6.1, the correct rate of the keywords selected by the procedure above is 84.5%.</Paragraph>
    <Paragraph position="4">  and consists of one character : 0.9
 marked by a topic marker and name of a job : 0.1
 hits &gt; 100000 : 0.2
 hits &lt; 10000 : 1.1
 number of characters = 1 : 0.2
 number of characters = 2 : 0.25
 number of characters = 3 : 0.5
 number of characters = 4 : 1.1
 number of characters ≥ 5 : 1.2</Paragraph>
    <Section position="1" start_page="2" end_page="3" type="sub_section">
      <SectionTitle>
3.2 Keyword Selection Based on Hits of Search Engine
</SectionTitle>
      <Paragraph position="0"> First, we introduce several basic methods for selecting keywords based on the hits of a search engine. Let 2^N denote the power set of N, where a set of keywords k is an element of 2^N.</Paragraph>
      <Paragraph position="2"> Let ^k denote the selected set of keywords and ^c the selected choice.</Paragraph>
      <Paragraph position="3"> The first method is to simply select the pair &lt;^k,^c&gt; that gives the maximum hits, as below:</Paragraph>
      <Paragraph position="5"> Against the development set, the correct rate of the choice which is selected by this method is 35.7%.</Paragraph>
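      <Paragraph> A minimal sketch of this kind of exhaustive selection is given below. It enumerates the non-empty subsets of N (excluding the empty set is an assumption of the sketch) and returns the pair of keyword set and choice that maximizes a scoring function. The toy hit table and all names are hypothetical.
```python
from itertools import combinations


def nonempty_subsets(nouns):
    """Yield every non-empty subset of the candidate set N as a frozenset."""
    items = sorted(nouns)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def select_by_max(nouns, choices, score):
    """Return the (keyword set, choice) pair with the maximum score(k, c)."""
    pairs = ((k, c) for k in nonempty_subsets(nouns) for c in choices)
    return max(pairs, key=lambda pair: score(pair[0], pair[1]))


# Hypothetical joint hit counts hits(k ∪ {c}); missing pairs count as 0.
toy_hits = {
    (frozenset({"a"}), "x"): 120,
    (frozenset({"a"}), "y"): 300,
    (frozenset({"b"}), "x"): 80,
    (frozenset({"a", "b"}), "x"): 40,
}
best_k, best_c = select_by_max({"a", "b"}, ["x", "y"],
                               lambda k, c: toy_hits.get((k, c), 0))
print(sorted(best_k), best_c)
```
The scoring function is a parameter, so the same search can be run with an association measure in place of raw hits.</Paragraph>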
      <Paragraph position="6"> In a similar way, another method which selects the maximum FA or BA can be given as below:</Paragraph>
      <Paragraph position="8"> Their correct rates are 71.3% and 36.1%, respectively.  Next, we introduce more sophisticated methods which use the ratio of the maximum and second maximum associations, such as FA or BA. The underlying assumption of those methods is that the larger the maximum association is relative to the second maximum, the more reliable is the selected choice with the maximum FA/BA. First, we introduce two methods: FA ratio and BA ratio.</Paragraph>
      <Paragraph position="9"> FA ratio: This is the ratio of the FA of the choice with the second maximum FA over that of the choice with the maximum FA. The FA ratio is calculated by the following procedure.</Paragraph>
      <Paragraph position="10"> 1. Select the choices with maximum FA and second maximum FA.</Paragraph>
      <Paragraph position="11"> 2. Estimate the correctness of the choice with maximum FA by the ratio of their FAs.</Paragraph>
      <Paragraph position="13"> is defined as a function which selects c with second maximum value. Similarly, the method based on BA ratio is given as below:  Unlike the methods based on FA ratio and BA ratio, the following two methods consider both FA and BA. The motivation of those two methods is to regard the decision by FA and BA to be reliable if FA and BA agree on selecting the choice.</Paragraph>
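      <Paragraph> The ratios can be sketched for a single keyword set as follows. The association values are hypothetical. The third call reflects one reading of a ratio that combines both measures (rank the choices by FA, then divide the BA of the second-ranked choice by the BA of the top-ranked one); that reading is an assumption of this sketch, since the defining equations are not reproduced here. The sketch also assumes at least two choices and a non-zero denominator.
```python
def ranked_choices(scores):
    """Choices sorted by descending association score."""
    return sorted(scores, key=scores.get, reverse=True)


def association_ratio(rank_by, measure):
    """Return (top choice, measure[second] / measure[top]), ranking by rank_by.

    Assumes at least two choices and measure[top] != 0.
    """
    best, second = ranked_choices(rank_by)[:2]
    return best, measure[second] / measure[best]


# Hypothetical FA and BA values of four choices for one keyword set.
fa = {"A": 0.5, "B": 0.125, "C": 0.0625, "D": 0.03125}
ba = {"A": 0.25, "B": 0.5, "C": 0.125, "D": 0.0625}

fa_ratio = association_ratio(fa, fa)  # second maximum FA over maximum FA
ba_ratio = association_ratio(ba, ba)  # second maximum BA over maximum BA
cross = association_ratio(fa, ba)     # assumed reading: BA of the two top-FA choices
print(fa_ratio, ba_ratio, cross)
```
In this toy example the combined ratio exceeds 1 because the choice ranked first by FA has a lower BA than the choice ranked second.</Paragraph>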
      <Paragraph position="15"> Coverages and precisions of these four methods against the development set are shown in Table 5. Coverage is measured as the rate of questions for which the ratio is less than or equal to  . Precisions are measured as the rate of questions for which the selected choice ^c is the correct answer, over those covered questions. The method having the greatest precision is BA ratio with maximum and second maximum FA.</Paragraph>
      <Paragraph position="16"> In the following sections, we use this ratio as the keyword association ratio. Table 6 further examines the correlation between the range of the ratio and the coverage/precision. When the ratio is less than or equal to 0.25, about 60% of the questions are solved with a precision close to 90%. This threshold of 0.25 is used in Section 5 when integrating the keyword association ratio and word weights.</Paragraph>
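      <Paragraph> Coverage and precision at a given threshold can be computed as in the following sketch. The per-question results are hypothetical toy data, and the function name is an assumption.
```python
def coverage_and_precision(results, threshold=0.25):
    """Compute coverage and precision for one ratio threshold.

    results: list of (ratio, is_correct) pairs, one per question.
    Coverage is the share of questions whose ratio does not exceed the threshold;
    precision is the share of correct answers among those covered questions.
    """
    covered = [is_correct for ratio, is_correct in results if threshold >= ratio]
    coverage = len(covered) / len(results)
    precision = sum(covered) / len(covered) if covered else 0.0
    return coverage, precision


# Hypothetical results for five questions.
results = [(0.1, True), (0.2, True), (0.25, False), (0.6, True), (1.3, False)]
coverage, precision = coverage_and_precision(results)
print(coverage, precision)
```
Sweeping the threshold over several values with this function gives the kind of coverage/precision trade-off that a table over ratio ranges summarizes.</Paragraph>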
    </Section>
  </Section>
  <Section position="7" start_page="3" end_page="4" type="metho">
    <SectionTitle>
4 Answer Selection
</SectionTitle>
    <Paragraph position="0"> In this section, we explain a method to identify the correct answer considering forward and/or backward association. After selecting keywords, the following numbers are obtained by a search engine.</Paragraph>
    <Paragraph position="1">  For the ratios considering both FA and BA, a ratio greater than 1 means that FA and BA disagree on selecting the choice.</Paragraph>
    <Paragraph position="2">  Then for each choice Y, FA and BA are calculated. As introduced in Section 3, c  development set, when the keywords are selected based on word weights in Section 3.1. In the table, in addition to the results of the answer selection rules above, the results of the baselines that select the choice with maximum FA or BA are also shown. It is clear from the table that the answer selection rules described here significantly outperform those baselines. For each of the answer selection rules, Table 8 shows its precision. In the development set, there are 541 questions (about 60%) where  (k) are identical, and 88.5% of the selected choices are correct. (Four questions are excluded because the hits of the conjunctive query, hits(X ∪ {Y}), were 0.) This result shows that more than half of the questions  these questions can be solved. This result shows that whether FA and BA agree or not is crucial for reliably selecting the answer.</Paragraph>
  </Section>
  <Section position="8" start_page="4" end_page="4" type="metho">
    <SectionTitle>
5 Total Procedure of Keyword
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="4" end_page="4" type="sub_section">
      <SectionTitle>
Selection and Answer Selection
</SectionTitle>
      <Paragraph position="0"> Finally, the procedures of keyword selection and answer selection presented in the previous sections are integrated as given below:  1. If ratio ≤ 0.25:  Use the set of keywords selected by BA ratio with maximum and second maximum FA. The choice to be selected is the one with maximum BA.</Paragraph>
    </Section>
  </Section>
  <Section position="9" start_page="4" end_page="4" type="metho">
    <SectionTitle>
2. Otherwise:
</SectionTitle>
    <Paragraph position="0"> Use the set of keywords selected by word weights. Answer selection is done by the procedure of Section 4.</Paragraph>
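    <Paragraph> The integrated procedure amounts to a two-way switch, sketched below. All names are hypothetical: ba_scores stands in for computing BA for every choice given a keyword set, and section4_select stands in for the answer selection rules of Section 4, which are passed in rather than implemented here.
```python
def answer_question(ratio, ratio_keywords, weight_keywords,
                    ba_scores, section4_select, threshold=0.25):
    """Choose an answer for one question.

    ratio:           keyword association ratio of the question
    ratio_keywords:  keyword set selected by the ratio-based method
    weight_keywords: keyword set selected by word weights
    ba_scores:       callable keywords -> dict of choice -> BA (hypothetical stand-in)
    section4_select: callable keywords -> choice (hypothetical stand-in)
    """
    if threshold >= ratio:  # case 1: the ratio is at most 0.25
        scores = ba_scores(ratio_keywords)
        return max(scores, key=scores.get)  # choice with the maximum BA
    return section4_select(weight_keywords)  # case 2: fall back to word weights


# Toy stand-ins for the two components.
toy_ba = lambda keywords: {"A": 0.1, "B": 0.7, "C": 0.2, "D": 0.05}
toy_rules = lambda keywords: "C"
print(answer_question(0.2, ["k1"], ["k2"], toy_ba, toy_rules))
print(answer_question(0.8, ["k1"], ["k2"], toy_ba, toy_rules))
```
Injecting the two components as callables keeps the switch itself separate from how BA is obtained and how the Section 4 rules are applied.</Paragraph>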
  </Section>
  <Section position="10" start_page="4" end_page="5" type="metho">
    <SectionTitle>
6 Evaluation
</SectionTitle>
    <Paragraph position="0"/>
    <Section position="1" start_page="4" end_page="5" type="sub_section">
      <SectionTitle>
6.1 Data Set
</SectionTitle>
      <Paragraph position="0"> In this research, we used the card game version of &quot;Who wants to be a millionaire&quot;, which is sold by Tomy Company, LTD. It has 1960 questions, which are classified into fifteen levels according to the amount of prize money. Each question has four choices.</Paragraph>
      <Paragraph position="1"> All questions are written in Japanese. The following are a few examples.</Paragraph>
      <Paragraph position="2">  We divide the questions of each level into two halves: the first is used as the development set and the second as the test set. We exclude questions with superlative expressions (e.g., &quot;Out of the following four countries, select the one with the maximum number of states.&quot;) or negation (e.g., &quot;Out of the following four colors, which is not used in the national flag of France?&quot;) because they are not suitable for solving by keyword association. Consequently, the development set comprises 888 questions, while the test set comprises 906 questions. The number of questions per prize money amount is shown in Table 9.</Paragraph>
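      <Paragraph> The per-level split can be sketched as follows. The data layout (a dict from level to an ordered list of questions) is hypothetical, and placing the extra question of an odd-sized level in the second half is an assumption of this sketch. Filtering of superlative and negation questions is not shown.
```python
def split_by_level(questions_by_level):
    """Split each level's questions into a first half and a second half.

    questions_by_level: dict mapping a level to an ordered list of questions.
    Returns (development set, test set) as two lists.
    """
    development, test = [], []
    for level in sorted(questions_by_level):
        questions = questions_by_level[level]
        half = len(questions) // 2  # odd sizes: extra question goes to the test half
        development.extend(questions[:half])
        test.extend(questions[half:])
    return development, test


# Toy data: two levels with three and two questions.
toy_questions = {1: ["q1", "q2", "q3"], 2: ["q4", "q5"]}
development, test = split_by_level(toy_questions)
print(development, test)
```
Splitting within each level keeps the distribution of prize-money levels similar in both sets.</Paragraph>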
      <Paragraph position="3">  We compare the questions of &quot;Who wants to be a millionaire&quot; with those of the TREC 2003 QA track and those of the NTCIR4 QAC2 task. The questions of &quot;Who wants to be a millionaire&quot; are all classified as factoid questions. They correspond to the factoid component of the TREC 2003 QA track. The questions of NTCIR4 QAC2 are also all classified as factoid questions. We compare the bunsetsu count of the questions of &quot;Who wants to be a millionaire&quot; with the word count of the questions of the TREC 2003 QA track factoid component and the bunsetsu count of the questions of NTCIR4 QAC2 Subtask1. The questions of &quot;Who wants to be a millionaire&quot; consist of 7.24 bunsetsu on average, while those of the TREC 2003 QA track factoid component consist of 7.76 words on average, and those of NTCIR4 QAC2 Subtask1 consist of 6.19 bunsetsu on average.</Paragraph>
      <Paragraph position="4"> Therefore, it can be concluded that the questions of &quot;Who wants to be a millionaire&quot; are not shorter than those of the TREC 2003 QA track and those of the NTCIR4 QAC2 task.</Paragraph>
    </Section>
  </Section>
</Paper>