<?xml version="1.0" standalone="yes"?>
<Paper uid="W05-1625">
  <Title>Answer Generation with Temporal Data Integration</Title>
  <Section position="4" start_page="0" end_page="0" type="metho">
    <SectionTitle>
2.1 Related works
</SectionTitle>
    <Paragraph position="0"> Most existing systems on the web produce a set of answers to a question in the form of hyperlinks or page extracts, ranked according to a relevance score (for example, COGEX [Moldovan et al., 2003]). Other systems also define relationships between web page extracts or texts containing possible answers ([Harabagiu et al., 2004], [Radev et al., 1998]).</Paragraph>
    <Paragraph position="1"> For example, [Webber et al., 2002] defines 4 relationships between possible answers:
      - equivalence: equivalent answers, which mutually entail each other,
      - inclusion: one-way entailment of answers,
      - aggregation: answers that are mutually consistent but not entailing, and that can be replaced by their conjunction,
      - alternative: answers that are inconsistent or alternatives, and that can be replaced by their disjunction.</Paragraph>
    <Paragraph position="2"> Most question-answering systems generate answers which take into account neither the information given by all candidate answers nor their inconsistencies. This is the point we focus on in the following section.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
2.2 A general typology of integration mechanisms
</SectionTitle>
      <Paragraph position="0"> To better characterise our problem, we collected, via Google or QRISTAL [QRISTAL], a corpus of around 100 question-answer pairs in French that reflect different inconsistency problems. We first assume that all candidate answers are potentially correct. The corpus analysis enables us to define a general typology of relations between answers. For each relation defined in [Webber et al., 2002], we identify integration mechanisms in order to generate answers which take into account characteristics of all candidate answers.</Paragraph>
      <Paragraph position="1"> Inclusion: The inclusion relation exists if a candidate answer entails another answer (for example, between concepts of candidate answers linked in an ontology by the is-a or part-of relations). For example, "in Brittany" and "in France" are both correct answers to the question "Where is Brest?", and Brittany is a part of France. The content determination stage consists here in choosing which answer will be proposed to the user: the most specific, the most generic, or all of them. This choice can be guided by a user model that takes the user's knowledge into account.</Paragraph>
      <Paragraph position="2"> Equivalence: Candidate answers which are linked by an equivalence relation are consistent and mutually entail each other. The corpus analysis allows us to identify two main types of equivalence: (1) Lexical equivalence: synonymy, metonymy, paraphrases, proportional series, use of acronyms or foreign languages. For example, to the question "Who killed John Lennon?", Mark Chapman, the murderer of John Lennon and John Lennon's killer Mark Chapman are equivalent answers.</Paragraph>
      <Paragraph position="3"> (2) Equivalence with inference: in a number of cases, some common knowledge, inferences or calculations are necessary to detect equivalence relations. For example, "The A320 is 21" and "The A320 was created in 1984" are equivalent answers to the question "How old is the Airbus A320?".</Paragraph>
      <Paragraph position="4"> Aggregation: The aggregation relation defines a set of consistent answers when the question accepts several different ones. In this case, all candidate answers are potentially correct and can be integrated in the form of a conjunction of all these answers. For example, an answer to the question "Where is Disneyland?" can be "in Tokyo, Paris, Hong Kong and Los Angeles".</Paragraph>
      <Paragraph position="5"> If answers are numerical values, the integrated answer can be given in the form of an interval, average or comparison.</Paragraph>
      <Paragraph position="6"> Alternative: The alternative relation defines a set of inconsistent answers. In the case of questions expecting a unique answer, only one answer among the candidates is correct. Otherwise, all candidates may be correct answers.</Paragraph>
      <Paragraph position="7"> (1) A simple solution is to propose a disjunction of candidate answers. For example, if the question "When does autumn begin?" has the candidate answers "Autumn begins on September 21st" and "Autumn begins on September 20th", an answer such as "Autumn begins on either September 20th or September 21st" can be proposed.</Paragraph>
      <Paragraph position="8"> (2) If candidate answers have common characteristics, it is possible to integrate them according to these characteristics. For example, the question "When does the French music festival take place?" has the following answers: June 21st 1982, June 21st 1983, ..., June 21st 2004. Here, the extraction engine selects pages containing the dates of all music festivals. These candidate answers have day and month in common.</Paragraph>
      <Paragraph position="9"> Consequently, an answer such as The French music festival takes place every June 21st can be proposed.</Paragraph>
      <Paragraph position="10"> (3) As for the aggregation relation, numerical values can be integrated in the form of an interval, average or comparison. For example, if the question "How far is Paris from Toulouse?" has the candidate answers 713 km, 678 km and 681 km, answers such as "Paris is about 690 km from Toulouse" (average) or "The distance between Paris and Toulouse is between 678 and 713 km" (interval) can be proposed.</Paragraph>
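      <Paragraph> The two integration mechanisms of cases (2) and (3) can be illustrated with a minimal Python sketch. The function names common_date_fields and integrate_numbers are hypothetical, and the sample reuses only the festival dates and distances quoted in the examples above; it is an illustration under these assumptions, not the system's actual implementation.

```python
from datetime import date
from statistics import mean


def common_date_fields(dates):
    """Return the calendar fields (year, month, day) shared by all dates."""
    shared = {}
    for field in ("year", "month", "day"):
        values = {getattr(d, field) for d in dates}
        if len(values) == 1:  # every candidate agrees on this field
            shared[field] = values.pop()
    return shared


def integrate_numbers(values):
    """Return the average and the enclosing interval of numerical answers."""
    return {"average": mean(values), "interval": (min(values), max(values))}


# Two of the festival dates quoted in the text (hypothetical sample).
festival = [date(1983, 6, 21), date(2004, 6, 21)]
print(common_date_fields(festival))      # month and day are shared

# Candidate distances from the Paris-Toulouse example, in km.
print(integrate_numbers([713, 678, 681]))
```

The shared fields support a generalisation such as "every June 21st", while the average and the enclosing interval correspond to the two numerical answers proposed in case (3).</Paragraph>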
      <Paragraph position="11"> In the following sections, we focus on the content determination and generation of candidate answers of type date linked by an aggregation or alternative relation, the most common ones.</Paragraph>
    </Section>
  </Section>
  <Section position="5" start_page="0" end_page="0" type="metho">
    <SectionTitle>
3 Content determination
</SectionTitle>
    <Paragraph position="0"> This section addresses content determination when several answers to a question of type date are selected. We consider that candidate answers can be in the form of a date or of a temporal interval. A date is defined as a vector which allows the temporal localisation of an event. Some values of this vector can be underspecified: only the values relevant to the expected information are explicit (year, hour, etc.). An interval is then a pair of dates, i.e. two vectors defining a date of beginning and a date of end.</Paragraph>
    <Paragraph position="1"> As answers selected by the extraction engine are often in different forms (dates, intervals or both), a first step consists in standardizing the data:
      - all candidate answers are put in the form of an interval: a date becomes an interval whose date of beginning and date of end are the same,
      - some candidate answers may be incomplete (for example, the year or the date of end is missing). In some cases, unification with other candidate answers is possible; otherwise, incomplete answers are omitted,
      - from the semantic point of view, all candidate answers must be expressed in the same system of temporal reference (for example, because of possible different time zones).</Paragraph>
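    <Paragraph> The first two standardization steps can be sketched as follows. The function name standardize and the input encoding (a single date, or a pair in which a missing bound is None) are assumptions made for illustration; unification of incomplete answers and conversion to a common time reference are not attempted here.

```python
from datetime import date


def standardize(candidates):
    """Turn raw candidates into closed intervals (begin, end).

    Each candidate is either a single date or a (begin, end) pair in
    which a missing bound is None. Incomplete pairs are dropped.
    """
    intervals = []
    for cand in candidates:
        if isinstance(cand, date):
            intervals.append((cand, cand))   # a date becomes [d, d]
            continue
        begin, end = cand
        if begin is None or end is None:
            continue                         # incomplete answer: omitted
        intervals.append((begin, end))
    return intervals


# Hypothetical raw candidates: a date, an interval, an incomplete interval.
raw = [
    date(1989, 9, 10),
    (date(1989, 9, 10), date(1989, 9, 16)),
    (date(1989, 9, 12), None),
]
print(standardize(raw))
```

After this step every remaining candidate has the same shape, so the selection process only has to deal with intervals.</Paragraph>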
    <Paragraph position="2"> Once all candidate answers have been standardized, aberrant answers are filtered out by applying classical statistical methods. Then, the answer selection process can be applied.</Paragraph>
    <Section position="1" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.1 Answer selection process
</SectionTitle>
      <Paragraph position="0"> Our goal is to select, among several candidate answers, the best answer, defined as the one which is the most coherent with the other answers. For this purpose, we define a coherence rate of answers.</Paragraph>
      <Paragraph position="1"> Let us assume that there are $N$ candidate answers coming from $N$ different web pages. We consider that each candidate answer is a temporal interval $[d_b, d_e]$, where $d_b$ is the date of beginning and $d_e$ the date of end of the event. Let $d_i = [d_{b_i}, d_{e_i}]$, with $1 \le i \le N$, be these $N$ candidate answers. In terms of intervals, we consider that the most coherent answer is the interval which intersects the greatest number of candidate intervals. For example, in Figure 1, we have 3 candidate answers $d_1$, $d_2$ and $d_3$. They form 4 sub-intervals: $[d_{b_1}, d_{b_2}]$, $[d_{b_2}, d_{b_3}]$, $[d_{b_3}, d_{e_3}]$ and $[d_{e_3}, d_{e_1}]$. The interval we consider as the most coherent is $[d_{b_3}, d_{e_3}]$ because its occurrence frequency is 3 (i.e. the number of times it intersects the candidate answers is 3).</Paragraph>
      <Paragraph position="2"> In order to define sub-intervals, we need to have the bounds of the $N$ candidate intervals. Let $B = \{d_{b_j}, d_{e_j}\}$, $1 \le j \le N$, be the set of ordered bounds of the $N$ intervals and let $b_i \in B$, $1 \le i \le 2N$. Consequently, a sub-interval is of the form $[b_i, b_{i+1}]$.</Paragraph>
      <Paragraph position="3"> We now define $f_j$ as the occurrence frequency of the sub-interval $[b_j, b_{j+1}]$, i.e. the number of times this sub-interval intersects the $N$ candidate answers.</Paragraph>
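      <Paragraph> The construction of sub-intervals and the count of their occurrence frequencies can be sketched as below. The function names are hypothetical, dates are encoded as plain day numbers for brevity, and "intersects" is read as "the candidate covers the sub-interval", which is an assumption; candidates reduced to a single point are not handled by this sketch.

```python
def sub_intervals(candidates):
    """Split the time line at every bound of the candidate intervals."""
    bounds = sorted({b for interval in candidates for b in interval})
    return list(zip(bounds, bounds[1:]))


def occurrence_frequency(sub, candidates):
    """Count the candidates that cover the sub-interval (lo, hi)."""
    lo, hi = sub
    return sum(1 for begin, end in candidates if lo >= begin and end >= hi)


# Hypothetical candidates, expressed as day numbers.
candidates = [(1, 10), (3, 10), (5, 8)]
for sub in sub_intervals(candidates):
    print(sub, occurrence_frequency(sub, candidates))
```

With these three nested candidates, the innermost sub-interval is covered by all of them and therefore has the highest occurrence frequency, as in the situation described for Figure 1.</Paragraph>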
      <Paragraph position="5"> Then, a coherence rate $c_i$ is assigned to each sub-interval $[b_i, b_{i+1}]$ on the basis of its occurrence frequency.</Paragraph>
      <Paragraph position="7"> Selecting the interval having the highest coherence rate is not sufficient. The answer must also have a relevant duration. For this purpose, we construct new intervals based on the previous sub-intervals: these new ones must have a relevant duration, at least equal to the average duration of the $N$ candidate answers. Let $d_{avg}$ be the average duration of candidate answers.</Paragraph>
      <Paragraph position="8"> Then, we construct a coherent answer set composed of intervals satisfying a duration constraint, to which we assign a new coherence rate. This new rate is the average of the coherence rates of the sub-intervals composing the new interval. So, the coherent answer set $A$ is defined as:</Paragraph>
      <Paragraph position="10"> Once this coherent answer set has been obtained, it remains to check whether the expected answer/event is a unique or an iterative event. We consider that an event is iterative if a great number of intervals of $A$ are distant in time. Let $\delta$ be the minimum time between the end of an interval and the beginning of the following one. Let $n$ be the minimum number of intervals that have to be $\delta$-distant from the others (the parameters $\delta$ and $n$ depend on data granularity). Then, an event is iterative if:</Paragraph>
      <Paragraph position="12"> At this stage, there are two possibilities:</Paragraph>
      <Paragraph position="14"> - or the event is iterative: there may be some temporal constraints due to the question: for example, the question expects an event in the past or in the future, an event in a particular year, etc. Let $A_c$ be the set of intervals of $A$ satisfying the question constraints. Then, $A^{+}$ is the set of answers/intervals (having the highest coherence rate) which can be proposed to the user:</Paragraph>
      <Paragraph position="16"> In this section, we proposed a method for content determination based on coherence rate in the case of answers of type date and in particular of type interval. In the following section, we apply this method to an example.</Paragraph>
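      <Paragraph> The construction of the coherent answer set described in this section can be sketched as follows. The coherence rates of the sub-intervals are taken as given inputs, and the values used here are hypothetical; the function names are also hypothetical. Only the lower bound on duration (the average duration of the candidates) is enforced, and the new rate of an interval is the plain mean of the rates of its sub-intervals.

```python
def coherent_answer_set(subs, rates, min_duration):
    """Build intervals from runs of contiguous sub-intervals.

    subs  : ordered, contiguous sub-intervals (lo, hi)
    rates : coherence rate of each sub-interval (taken as given)
    Returns (lo, hi, new_rate) for every run lasting at least
    min_duration; new_rate is the mean rate of the run.
    """
    answers = []
    for i in range(len(subs)):
        for j in range(i, len(subs)):
            lo, hi = subs[i][0], subs[j][1]
            if hi - lo >= min_duration:
                run = rates[i:j + 1]
                answers.append((lo, hi, sum(run) / len(run)))
    return answers


def best_answers(answers):
    """Keep the interval(s) with the highest new coherence rate."""
    if not answers:
        return []
    top = max(rate for _, _, rate in answers)
    return [a for a in answers if a[2] == top]


# Hypothetical data: candidates as day numbers, rates chosen by hand.
candidates = [(1, 10), (3, 10), (5, 8)]
subs = [(1, 3), (3, 5), (5, 8), (8, 10)]
rates = [0.125, 0.25, 0.375, 0.25]
avg_duration = sum(end - begin for begin, end in candidates) / len(candidates)
answers = coherent_answer_set(subs, rates, avg_duration)
print(best_answers(answers))
```

In this sample the sub-interval with the highest rate is too short to be kept on its own, so the selected answer is a longer interval that contains it, which is the effect the duration constraint is meant to have.</Paragraph>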
    </Section>
    <Section position="2" start_page="0" end_page="0" type="sub_section">
      <SectionTitle>
3.2 Example
</SectionTitle>
      <Paragraph position="0"> Let us suppose that the question "When did hurricane Hugo take place?" is submitted to a question-answering system. The following table presents the candidate answers:</Paragraph>
      <Paragraph position="2"> Consequently, we have (cf. Figure 2):</Paragraph>
      <Paragraph position="4"> The coherence rates of each sub-interval are:</Paragraph>
      <Paragraph position="6"> The average duration of candidate answers is 5 days. Now, we construct the answer set $A$ with sub-intervals having a duration between 5 and 6 days and we assign to them a new coherence rate. Consequently, the intervals satisfying the average duration are: $A = \{\, [b_1, b_2[,\ [b_1, b_3],\ [b_2, b_4[,\ ]b_3, b_4[,\ \ldots \,\}$</Paragraph>
      <Paragraph position="8"> The event is non-iterative since every interval of $A$ is contiguous to the following one. So, the answer $A^{+}$ is the interval of $A$ having the highest coherence rate, i.e. from September, 10th to 16th 1989.</Paragraph>
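      <Paragraph> The test used to decide that the event is non-iterative can be sketched as below. This is one possible reading of the criterion, stated as an assumption: an event is considered iterative when at least min_count gaps of at least min_gap separate consecutive intervals of the answer set. The function name, the parameter names and the sample values (day numbers) are hypothetical.

```python
def is_iterative(intervals, min_gap, min_count):
    """Return True when at least min_count gaps of min_gap or more
    separate consecutive intervals of the answer set."""
    ordered = sorted(intervals)
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(ordered, ordered[1:])]
    distant = sum(1 for gap in gaps if gap >= min_gap)
    return distant >= min_count


# Contiguous or overlapping intervals: a unique event.
print(is_iterative([(10, 13), (13, 16), (10, 16)], min_gap=30, min_count=1))

# Intervals about one year apart: an iterative event.
print(is_iterative([(0, 1), (365, 366), (730, 731)], min_gap=300, min_count=2))
```

Overlapping intervals produce negative gaps, which never reach the threshold, so an answer set made of contiguous intervals is classified as a unique event.</Paragraph>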
    </Section>
  </Section>
  <Section position="6" start_page="0" end_page="0" type="metho">
    <SectionTitle>
4 Answer generation
</SectionTitle>
    <Paragraph position="0"> Once the most coherent answer has been elaborated, it has to be generated in natural language. Our strategy is to couple classical NLG techniques with generation templates.</Paragraph>
    <Paragraph position="1"> As our framework is the cooperative system WEBCOOP, the answer proposed to the user has to explain why it has been selected. The idea is to introduce possibility degrees that tell the user how confident they can be in the answer.</Paragraph>
    <Paragraph position="2"> For this purpose, we define a certainty degree of answers which depends on several parameters:
      - the number of candidate answers ($N$): if $N$ and the coherence rate of the selected answer are high, then this means that there were not many contradictions among candidate answers and that the answer is more certain (as $N$ is already taken into account in the coherence rate, this rate alone is a sufficient parameter),
      - if the difference $\Delta$ between the best coherence rate and the second best one is high, then this means that the selected answer is more certain.</Paragraph>
    <Paragraph position="3"> Consequently, we define the certainty degree $D_{ij}$ of the answer $[b_i, b_j]$ as: $D_{ij} = c_{ij} \times \Delta$, with $\Delta = c_1 - c_2$, where $c_{ij}$ is the coherence rate of the answer, $c_1$ is the best coherence rate and $c_2$ the second best one. As $0 \le c_{ij} \le 1$ and $0 \le \Delta \le 1$, the more $D_{ij}$ tends towards 1, the more the answer $[b_i, b_j]$ is certain. Thus, we define generation schemas for each type of answer depending on this certainty degree. We distinguish 3 main cases: (1) either $A^{+} = \emptyset$, i.e. no answer has been selected. The idea is to select the candidate answer which has the highest coherence rate even if its duration is not appropriate, but the generated answer has to explain that this answer is not sure, (2) or $D_{ij} = 1$, i.e. the selected answer $[b_i, b_j]$ is certain, (3) or $D_{ij} \neq 1$, then the generated answer has to take into account $\Delta$. If $\Delta$ is low, the coherence rate of the selected answer is very close to other rates: in this case, several answers are potentially correct and can be proposed to the user.</Paragraph>
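    <Paragraph> The certainty degree can be sketched as follows, assuming it is the product of the best coherence rate and its difference to the second best rate; this reading is consistent with the figures given in the examples of this section but remains an assumption. The function name is hypothetical, and a lone answer is assumed to have a second best rate of 0.

```python
def certainty_degree(rates):
    """Certainty degree of the best answer: best rate times its lead.

    rates: coherence rates of the competing answers. With a single
    answer the second best rate is taken to be 0 (an assumption),
    which yields 1 for a lone answer of rate 1.
    """
    ordered = sorted(rates, reverse=True)
    best = ordered[0]
    second = ordered[1] if len(ordered) > 1 else 0.0
    return best * (best - second)


# Coherence rates quoted in the three examples of this section.
print(certainty_degree([1.0]))                # Chomsky: a single answer
print(certainty_degree([0.08, 0.87, 0.04]))   # D. Tutu
print(certainty_degree([0.29, 0.32, 0.33]))   # American Civil War
```

A degree of 1 corresponds to case (2), while the other two values fall under case (3), with a large and a small lead over the second best answer respectively.</Paragraph>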
    <Paragraph position="4"> The idea is to generate answers with different certainty degrees depending on $D$: we choose to express this degree by the use of adverbs. For this purpose, we define a lexicalisation function lex which lexicalises the selected answers and a function lexD which lexicalises $D$. Table 1 presents the different generation schemas ($A$ is the selected answer and $A'$ the answer having the coherence rate the closest to $D_A$). Underlined fragments are predefined texts.</Paragraph>
    <Paragraph position="5"> if $A$ is a date: subject lexD($D_A$, ) verb lex($A$, Reg) or lex($A'$, Reg); if $A$ is an interval: subject lexD($D_{A'}$, ) verb lex($A'$, Reg) but lexD($D_{A'}$, plus) lex($A$, Reg). Adverb intensity is represented by a proportional series (cf. Figure 3). Consequently, if $D$ is high, it will be lexicalised by an adverb of high intensity.</Paragraph>
    <Paragraph position="7"> The second argument of the function lexD requests an adverb of lower or higher intensity than the one that would have been used normally (cases (1) and (3)).</Paragraph>
    <Paragraph position="8"> The lex function has 2 arguments: the answers that have to be generated and Reg, indicating whether the event is regular or not. Indeed, if an iterative event is regular, i.e. happens at regular intervals (i.e. the parameter $\delta$ is always the same for all answers of $A$), then a generalisation can be made on common characteristics. For example, if $\delta$ = 1 year, a possible generalisation is: X takes place every year on ....</Paragraph>
    <Paragraph position="9"> Example 1 To the question "When was Chomsky born?", the only potential answer and its respective coherence rate is ([07-12-1928, 07-12-1928], 1). Its certainty degree is: $D = 1$.</Paragraph>
    <Paragraph position="10"> We are in case (2) so the generated answer is in the form: subject verb lex(A, Reg).</Paragraph>
    <Paragraph position="11"> The answer is not a regular event. Consequently, the answer in natural language is: Chomsky was born on December, 7th 1928.</Paragraph>
    <Paragraph position="12"> Example 2 To the question "In which year did D. Tutu receive the Nobel Peace Prize?", the potential answers and their respective coherence rates are: (1931, 0.08), (1984, 0.87) and (1986, 0.04). The answer (1984, 0.87) is selected because it has the highest coherence rate and its certainty degree is: $D = (0.87 - 0.08) \times 0.87 \approx 0.69$. We are in case (3) with a high $\Delta$ ($0.87 - 0.08$) so the generated answer is in the form: subject lexD($D_A$, ) verb lex($A$, Reg).</Paragraph>
    <Paragraph position="13"> The answer is not a regular event and its certainty degree is high so the adverb intensity has to be high. Consequently, the answer in natural language is: D. Tutu probably received the Nobel Peace Prize in 1984. Example 3 To the question "When did the American Civil War take place?", the potential answers and their respective coherence rates are: - ([01-01-1861, 09-04-1865], 0.29), - ([12-04-1861, 09-04-1865], 0.32), - ([17-04-1861, 09-04-1865], 0.33).</Paragraph>
    <Paragraph position="14"> The answer ([17-04-1861, 09-04-1865], 0.33) is selected because it has the highest coherence rate and its certainty degree is: $D = (0.33 - 0.32) \times 0.33 \approx 0.003$. We are in case (3) with a low $\Delta$ ($0.33 - 0.32$) and the answer is an interval so the generated answer is in the form: subject lexD($D_{A'}$, ) verb lex($A'$, Reg) but lexD($D_{A'}$, plus) lex($A$, Reg), with $A'$ = [01-01-1861, 09-04-1865] (since all other answers have a quasi-similar coherence rate, $A'$ is the interval including all the others). The answer is not a regular event and its certainty degree is very low so the adverb intensity has to be very low. Consequently, the answer in natural language is: The American Civil War possibly took place from 1861 to April, 9th 1865 but most possibly from April, 17th 1861 to April, 9th 1865.</Paragraph>
    <Paragraph position="15"> In this paper, we did not detail the lexicalisation of dates, but classical lexicalisation and aggregation techniques are applied, for example to group common characteristics (from September, 10th to 22nd instead of from September, 10th to September, 22nd, etc.).</Paragraph>
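    <Paragraph> The grouping of common characteristics during the lexicalisation of dates can be sketched as below. The functions ordinal and lex_interval are hypothetical helpers written for illustration; they only imitate the surface form of the answers shown in this paper and leave out the Reg argument of the lex function.

```python
from datetime import date

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]


def ordinal(day):
    """English ordinal of a day number, e.g. 1st, 22nd, 13th."""
    if day % 100 in (11, 12, 13):
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")
    return f"{day}{suffix}"


def lex_interval(begin, end):
    """Lexicalise an interval, grouping a shared month and year."""
    first = f"{MONTHS[begin.month - 1]}, {ordinal(begin.day)}"
    if begin == end:
        return f"on {first} {begin.year}"
    if (begin.year, begin.month) == (end.year, end.month):
        # Month and year are stated once for both bounds.
        return f"from {first} to {ordinal(end.day)} {begin.year}"
    last = f"{MONTHS[end.month - 1]}, {ordinal(end.day)}"
    return f"from {first} {begin.year} to {last} {end.year}"


print(lex_interval(date(1989, 9, 10), date(1989, 9, 16)))
print(lex_interval(date(1861, 4, 17), date(1865, 4, 9)))
print(lex_interval(date(1928, 12, 7), date(1928, 12, 7)))
```

When both bounds fall in the same month, the month and year are produced only once; otherwise each bound is lexicalised in full.</Paragraph>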
  </Section>
</Paper>