<?xml version="1.0" standalone="yes"?> <Paper uid="W04-2508"> <Title>Experiments with Interactive Question Answering in Complex Scenarios</Title> <Section position="14" start_page="0" end_page="0" type="evalu"> <SectionTitle> 4.2 Results Produced by Experts </SectionTitle> <Paragraph position="0"> In this section, we present work from a pilot study that examined how intelligence analysts performed question decompositions for domains within their areas of expertise.</Paragraph> <Paragraph position="1"> After presenting a case study comparing their individual decompositions of a domain, we identify five different decomposition strategies employed by the analysts.</Paragraph> <Paragraph position="2"> In order to obtain more high-quality data for analysis, we invited three intelligence analysts from the Naval Reserve to LCC for three days of study. We were interested in (1) determining whether users of a specific level of expertise performed decompositions of complex questions in a similar fashion and (2) identifying possible patterns in their research styles that could be used in the development of automatic question decomposition strategies.</Paragraph> <Paragraph position="3"> Analysts participated in three tasks. For the first task, analysts were asked to create short outlines (dubbed &quot;skeleton reports&quot;) of answers to complex questions using only publicly-available web-based resources. For the second and third tasks, analysts were asked to provide decompositions of complex questions. In the second task, analysts were asked to list the questions that they anticipated they would answer prior to starting their research; in the third task, analysts decomposed questions without any other special instructions. 
For purposes of comparison, an LCC developer also participated in the decomposition tasks.</Paragraph> <Paragraph position="4"> Despite their similar levels of training and expertise, the differences in the analysts' individual styles were striking. When asked to decompose the &quot;Iraqi bioweapons&quot; scenario presented in Figures 5, 6, and 7, analysts produced questions that demonstrated broad differences in their interpretation of the scenario itself.</Paragraph> <Paragraph position="5"> Analyst 1. Analyst 1's decomposition focused on four specific aspects of Iraq's bioweapons program: (1) the history of the bioweapons program, (2) the alleged products of the program, (3) the personnel involved with the creation of the program, and (4) the potential locations for the program. Although this analyst's 10 questions were well-balanced, his decomposition centered on the nature of the program itself and provided no potential for an explanation of why the weapons were difficult to find.</Paragraph> <Paragraph position="6"> Analyst 2. Analyst 2's decomposition questioned many of the implicit assumptions set forth in the topic question itself. Instead of providing subquestions that could have led to potential answers for this topic question, his decomposition suggested that he had rejected the propositions that the topic question was based on. In his 18 subquestions, he questioned the presuppositions of the scenario itself, generating subquestions such as Is it the case that the UN inspectors are really being denied &quot;complete access&quot;? and Can we be sure that Iraq had bioweapons at any point in the past?</Paragraph> <Paragraph position="7"> Analyst 3. Analyst 3's decomposition focused on the reasons he believed were responsible for the difficulty in finding Iraq's bioweapons. In his 17 questions, he discussed three real hypotheses: (1) the weapons do not exist, (2) the weapons are well hidden in Iraq, and (3) the weapons have been moved outside of Iraq. 
After identifying these three hypotheses, Analyst 3 asked a variety of subquestions that gathered evidence for (or against) each of these three possibilities.</Paragraph> <Paragraph position="8"> Although the analysts produced roughly similar numbers of subquestions, there was little overlap in the content that they covered. This suggests that even expert analysts differ markedly in their expectations of what constitutes the informational goal of a complex information-seeking scenario.</Paragraph> <Paragraph position="9"> In examining the decompositions produced by the analysts, we discovered that the analysts employed a number of distinct decomposition strategies. Four of them are discussed below: Syntactic Decomposition. Analysts split questions that featured syntactic coordination into subquestions that contained each of the individual conjuncts. For example, a question like How do we know that the UN has not found any biological or chemical weapons? was decomposed as How do we know that the UN has not found any biological weapons? and How do we know that the UN has not found any chemical weapons? In the data we collected, we found only examples of adjective phrase and noun phrase conjunction; we expect analysts to decompose examples of sentence or verb phrase coordination in a similar fashion.</Paragraph> <Paragraph position="10"> Entity Motivations. Analysts asked questions about an entity's political or economic motives if the topic question involved a predicate that implied that the entity had volitional control over its actions. For example, a topic question like Why does China dispute Taiwan's independence? was decomposed into questions like What are China's economic motives for disputing Taiwan's independence? or What are China's political motives for disputing Taiwan's independence?</Paragraph> <Paragraph position="11"> State Discovery. 
When faced with a question about the existence of a property or past state, analysts generated decompositions that contrasted its previous status with its current status. For example, questions like What type of nuclear assistance did China give to the Middle East between 1980 and 1990? were routinely decomposed into questions of the form How does the nuclear assistance given by China to the Middle East from 1980 to 1990 compare to nuclear assistance it provides to the Middle East today?</Paragraph> <Paragraph position="12"> These subquestions took three forms. Analysts wanted to know: (1) how the situation in the past differs from the present situation, (2) what caused the change from the past to the present, and (3) what impact the past events have on the present. We hypothesize that the above subquestions are part of a larger class of subquestions known as state discovery questions. Unlike events, which represent a particular moment in time (or set of moments), states are inherently durative and therefore are subject to a wider variety of changes in scope, level, or status over time. We believe that questions that make reference to a property or a state of being necessarily make an implicit comparison between periods in time: i.e., an identified point (such as the years between 1980 and 1990) and some other reference point (either the current moment or some other salient period).</Paragraph> <Paragraph position="13"> Meronymy. We found that analysts were sensitive to the internal structure of many of the named entities referenced in topic questions. In general, analysts generated questions about the subparts of an entity if and only if information about those subparts proved informative in answering the topic question as a whole. Given an example like Where are Prithvi missiles manufactured?, analysts generated decompositions like Where are the guidance systems for Prithvi missiles manufactured? or Where are the warheads for Prithvi missiles manufactured? 
Further research is needed to determine when an entity's component parts should be considered as part of the informational goal of a question.</Paragraph> <Paragraph position="14"> When faced with topic questions that present a controversial or empirically unverified proposition, we found that analysts generated decompositions that questioned the relative truth of the proposition before generating other decompositions. Faced with a complex question like How much nuclear material was stolen from the Soviet military after the fall of the Soviet Union? (which necessarily entails that nuclear material was, in fact, stolen from the Soviet military), analysts generated decompositions like How much nuclear material is known to have been stolen from the Soviet military? or How much nuclear material is suspected to have been stolen from the Soviet military? Analysts did not generate these types of questions, however, when the question under discussion reflected a publicly accepted proposition or an empirically verifiable state.</Paragraph> </Section> </Paper>