<?xml version="1.0" standalone="yes"?>
<Paper uid="P05-3001">
  <Title>An Information-State Approach to Collaborative Reference</Title>
  <Section position="4" start_page="1" end_page="1" type="metho">
    <SectionTitle>
3 Information State
</SectionTitle>
    <Paragraph position="0"> Our information state (IS) models the ongoing collaboration using a stack of tasks. For a task of collaborative reference, the IS tracks how interlocutors together set up and solve a constraint-satisfaction problem to identify a target object. In any state, D and M have agreed on a target variable T and a set of constraints that the value of T must satisfy. When M recognizes that these constraints identify R, the task ends successfully. Until then, D can take actions that contribute new constraints on R. Importantly, what D says adds to what is already known about R, so that the identification of R can be accomplished across multiple sentences with heterogeneous syntactic structure.</Paragraph>
    <Paragraph position="1"> Our IS also allows subtasks of questioning or clarification that interlocutors can use to maintain alignment. The same constraint-satisfaction model is used not only for referring to displayed objects but also for referring to abstract entities, such as actions or properties. Our IS tracks the salience of entity and property referents and, like Purver's, maintains the previous utterance for reference in clarification questions. Note, however, that we do not factor updates to the IS through an abstract taxonomy of speech acts. Instead, utterances directly make domain moves, such as adding a constraint, so our architecture allows utterances to trigger an open-ended range of domain-specific updates.</Paragraph>
  </Section>
  <Section position="5" start_page="1" end_page="2" type="metho">
    <SectionTitle>
4 Linguistic Representations
</SectionTitle>
    <Paragraph position="0"> The way utterances signal task contributions is through a collection of presupposed constraints. To understand an utterance, we solve the utterance's grammatically-specified semantic constraints. An interpretation is only feasible if it represents a contextually-appropriate contribution to the ongoing task. Symmetrically, to generate an utterance, we use the grammar to formulate a set of constraints; these constraints must identify the contribution the system intends to make. We view interpreted linguistic structures as representing communicative intentions; see (Stone et al., 2003) or (Stone, 2004b).</Paragraph>
    <Paragraph position="1"> As in (DeVault et al., 2004), a knowledge interface mediates between domain-general meanings and the domain-specific ontology supported in a particular application. This allows us to build inter- null pretations using domain-specific representations for referents, for task moves, and for the domain properties that characterize referents.</Paragraph>
  </Section>
  <Section position="6" start_page="2" end_page="2" type="metho">
    <SectionTitle>
5 Architecture
</SectionTitle>
    <Paragraph position="0"> Our system is implemented in Java. A set of interface types describes the flow of information and control through the architecture. The representation and reasoning outlined in Sections 3 and 4 is accomplished by implementations of these interfaces that realize our approach. Modules in the architecture exchange messages about events and their interpretations. (1) Deliberation responds to changes in the IS by proposing task moves. (2) Generation constructs collaborative intentions to accomplish the planned task moves. (3) Understanding infers collaborative intentions behind user actions. Generation and understanding share code to construct intentions for utterances, and both carry out a form of inference to the best explanation. (4) Update advances the IS symmetrically in response to intentions signaled by the system or recognized from the user; the symmetric architecture frees the designer from programming complementary updates in a symmetrical way. Additional supporting infrastructure handles the recognition of input actions, the realization of output actions, and interfacing between domain knowledge and linguistic resources.</Paragraph>
    <Paragraph position="1"> Our system is designed not just for users to interact with, but also for demonstrating and debugging the system's underlying models. Processing can be paused at any point to allow inspection of the system's representations using a range of visualization tools. You can interactively explore the IS, including the present state of the world, the agreed direction of the ongoing task, and the representation of linguistic distinctions in salience and information status. You can test the grammar and other interpretive resources. And you can visualize the search space for understanding and generation.</Paragraph>
  </Section>
  <Section position="7" start_page="2" end_page="3" type="metho">
    <SectionTitle>
6 Example
</SectionTitle>
    <Paragraph position="0"> Let us return to dialogue (1). Here the system represents its moves as successively constraining the shape, color and pattern of the target object. In generating (1c), the system iteratively elaborates its description from brown to light brown in an attempt to identify the object's color unambiguously. The user's clarification request at (1d) marks this description of color as problematic and so triggers a nested instance of the collaborative reference task.</Paragraph>
    <Paragraph position="1"> At (1e) the system adds the user's proposed constraint and (we assume) solves this nested subtask.</Paragraph>
    <Paragraph position="2"> The system returns to the main task at (1f) having grounded the color constraint and continues by identifying the pattern of the target object.</Paragraph>
    <Paragraph position="3"> Let us explore utterance (1c) in more detail. The IS records the status of the identification process.</Paragraph>
    <Paragraph position="4"> The system is the director; the user is the matcher.</Paragraph>
    <Paragraph position="5"> The target is represented provisionally by a discourse referent t1, and what has been agreed so far is that the current target is a square of the relevant sort for this task, represented in the agent as square-figure-object(t1). In addition, the system has privately recorded that square o1 is the referent it must identify. For this IS, it is expected that the director will propose an additional constraint identifying t1. The discourse state represents t1 as being in-focus, or available for pronominal reference.</Paragraph>
    <Paragraph position="6"> Deliberation now gives the generator a specific move for the system to achieve: (2) add-constraint(t1,color-sandybrown(t1)) The content of the move in (2) is that the system should update the collaborative reference task to include the constraint that the target is drawn in a particular, domain-specific color (RGB value F4-A4-60, or XHTML standard &amp;quot;sandy brown&amp;quot;). The system finds an utterance that achieves this by exploring head-first derivations in its grammar; it arrives at the derivation of it's light brown in (3).</Paragraph>
    <Paragraph position="7">  it [subject] light [color degree adverb] A set of presuppositions connect this linguistic structure to a task domain; they are given in (4a).</Paragraph>
    <Paragraph position="8"> The relevant instances in this task are shown in (4b).</Paragraph>
    <Paragraph position="9">  The utterance also uses it to describe a referent X so presupposes that in-focus(X) holds. The move effected by the utterance is schematized as M(X,C(X)). Given the range of possible task moves in the current context, the constraints specified by the grammar for (3) are modeled as determining the instantiation in (2). The system realizes the utterance and assumes, provisionally, that the utterance achieves its intended effect and records the new constraint on t1.</Paragraph>
    <Paragraph position="10"> Because the generation process incorporates entirely declarative reasoning, it is normally reversible. Normally, the interlocutor would be able to identify the speaker's intended derivation, associate it with the same semantic constraints, resolve those constraints to the intended instances, and thereby discover the intended task move. In our example, this is not what happens. Recognition of the user's clarification request is triggered as in (Purver, 2004). The system fails to interpret utterance (1d) as an appropriate move in the main reference task. As an alternative, the system &amp;quot;downdates&amp;quot; the context to record the fact that the system's intended move may be the subject of explicit grounding. This involves pushing a new collaborative reference task on the stack of ongoing activities. The system remains the director, the new target is the variable C in interpretation and the referent to be identified is the property colorsandybrown. Interpretation of (1d) now succeeds.</Paragraph>
  </Section>
class="xml-element"></Paper>