<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-0915">
  <Title>Interpreting Communicative Goals in Constrained Domains using Generation and Interactive Negotiation</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> A classical view of text interpretation is to have a syntactic parsing process followed by semantic interpretation derived from syntactic structures (Allen, 1995). In practice, however, building broad-coverage syntactically-driven parsing grammars that are robust to the variation in the input is a very difficult task. Moreover, a fine-grained analysis of the semantic content of a text is not always relevant. Indeed, there are cases where what should be recognized are the high-level communicative intentions of the author. Depending on the kind of interpretation that is targeted from a text, some semantic distinctions need not be recognized. For example, the following two sentences found in a drug leaflet may not carry significantly different communicative goals in spite of their clear semantic differences:</Paragraph>
    <Paragraph position="1"> + Consult your doctor in case of pregnancy before taking this product.</Paragraph>
    <Paragraph position="2"> + Consult a health professional in case of pregnancy before taking this product.</Paragraph>
    <Paragraph position="3"> We have identified a domain of application, document normalization, where text interpretation can be limited in many cases to the interpretation of a text in terms of the communicative goals it conveys (Max, 2003a). We have defined document normalization as the process that first derives the normalized communicative content of a text in a constrained domain (e.g. drug leaflets), and then generates the normalized version of the text in the language of the original document. We considered three levels in a normalization model for documents in a constrained domain: 1. Communicative goals: the communicative goals that can appear in a document in a constrained domain belong to a predefined repertoire. 2. Communicative structure: the communicative structure describes the content of a document in terms of compatible communicative goals, as well as how these communicative goals are organized in a document.</Paragraph>
    <Paragraph position="4"> 3. Natural language: the language used should be as comprehensible as possible. To this end, every communicative goal should be associated with an expression that can be considered a &quot;gold standard&quot;.</Paragraph>
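    <Paragraph> The three levels can be pictured as a small data model. The following is only a minimal sketch under stated assumptions, not the representation used in this work: the names CommunicativeGoal, CommunicativeStructure and generate, the goal identifiers, and the sample wordings are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class CommunicativeGoal:
    """Level 1: one entry of the predefined repertoire (hypothetical schema)."""
    goal_id: str
    # Level 3: the "gold standard" expression associated with this goal.
    gold_standard: str


@dataclass
class CommunicativeStructure:
    """Level 2: which goals a document contains and how they are organized."""
    # Maps a section title to the ordered goal identifiers it contains.
    sections: Dict[str, List[str]] = field(default_factory=dict)


def generate(structure: CommunicativeStructure,
             repertoire: Dict[str, CommunicativeGoal]) -> str:
    """Produce normalized text by emitting each goal's gold-standard wording."""
    lines = []
    for title, goal_ids in structure.sections.items():
        lines.append(title)
        for goal_id in goal_ids:
            lines.append("  " + repertoire[goal_id].gold_standard)
    return "\n".join(lines)


# Illustrative data only; the wording below is invented for this sketch.
repertoire = {
    "consult_if_pregnant": CommunicativeGoal(
        "consult_if_pregnant",
        "Consult a health professional in case of pregnancy."),
}
structure = CommunicativeStructure({"Warnings": ["consult_if_pregnant"]})
normalized_text = generate(structure, repertoire)
```

Keeping the wording on the goal rather than on the structure means that two documents sharing a goal are always rendered with the same sentence, which is the point of associating one expression with each goal.</Paragraph>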
    <Paragraph position="5"> Figure 1 shows a warning section found in the drug leaflet for a pain reducer. Manually deriving a normalized version of this document extract using a normalization model requires identifying the communicative goals present in the document, which may be deduced from textual evidence found at different places in the document. Once identified, these communicative goals must be compared with the normalized ones in the predefined repertoire. We consider the following four cases: 1. A communicative goal in the document is clearly identified as belonging to the predefined repertoire.</Paragraph>
    <Paragraph position="6"> 2. A communicative goal in the document belongs to the predefined repertoire, but several normalized communicative goals are in competition due to some evidence found in the document.</Paragraph>
    <Paragraph position="7"> 3. A communicative goal in the document does not belong to the predefined repertoire, but it is deemed close to a normalized communicative goal.</Paragraph>
    <Paragraph position="8"> 4. A communicative goal in the document cannot be matched with any normalized communicative goal.</Paragraph>
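    <Paragraph> One way to make the four cases concrete is to assume that each candidate normalized goal receives a similarity score between 0 and 1 for a given document goal. This scoring scheme, the two thresholds, and the names MatchCase and classify are assumptions introduced for illustration; the text above does not specify how the comparison is computed.

```python
from enum import Enum
from typing import Dict, List, Tuple


class MatchCase(Enum):
    CLEAR = 1       # case 1: exactly one normalized goal clearly matches
    COMPETING = 2   # case 2: several normalized goals are in competition
    CLOSE = 3       # case 3: no clear match, but one goal is deemed close
    UNMATCHED = 4   # case 4: nothing in the repertoire can be matched


def classify(scores: Dict[str, float],
             match_at: float = 0.8,
             close_at: float = 0.5) -> Tuple[MatchCase, List[str]]:
    """Map hypothetical similarity scores to one of the four cases.

    Both thresholds are arbitrary illustrative values.
    """
    matched = sorted(g for g, s in scores.items() if s >= match_at)
    if len(matched) == 1:
        return MatchCase.CLEAR, matched
    if len(matched) >= 2:
        return MatchCase.COMPETING, matched
    close = [g for g, s in scores.items() if s >= close_at]
    if close:
        # Keep only the closest normalized goal as the proposed replacement.
        best = max(close, key=lambda g: scores[g])
        return MatchCase.CLOSE, [best]
    return MatchCase.UNMATCHED, []
```

In an interactive setting, cases 2 and 3 are the ones where a user would plausibly be asked to choose or confirm, while case 4 would have to be reported as outside the repertoire.</Paragraph>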
    <Paragraph position="9"> Once the normalized communicative goals have been identified, the communicative structure can be built (provided there are no incompatibilities) and the corresponding normalized textual version produced. A possible normalized text corresponding to the input document of Figure 1 is given in Figure 2.</Paragraph>
    <Paragraph position="10"> The very general Warnings section has been split into several subsections. Communicative goals that were expressed in the same sentence have been isolated and reformulated in separate sentences, as is the case for the communicative goal indicating that the product should not be taken in case of allergy to aspirin. This communicative goal was found in a complex sentence, Do not take this product if you have asthma, an allergy to aspirin, stomach problems. . . , and was reformulated as DO NOT TAKE THIS DRUG IF</Paragraph>
  </Section>
</Paper>