<?xml version="1.0" standalone="yes"?>
<Paper uid="W99-0312">
  <Title>A mark up language for tagging discourse and annotating documents in context sensitive interpretation environments</Title>
  <Section position="2" start_page="0" end_page="99" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> A mark up language for tagging discourse, and for converting discourse sequences into a written format, according to highly context sensitive procedures, will be illustrated as well as a system for document annotation.</Paragraph>
    <Paragraph position="1"> The context in which a given communicative intercourse has taken place needs to be made available to ensure consistent interpretation of both single sequences of discourse and global concepts that are carried along during the dialogue or conversation. Full visibility of different communicative intentions that reflect the evolution of conversations in time and space, as well as access to various modes of communicative actions and changing conditions of interpretation, is relevant and necessary, especially in contexts where interaction is based upon asynchronous communication. Interpretation shifts, that each sequence of a dialogue is likely to undergo, may create distortions in the interpretation of the overall intercourse and communicative action leading to the creation of the final document. Some contextual concretions, which may be based upon false assumptions, are particularly powerful. Persistent interpretative links may be evoked and activated at any time, even if unintentionally. Special attention needs to be paid to ensure that undesirable links are not established and unintentional contextual concretions are not added. Use of a consistently applied, commonly shared conceptual tool for assigning context-sensitive interpretative values to each sequence of discourse holds great promise for avoidance of this problem. Appropriately packaged documents and parts of documents could carry along their own originating context of discourse in the form of attached information. Accurate illustrations using a fully functional set of tools I have developed are provided here to show how fuzziness and misinterpretation (caused by an absence of consistent interpretative clues about originating contextual conditions for discourse and conversations) may be significantly reduced or even eliminated.</Paragraph>
    <Paragraph position="2"> Introduction A document comes from "somewhere in time and space and leads toward somewhere else" (Tonfoni, 1998).</Paragraph>
    <Paragraph position="3"> It may therefore be defined as a piece of information that has been derived from a dynamically evolving information flow before it is converted into a stable form, e.g., hardcopy (Tonfoni 1996, 1998). Documents are derivative products of flows of conversations and various kinds of communicative intercourse, which may include a very high level of complexity and long duration. Documents and conversations, from which those documents were generated, are therefore two very tightly linked components that often play a crucial role in providing evidence for decision making. It is our claim that enhanced encoding procedures in the form of discourse tagging and labelling may be harmoniously linked by means of a consistent annotation system (Tonfoni 1998) to support accurate conversion of dialogues and discourse into a more stable format.</Paragraph>
    <Paragraph position="4"> This is what documentation is all about.</Paragraph>
    <Paragraph position="5"> The discourse tagging system presented here is based on and harmoniously linked to an annotation system, which consists of a set of signs and symbols as follows: Discourse tagging and document annotation signs: to indicate the communicative function of a sequence of discourse, which is ultimately to become a piece of a document.</Paragraph>
    <Paragraph position="6"> Discourse tagging and document annotation symbols: to indicate the communicative style of each sequence of discourse, which is ultimately to become a piece of a document.</Paragraph>
    <Paragraph position="7"> - Discourse tagging and document annotation turn taking symbols: to indicate roles and interplay between the discourse partners that are carried along during the information conversion process and successively attached to the resulting document.</Paragraph>
    <Paragraph position="8"> Context sensitivity may be significantly enhanced by the consistent use of interpretation devices, designed to prevent fuzziness and misunderstanding from occurring. Some contextual links, if not properly handled, may be powerful enough to radically shift the scenario. The originating context may in fact be easily modified and completely distorted, even if unintentionally. Such links need to be accurately identified and then eliminated by repositioning, either by reassessing the originating context or assessing the new and intended one.</Paragraph>
    <Paragraph position="9"> A context sensitive mark up language for converting discourse sequences into document pieces.</Paragraph>
    <Paragraph position="10"> Discourse tagging and document annotation signs: The following represent the various communicative functions a document may convey, on a paragraph by paragraph basis, as a result of consistent conversion of discourse sequences into document pieces.</Paragraph>
    <Paragraph position="11"> Square: for an informative document or piece of a document, which carries information about a specific conversational event.</Paragraph>
    <Paragraph position="12"> Indicates that information conversion has been derived from an informative discourse sequence.</Paragraph>
    <Paragraph position="13"> Square within the Square: for a summary of a given document that has been produced to reinforce contextual consistency between the conversational context in which the discourse first occurred and its conversion into a larger document.</Paragraph>
    <Paragraph position="14"> Frame: for a document or piece of a document that is found to be analogous (in content) to other documents which refer to previously stored information, some of which may still be available in discourse format. Normally, conversion occurs from discourse format into document format.</Paragraph>
    <Paragraph position="17"> Triangle: for a memory and history generated out of a certain document. This is meant to establish topical continuity with background information, which may still only be available in its discourse format and still need to be converted into document format.</Paragraph>
    <Paragraph position="18"> Circle: for a main concept conveyed by a certain document, which has been abstracted and linked to other documents, to show topical continuity. It is meant to reinforce topical word identification and to effectively link together documents with the same word and sequences of discourse prior to their conversion into document format. Both discourse and document (may) use the same topical words.</Paragraph>
    <Paragraph position="19"> Grouped Semicircles: for main concepts, which are abstracted out of an originating document. Establishes both topical continuity and context consistency between the originating document, and a set of topical words and links to sequences of discourse prior to conversion into a document format. Here again, both discourse and document (may) use the same topical words.</Paragraph>
    <Paragraph position="20"> Semicircle: for a locally identified concept, abstracted out of a piece of document and meant to reinforce context consistency by establishing further links to other documents.</Paragraph>
    <Paragraph position="21"> These links may be triggered by the same topical word. It is also meant to trigger sequences of discourse prior to conversion into a document format and using the same topical word. Inscribed Arcs: for indicating the need for an upgrade and/or update of a certain document. Indicates that a revision process is likely to occur, although it does not identify if it will be a major or minor revision.</Paragraph>
    <Paragraph position="22"> Revision may be based on conversion of discourse sequences into additional pieces of a document or into various alterations.</Paragraph>
    <Paragraph position="23"> Opened Text Space: for indicating that an upgrade and/or update has indeed occurred and that the document has now reached a new revision state. It is meant to show that the structure of the previous document has been affected by discursive information, but does not identify if the revision has been a major or a minor one. Right Triangle: for a comment made about a given document or piece of a document, for the case where more contextual information is needed. This information is not available in a document format and has to be derived from other external relevant sources based upon topical continuity.</Paragraph>
    <Paragraph position="24"> Discourse tagging and document annotation symbols are used to indicate communicative intentions and styles, locally within the discourse, such that discourse sequences can be consistently and accurately interpreted with the additional information they provide. They are particularly useful for showing contributions made by individuals in either synchronous or asynchronous conversations, and for supporting co-ordination and information conversion for production of a document. These information-containing elements may be conveniently incorporated into the final document to provide clues about the nature of the original information conversion process. Document annotation symbols, therefore, represent different modes of information conversion (from a discourse) which may be packaged with the originating context, and activated at a later time. They may be combined and used dynamically for further information conversion purposes, such as further discussions, because they effectively indicate transitional states within a discourse that may be evolving in time and space. They are of the following types: Narrate: from Latin narro: tell the story.</Paragraph>
    <Paragraph position="25"> It means complementing the discourse or the document with various facts and events (from the originating context) by following a logical and chronological order. They may be used either in the form of discourse or in the form of the document itself. In other words, it indicates a set of major points or facts representing different diachronic stages, which are closely linked.</Paragraph>
    <Paragraph position="27"> Describe: from Latin describo: write around.</Paragraph>
    <Paragraph position="28"> It means complementing the original discourse or document by adding as much relevant information as may be found from previous discourse, without any specific constraints.</Paragraph>
    <Paragraph position="29"> It may also indicate the need for further information to be put together in discourse format or in document format. It is represented by a spiral, which starts from its middle point, to indicate a flow of information from topical words, towards an expanding topic and linking with other information. The other information can come from different discourse sequences or from other relevant documents or pieces of documents.</Paragraph>
    <Paragraph position="30"> Define: from Latin definio: put limits.</Paragraph>
    <Paragraph position="31"> It means complementing the document or the discourse with limited information about a defined topical word, which has been previously selected and identified as the most relevant. The point in the middle of the square represents relevance. The concept indicates that there is a real need to incorporate available, highly specific information about a relevant discourse or document.</Paragraph>
    <Paragraph position="32"> Point out: take a single point out of a story chain. It means to isolate a specific event or fact (among those reported as discourse or occurring within a document) by focusing on just one sequence or a piece of document, and adding more detailed information. Information is added as a significant expansion and linked with other relevant discourse sequences or documents. Explain: from Latin explano: unwrap, open up.</Paragraph>
    <Paragraph position="33"> It means that facts and reasons are given to support interpretation of an event, within a certain discourse or document. Explanation may start by indicating the originating cause and proceeding logically toward the effects or by starting with the effects and going backwards towards the cause, depending on which approach is found to be the most effective.</Paragraph>
    <Paragraph position="34"> Regress: from Latin regredior: go back.</Paragraph>
    <Paragraph position="35"> It means that more information about a topic, presented during a given discourse sequence or within a document, is absolutely necessary for understanding. Information may come in verbal format and then be converted into document format. It represents a topic-oriented process and an in depth information expansion, which is activated only for the precise topic being considered.</Paragraph>
    <Paragraph position="36">  Inform: from Latin informo: put into shape, shape up.</Paragraph>
    <Paragraph position="37"> It means that any discourse and document is the result of an information packaging process, and that the specific discourse and document under consideration is organised in the most unconstrained way, as the result of many information conversion operations.</Paragraph>
    <Paragraph position="38"> It leads toward two different kinds of further specification, which are respectively conveyed by the "inform synthetically" and the "inform analytically" indication. "Inform synthetically" means departing from a larger discourse or document and proceeding toward a summary (related to a specific topic) which is the most relevant one emerging from the originating discourse and document.</Paragraph>
    <Paragraph position="39"> "Inform analytically" means departing from a limited discourse and document and expanding toward further discourse sequences and documents by adding more information, which is not available yet and then needs to be converted into the final form of a document.</Paragraph>
    <Paragraph position="40"> Reformulate: from Latin reformo/reformulo: change shape and reshape.</Paragraph>
    <Paragraph position="41"> It means changing the style, which was previously adopted, either in the discourse or in the document, and substituting one form of information packaging with a different but related (same discourse or document) packaging. It may in turn produce (a more or less) radical transformation of the originating context, according to a precisely defined request or set of requests.</Paragraph>
    <Paragraph position="42"> Express: from Latin exprimo: push out and press out.</Paragraph>
    <Paragraph position="43"> It means adding personal opinions and individual feelings related to facts and events during a discourse or within a document. It indicates the most subjective mode of information organisation, which may be clearly influenced by and bound to personal evaluations, judgements and emotional states.</Paragraph>
    <Paragraph position="44"> Discourse tagging and document annotation turn taking symbols are meant to define the mode of interpretation of a certain discourse and of accessing a certain document, requested at each given time; they are provided to facilitate accurate context transport.</Paragraph>
    <Paragraph position="45"> They are the following ones: Major Scale: it shows that literal interpretation is needed and that those sequences of discourse or pieces of documents indicated and marked off should be extracted and quoted literally, the way they were first intended to be.</Paragraph>
    <Paragraph position="46"> Minor Scale: it shows that accurate interpretation may need a further process of abstraction and that sequences of discourse and pieces of documents indicated and marked off may need significant interpretation processes due to heavy context constraints.</Paragraph>
    <Paragraph position="47"> Open or Unsaturated Rhythm: it shows that accessing the discourse and document at the present stage may lead to an incomplete interpretation of those facts and events, which are presented. It is meant to suggest accessing a broader discourse and larger document and acquiring many and various kinds of sources, some of which may not yet be available.</Paragraph>
    <Paragraph position="48"> Tight or Saturated Rhythm: it shows that accessing the document will lead the user toward complete interpretation of those facts and events, which are presented. It suggests that the user should hold fast to the interpretation provided, though access to other sources of supportive evidence is also available.</Paragraph>
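    <Paragraph> As an illustration only, the three kinds of tags described so far (a sign for communicative function, a symbol for communicative style, and a turn taking symbol for the mode of interpretation) could be attached to a discourse sequence as a small record. The following Python sketch is not part of the system described here: the class names, field names, the subset of tags, and the helper may_quote_literally are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Sign(Enum):
    """Communicative function of a sequence (hypothetical subset of the signs)."""
    SQUARE = "informative"
    TRIANGLE = "memory and history"
    CIRCLE = "main concept"


class Style(Enum):
    """Communicative style symbols (hypothetical subset)."""
    DESCRIBE = "describe"
    NARRATE = "narrate"
    EXPLAIN = "explain"
    EXPRESS = "express"


class Mode(Enum):
    """Turn taking symbols: the requested mode of interpretation."""
    MAJOR_SCALE = "literal interpretation"
    MINOR_SCALE = "further abstraction needed"
    OPEN_RHYTHM = "interpretation may be incomplete"
    TIGHT_RHYTHM = "interpretation is complete"


@dataclass(frozen=True)
class TaggedSequence:
    """One discourse sequence packaged with its interpretative tags."""
    speaker: str
    text: str
    sign: Sign
    style: Style
    mode: Mode

    def may_quote_literally(self) -> bool:
        # Assumption: only Major Scale sequences are marked for literal quotation.
        return self.mode is Mode.MAJOR_SCALE


seq = TaggedSequence(
    speaker="A",
    text="The meeting was moved to Friday.",
    sign=Sign.SQUARE,
    style=Style.NARRATE,
    mode=Mode.MAJOR_SCALE,
)
print(seq.may_quote_literally())
```

Keeping the tags as fields of an immutable record mirrors the idea that a piece of a document carries its originating context along with it; a real implementation might well choose a different representation.</Paragraph>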
    <Paragraph position="49"> Discourse tagging and document annotation amplifier symbols come last and may be added only after the previously illustrated ones have been used. They apply to larger discourse sequences and documentation territories and indicate specific operations, which are to be performed in order to connect conversational actions and sets of documents, which have been previously encoded and accurately stored.</Paragraph>
    <Paragraph position="50"> They are as follows: Choose: it is meant to represent the dynamic process of first identifying and then deciding between different contexts for interpretation, which are mutually exclusive, given a certain set of conversations which have occurred, and documents which have been derived accordingly, but which seem to have different possible evolutions.</Paragraph>
    <Paragraph position="51"> Identify: it is meant to represent a definition of a more specific context within a broader context, for interpretation of a set of conversations, which have occurred and for documents that have been derived. It naturally occurs before "search" and "select". Search: it is meant to represent the dynamic process of choosing among different contexts for interpretation of a set of conversations which have occurred, and documents which have been derived, which are many and compatible, so as to find the most appropriate one.</Paragraph>
    <Paragraph position="52"> Select: it is meant to represent multiple contexts that may evolve (either synchronously or asynchronously) and may be modified, once a certain decision making process has been performed. It is based upon a certain discourse that was converted into a document and stored.</Paragraph>
    <Paragraph position="53"> Copy/Replicate: it is meant to represent the dynamic process of duplication and repetition of a context, which, if lost, would affect understanding and accuracy of interpretation of events and facts that have been organised, first in the form of discourse and then converted into a document.</Paragraph>
    <Paragraph position="54"> Ahead: it is meant to represent the progression of a conversation to become a document or set of documents, which are linked together by context consistency or harmoniously shifting contexts.</Paragraph>
    <Paragraph position="55"> Back: it is meant to represent the need to go back to delete and replace the originating context, which has radically shifted in the course of various transition states during an ongoing conversation, and which, if not eliminated, would indeed affect consistent interpretation of a whole set of documents, which are based upon it.</Paragraph>
    <Paragraph position="56"> Conflict: it is meant to represent an emerging inconsistency and incompatibility between various context attributions to a set of conversations to be converted into documents. The context needs to be cleared in order to proceed toward any further interpretation.</Paragraph>
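    <Paragraph> The ordering constraints stated for the amplifier symbols (they come last, after the other signs and symbols have been used, and Identify naturally occurs before Search and Select) can be sketched as a small consistency check. This Python sketch is an assumption-laden illustration, not the tool described in this paper: the enum, the function check_amplifiers, and the decision to treat "naturally occurs before" as a strict rule are all hypothetical.

```python
from enum import Enum


class Amplifier(Enum):
    """The eight amplifier symbols listed above."""
    CHOOSE = "choose"
    IDENTIFY = "identify"
    SEARCH = "search"
    SELECT = "select"
    COPY = "copy/replicate"
    AHEAD = "ahead"
    BACK = "back"
    CONFLICT = "conflict"


def check_amplifiers(base_tags: list, amplifiers: list) -> list:
    """Return a list of problems; an empty list means the tagging passes."""
    problems = []
    # Amplifiers come last: they presuppose earlier signs or symbols.
    if amplifiers and not base_tags:
        problems.append("amplifier used before any sign or symbol")
    # Assumption: Identify must precede Search and Select (treated as strict).
    seen_identify = False
    for amp in amplifiers:
        if amp is Amplifier.IDENTIFY:
            seen_identify = True
        elif amp in (Amplifier.SEARCH, Amplifier.SELECT) and not seen_identify:
            problems.append(f"{amp.value} used before identify")
    return problems


print(check_amplifiers(["square"], [Amplifier.IDENTIFY, Amplifier.SEARCH]))
print(check_amplifiers([], [Amplifier.AHEAD]))
```

Returning a list of problems rather than raising an error lets an annotator see every ordering issue in a tagged document at once.</Paragraph>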
    <Paragraph position="57"> The discourse tagging and document annotation system we have illustrated here (including its various components) may be applied at different layers and at various levels of complexity.</Paragraph>
    <Paragraph position="58"> Following this perspective, discourse context may be enhanced through visual clues or symbols to provide a very powerful means to monitor inherent complexity of any communicative intercourse and to reduce possible distortions which may occur.</Paragraph>
    <Paragraph position="59"> Conclusions This system for tagging discourse provides consistent and harmonious linkages to the original context in the form of a mark up language for visually annotating documentation. It has been extensively and intensively tested in many and various contexts and languages. The acronym CPP-TRS stands for Communicative Positioning Program - Text Representation Systems. What is therefore indicated is that icons representing precise operations performed upon texts, both in their verbal and written dimensions, carry the effective intentionality to be encapsulated. Encoding and preprogramming a document means precisely complementing a document with all of the instructions that are necessary to enhance understanding of context. Visual programming of a document (based on previous discourse) provides the ability to categorise and classify information and to carry the full details of a context into the (delicate) process of information conversion to ensure the most reliable final product, e.g., the document.</Paragraph>
  </Section>
</Paper>