<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-1623">
  <Title>Inducing Temporal Graphs</Title>
  <Section position="6" start_page="189" end_page="190" type="metho">
    <SectionTitle>
3 TDAG: A representation of temporal flow
</SectionTitle>
    <Paragraph position="0"> We view text as a linear sequence of temporal segments. Temporal focus is retained within a segment but changes radically between segments. The length of a segment can range from a single clause to a sequence of adjacent sentences. Figure 1 shows a sample of temporal segments from a medical case summary. Consider as an example the segment S13 of this text. This segment describes an examination of a patient, encompassing several events and states (i.e., an abdominal and a neurological examination). All of them belong to the same time frame, and the temporal order between these events is not explicitly outlined in the text. We represent the ordering of events as a temporal directed acyclic graph (TDAG). An example of the transitive reduction of a TDAG is shown in Figure 1. Edges in a TDAG capture temporal precedence relations between segments. Because the graph encodes an order, cycles are prohibited. We do not require the graph to be fully connected: if the precedence relation between two nodes is not specified in the text, the corresponding nodes are not connected. For instance, consider the segments S5 and S7 from Figure 1, which describe the patient's previous tests and her history of eczema.</Paragraph>
    <Paragraph position="2"> Any order between the two events is consistent with our interpretation of the text; therefore we cannot determine the precedence relation between the segments S5 and S7.</Paragraph>
    <Paragraph position="3"> In contrast to many existing temporal representations (Allen, 1984; Pustejovsky et al., 2003), TDAG is a coarse annotation scheme: it does not capture interval overlap and distinguishes only a subset of commonly used ordering relations. Our choice of this representation, however, is not arbitrary. The selected relations are shown to be useful in text processing applications (Zhou et al., 2005) and can be reliably recognized by humans.</Paragraph>
    <Paragraph position="4"> Moreover, the distribution of event ordering links under a more refined annotation scheme, such as TimeML, shows that our subset of relations covers a majority of annotated links (Pustejovsky et al., 2003).</Paragraph>
  </Section>
  <Section position="7" start_page="190" end_page="191" type="metho">
    <SectionTitle>
4 Method for Temporal Segmentation
</SectionTitle>
    <Paragraph position="0"> Our first goal is to automatically predict shifts in temporal focus that are indicative of segment boundaries. Linguistic studies show that speakers and writers employ a wide range of language devices to signal change in temporal discourse (Bestgen and Vonk, 1995). For instance, the presence of the temporal anchor last year indicates the lack of temporal continuity between the current and the previous sentence. However, many of these predictors are heavily context-dependent and, thus, cannot be considered independently. Instead of manually crafting complex rules controlling feature interaction, we opt to learn them from data.</Paragraph>
    <Paragraph position="1"> We model temporal segmentation as a binary classification task. Given a set of candidate boundaries (e.g., sentence boundaries), our task is to select a subset of the boundaries that delineate temporal segment transitions. To implement this approach, we first identify a set of potential boundaries. Our analysis of the manually-annotated corpus reveals that boundaries can occur not only between sentences, but also within a sentence, at the boundary of syntactic clauses. We automatically segment sentences into clauses using a robust statistical parser (Charniak, 2000). Next, we encode each boundary as a vector of features. Given a set of annotated examples, we train a classifier to predict boundaries based on the following feature set: Lexical Features Temporal expressions, such as tomorrow and earlier, are among the strongest markers of temporal discontinuity (Passonneau, 1988; Bestgen and Vonk, 1995). In addition to a well-studied set of domain-independent temporal markers, there are a variety of domain-specific temporal markers. For instance, the phrase initial hospital visit functions as a time anchor in the medical domain.</Paragraph>
    <Paragraph position="2"> To automatically extract these expressions, we provide a classifier with n-grams from each of the candidate sentences preceding and following the candidate segment boundary.</Paragraph>
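A minimal sketch of this feature extraction, assuming simple whitespace tokenization; the function names and the feature-string format are illustrative, not from the paper:

```python
def ngrams(tokens, n_max=2):
    """All 1..n_max-grams of a token sequence."""
    out = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            out.append(" ".join(tokens[i:i + n]))
    return out

def boundary_features(prev_clause, next_clause):
    """N-gram indicator features drawn from the clauses on either
    side of a hypothesized temporal segment boundary (sketch)."""
    feats = {}
    for g in ngrams(prev_clause.lower().split()):
        feats["prev=" + g] = 1
    for g in ngrams(next_clause.lower().split()):
        feats["next=" + g] = 1
    return feats

# Example: a candidate boundary between two clauses
feats = boundary_features("she was admitted two days earlier",
                          "on examination the abdomen was soft")
```

A classifier trained on such sparse indicator vectors can then learn which n-grams (e.g., "two days earlier") signal temporal discontinuity.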
    <Paragraph position="3"> Topical Features Temporal segmentation is closely related to topical segmentation: a shift in topic often coincides with a shift in temporal focus, so identifying such transitions is relevant for temporal segmentation.</Paragraph>
    <Paragraph position="4"> We quantify the strength of a topic change by computing a cosine similarity between sentences bordering the proposed segmentation. This measure is commonly used in topic segmentation (Hearst, 1994) under the assumption that change in lexical distribution corresponds to topical change.</Paragraph>
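The similarity computation can be sketched as follows, using raw bag-of-words counts (a simplification; the paper does not specify its term-weighting scheme):

```python
import math
from collections import Counter

def cosine(sent_a, sent_b):
    """Cosine similarity of bag-of-words vectors; a low value across
    a candidate boundary suggests a topical (and possibly temporal)
    shift."""
    a = Counter(sent_a.lower().split())
    b = Counter(sent_b.lower().split())
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values())) *
            math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0
```

The score itself then serves as a real-valued feature for the boundary classifier.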
    <Paragraph position="5"> Positional Features Some parts of the document are more likely to exhibit temporal change than others. This property is related to patterns in discourse organization of a document as a whole.</Paragraph>
    <Paragraph position="6"> For instance, a medical case summary first discusses various developments in the medical history of a patient and then focuses on his current conditions. As a result, the first part of the summary contains many short temporal segments. We encode positional features by recording the relative position of a sentence in a document.</Paragraph>
    <Paragraph position="7"> Syntactic Features Because our segment boundaries are considered at the clausal level, rather than at the sentence level, the syntax surrounding a hypothesized boundary may be indicative of temporal shifts. This feature takes into account the position of a word with respect to the boundary. For each word within three words of the hypothesized boundary, we record its part-of-speech tag along with its distance from the boundary. For example, NNP+1 encodes the presence of a proper noun immediately following the proposed boundary.</Paragraph>
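A sketch of this encoding, assuming the input is already part-of-speech tagged and that the boundary index marks the first token after the hypothesized boundary (the helper itself is illustrative, not from the paper):

```python
def pos_window_features(tagged, boundary_idx, window=3):
    """Emit POS-tag-plus-offset features such as 'NNP+1' for a proper
    noun immediately following the boundary, matching the format of
    the example in the text. `tagged` is a list of (word, tag) pairs;
    tokens before the boundary get negative offsets."""
    feats = []
    for i, (word, tag) in enumerate(tagged):
        if i >= boundary_idx:
            offset = i - boundary_idx + 1   # +1 is the first token after
        else:
            offset = i - boundary_idx       # -1 is the last token before
        if abs(offset) <= window:
            feats.append("%s%+d" % (tag, offset))
    return feats

# Hypothesized boundary before "Dr." (index 2)
tagged = [("Yesterday", "NN"), (",", ","), ("Dr.", "NNP"),
          ("Smith", "NNP"), ("arrived", "VBD")]
feats = pos_window_features(tagged, boundary_idx=2)
```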
  </Section>
  <Section position="8" start_page="191" end_page="193" type="metho">
    <SectionTitle>
5 Learning to Order Segments
</SectionTitle>
    <Paragraph position="0"> Our next goal is to automatically construct a graph that encodes ordering relations between temporal segments. One possible approach is to cast graph construction as a standard binary classification task: predict an ordering for each pair of distinct segments based on their attributes alone. If a pair contains a temporal marker, like later, then accurate prediction is feasible. In fact, this method is commonly used in event ordering (Mani et al., 2003; Lapata and Lascarides, 2004; Boguraev and Ando, 2005). However, many segment pairs lack temporal markers and other explicit cues for ordering. Determining their relation out of context can be difficult, even for humans. Moreover, by treating each segment pair in isolation, we cannot guarantee that all the pairwise assignments are consistent with each other and yield a valid TDAG.</Paragraph>
    <Paragraph position="1"> Rather than ordering each pair separately, our ordering model relies on global inference. Given the pairwise ordering predictions of a local classifier3, our model finds a globally optimal assignment. In essence, the algorithm constructs a graph that is maximally consistent with individual ordering preferences of each segment pair and at the same time satisfies graph-level constraints on the TDAG topology.</Paragraph>
    <Paragraph position="2"> In Section 5.2, we present three global inference strategies that vary in their computational and linguistic complexity. But first we present our underlying local ordering model.</Paragraph>
    <Section position="1" start_page="191" end_page="192" type="sub_section">
      <SectionTitle>
5.1 Learning Pairwise Ordering
</SectionTitle>
      <Paragraph position="0"> Given a pair of segments (i, j), our goal is to assign it to one of three classes: forward, backward, and null (not connected). We generate the training data by using all pairs of segments (i, j) that belong to the same document, such that i appears before j in the text.</Paragraph>
      <Paragraph position="1"> The features we consider for the pairwise ordering task are similar to ones used in previous research on event ordering (Mani et al., 2003; Lapata and Lascarides, 2004; Boguraev and Ando, 2005).</Paragraph>
      <Paragraph position="2"> Below we briefly summarize these features.</Paragraph>
      <Paragraph position="3"> Lexical Features This class of features captures temporal markers and other phrases indicative of order between two segments. Representative examples in this category include domain-independent cues like years earlier and domain-specific markers like during next visit. To automatically identify these phrases, we provide a classifier with two sets of n-grams extracted from the first and the second segments. The classifier then learns phrases with high predictive power.</Paragraph>
      <Paragraph position="4"> Temporal Anchor Comparison Temporal anchors are one of the strongest cues for the ordering of events in text. For instance, medical case summaries use phrases like two days before admission and one day before admission to express relative order between events. If the two segments contain temporal anchors, we can determine their ordering by comparing the relation between the two anchors. We identified a set of temporal anchors commonly used in the medical domain and devised a small set of regular expressions for their comparison; we could not rely on standard tools for the analysis of temporal anchors, as they were developed on newspaper corpora and are not suitable for the analysis of medical text (Wilson et al., 2001). The corresponding feature takes one of three values that encode preceding, following, and incompatible relations.</Paragraph>
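The paper does not list its regular expressions, so the pattern below is an illustrative stand-in covering anchors of the form two days before admission:

```python
import re

# Hypothetical anchor pattern; the paper's actual expressions are not given.
ANCHOR = re.compile(
    r"(?P<num>one|two|three|\d+)\s+(?P<unit>day|week|month|year)s?"
    r"\s+before admission")

WORD2NUM = {"one": 1, "two": 2, "three": 3}
UNIT2DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}

def days_before_admission(text):
    """Resolve an anchor to an approximate number of days, or None."""
    m = ANCHOR.search(text)
    if not m:
        return None
    n = WORD2NUM.get(m.group("num"))
    if n is None:
        n = int(m.group("num"))
    return n * UNIT2DAYS[m.group("unit")]

def compare_anchors(seg_a, seg_b):
    """Three-valued feature comparing the anchors of two segments."""
    a, b = days_before_admission(seg_a), days_before_admission(seg_b)
    if a is None or b is None:
        return "incompatible"
    if a > b:
        return "preceding"   # further before admission means earlier
    if a < b:
        return "following"
    return "incompatible"
```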
    </Section>
    <Section position="2" start_page="192" end_page="192" type="sub_section">
      <SectionTitle>
Segment Adjacency Feature
</SectionTitle>
      <Paragraph position="0"> Multiple studies have shown that two subsequent sentences are likely to follow a chronological progression (Bestgen and Vonk, 1995). To encode this information, we include a binary feature that captures the adjacency relation between two segments.</Paragraph>
    </Section>
    <Section position="3" start_page="192" end_page="193" type="sub_section">
      <SectionTitle>
5.2 Global Inference Strategies for Segment
Ordering
</SectionTitle>
      <Paragraph position="0"> Given the scores (or probabilities) of all pairwise edges produced by a local classifier, our task is to construct a TDAG. In this section, we describe three inference strategies that aim to find a consistent ordering between all segment pairs. These strategies vary significantly in terms of linguistic motivation and computational complexity. Examples of automatically constructed TDAGs derived from the different inference strategies are shown in the accompanying figure. Natural Reading Order (NRO) The simplest way to construct a consistent TDAG is by adding segments in the order of their appearance in a text. Intuitively speaking, this technique processes segments in the same order as a reader of the text. The motivation underlying this approach is that the reader incrementally builds a temporal interpretation of a text; when a new piece of information is introduced, the reader knows how to relate it to the already processed text. This technique starts with an empty graph and incrementally adds nodes in order of their appearance in the text. When a new node is added, we greedily select the edge with the highest score that connects the new node to the existing graph, without violating the consistency of the TDAG. Next, we expand the graph with its transitive closure.</Paragraph>
      <Paragraph position="1"> We continue greedily adding edges and applying transitive closure until the new node is connected to all other nodes already in the TDAG. The process continues until all the nodes have been added to the graph.</Paragraph>
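A simplified sketch of this strategy, ignoring null edges and assuming the local classifier supplies a score for the forward and backward relation of every segment pair (the data-structure choices are ours, not the paper's):

```python
def natural_reading_order(n, score):
    """Greedy NRO sketch. `score[(i, j, rel)]` is the local score for
    relation rel in {'fwd', 'back'} between segments i < j. Nodes are
    added in text order; for each new node, edges are added best-first,
    keeping the precedence relation acyclic via transitive closure."""
    before = set()  # (a, b) means segment a temporally precedes b

    def closure():
        # add implied edges until a fixed point is reached
        changed = True
        while changed:
            changed = False
            for (a, b) in list(before):
                for (c, d) in list(before):
                    if b == c and (a, d) not in before:
                        before.add((a, d))
                        changed = True

    for j in range(1, n):
        # candidate edges linking the new node j to the existing graph
        cands = sorted(((score[(i, j, r)], i, r)
                        for i in range(j) for r in ("fwd", "back")),
                       reverse=True)
        for s, i, r in cands:
            a, b = (i, j) if r == "fwd" else (j, i)
            if (b, a) not in before:   # adding must not create a cycle
                before.add((a, b))
                closure()
    return before
```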
      <Paragraph position="2"> Best Edge First (BEF) Our second inference strategy is also greedy. It aims to optimize the score of the graph, which is computed by summing the scores of its edges.</Paragraph>
      <Paragraph position="3"> While this greedy strategy is not guaranteed to find the optimal solution, it finds a reasonable approximation (Cohen et al., 1999).</Paragraph>
      <Paragraph position="4"> This method begins by sorting the edges by their score. Starting with an empty graph, we add one edge at a time, without violating the consistency constraints. As in the previous strategy, at each step we expand the graph with its transitive closure. We continue this process until all the edges have been considered.</Paragraph>
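This edge-sorting strategy can be sketched as follows, again ignoring null edges and using an assumed score table keyed by directed segment pairs:

```python
def best_edge_first(score):
    """Greedy sketch: sort all directed candidate edges by score and
    add each one whose reverse is not already implied, recomputing the
    transitive closure after every addition. `score[(a, b)]` is the
    local score for 'a precedes b'."""
    edges = sorted(((s, a, b) for (a, b), s in score.items()), reverse=True)
    before = set()
    for s, a, b in edges:
        if (b, a) in before:
            continue  # reverse already implied: skip to stay acyclic
        before.add((a, b))
        changed = True  # recompute the transitive closure
        while changed:
            changed = False
            for (x, y) in list(before):
                for (u, v) in list(before):
                    if y == u and (x, v) not in before:
                        before.add((x, v))
                        changed = True
    return before
```

Unlike NRO, this strategy commits to the globally strongest edges first, regardless of where they occur in the text.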
      <Paragraph position="5">  We can cast the task of constructing a globally optimal TDAG as an optimization problem. In contrast to the previous approaches, the method is not greedy. It computes the optimal solution within the Integer Linear Programming (ILP) framework.</Paragraph>
      <Paragraph position="6"> For a document with N segments, each pair of segments (i, j) can be related in the graph in one of three ways: forward, backward, and null (not connected). Let s(i→j), s(i←j), and s(i↮j) be the scores assigned by a local classifier to each of the three relations, respectively. Let I(i→j), I(i←j), and I(i↮j) be indicator variables that are set to 1 if the corresponding relation is active, or 0 otherwise. The objective is then to optimize the score of a TDAG by maximizing the sum of the scores of all edges in the graph:</Paragraph>
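The objective and the accompanying uniqueness constraint can be written as follows; this is a reconstruction consistent with the definitions above (the original equations are not reproduced verbatim, and their numbering may differ):

```latex
\max \sum_{i < j} \big( s_{i\to j}\, I_{i\to j}
    + s_{i\leftarrow j}\, I_{i\leftarrow j}
    + s_{i\nleftrightarrow j}\, I_{i\nleftrightarrow j} \big)
\quad \text{s.t.} \quad
I_{i\to j} + I_{i\leftarrow j} + I_{i\nleftrightarrow j} = 1,
\qquad I_{i\to j},\, I_{i\leftarrow j},\, I_{i\nleftrightarrow j} \in \{0, 1\},
\qquad \forall\, i < j
```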
      <Paragraph position="8"> We augment this basic formulation with two more sets of constraints to enforce validity of the constructed TDAG.</Paragraph>
      <Paragraph position="9"> Transitivity Constraints The key requirement on the edge assignment is the transitivity of the resulting graph. Transitivity also guarantees that the graph does not have cycles. We enforce transitivity by introducing the following constraint for every triple (i, j, k): I(i→j) + I(j→k) − 1 ≤ I(i→k) (4) If both indicator variables on the left side of the inequality are set to 1, then the indicator variable on the right side must be equal to 1. Otherwise, the indicator variable on the right can take any value.</Paragraph>
    </Section>
    <Section position="4" start_page="193" end_page="193" type="sub_section">
      <SectionTitle>
Connectivity Constraints
</SectionTitle>
      <Paragraph position="0"> The connectivity constraint states that each node i is connected to at least one other node and thereby enforces connectivity of the generated TDAG. We introduce these constraints because manually constructed TDAGs do not have any disconnected nodes. This observation is consistent with the intuition that the reader is able to order a segment with respect to the other segments in the TDAG.</Paragraph>
      <Paragraph position="2"> The above constraint rules out edge assignments in which node i has null edges to the rest of the nodes.</Paragraph>
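One way to write this constraint, consistent with the description above (a reconstruction; the original inequality is not reproduced here), is to bound the number of null edges incident to each node:

```latex
% Node i cannot have null edges to all of the other N - 1 nodes.
\sum_{j \ne i} I_{i \nleftrightarrow j} \;\le\; N - 2 \qquad \forall\, i
```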
      <Paragraph position="3"> Solving ILP Solving an integer linear program is NP-hard (Cormen et al., 1992). Fortunately, there exist several strategies for solving ILPs. We employ an efficient Mixed Integer Programming solver, lp_solve, which implements the Branch-and-Bound algorithm. It takes less than five seconds to decode each document on a 2.8 GHz Intel Xeon machine.</Paragraph>
    </Section>
  </Section>
  <Section position="9" start_page="193" end_page="194" type="metho">
    <SectionTitle>
6 Evaluation Set-Up
</SectionTitle>
    <Paragraph position="0"> We first describe the corpora used in our experiments and the results of human agreement on the segmentation and the ordering tasks. Then, we introduce the evaluation measures that we use to assess the performance of our model.</Paragraph>
    <Section position="1" start_page="193" end_page="193" type="sub_section">
      <SectionTitle>
6.1 Corpus Characteristics
</SectionTitle>
      <Paragraph position="0"> We applied our method for temporal ordering to a corpus of medical case summaries. The medical domain has been a popular testbed for automatic temporal analysis (Combi and Shahar, 1997; Zhou et al., 2005). The appeal is partly due to the rich temporal structure of these documents and the practical need to parse this structure for meaningful processing of medical data.</Paragraph>
      <Paragraph position="1"> We compiled a corpus of medical case summaries from the online edition of The New England Journal of Medicine. The summaries are based on case records of the Massachusetts General Hospital. A typical summary describes an admission status, previous diseases related to the current conditions and their treatments, family history, and the current course of treatment. For privacy protection, names and dates are removed from the summaries before publication.</Paragraph>
      <Paragraph position="2"> The average length of a summary is 47 sentences. The summaries are written in the past tense, and a typical summary does not include instances of the past perfect. The summaries do not follow a chronological order. The ordering of information in this domain is guided by stylistic conventions (i.e., symptoms are presented before treatment) and the relevance of information to the current conditions (i.e., previous onset of the same disease is summarized before the description of other diseases).</Paragraph>
    </Section>
    <Section position="2" start_page="193" end_page="193" type="sub_section">
      <SectionTitle>
6.2 Annotating Temporal Segmentation
</SectionTitle>
      <Paragraph position="0"> Our approach for temporal segmentation requires annotated data for supervised training. We first conducted a pilot study to assess the human agreement on the task. We employed two annotators to manually segment a portion of our corpus. The annotators were provided with two-page instructions that defined the notion of a temporal segment and included examples of segmented texts. Each annotator segmented eight summaries which on average contained 49 sentences. Because annotators were instructed to consider segmentation boundaries at the level of a clause, there were 877 potential boundaries. The first annotator created 168 boundaries, while the second created 224.</Paragraph>
      <Paragraph position="1"> We computed a Kappa coefficient of 0.71 indicating a high inter-annotator agreement and thereby confirming our hypothesis about the reliability of temporal segmentation.</Paragraph>
      <Paragraph position="2"> Once we established high inter-annotator agreement on the pilot study, one annotator segmented the remaining 52 documents in the corpus (segmenting a case summary took approximately 20 minutes). Among 3,297 potential boundaries, 1,178 (35.7%) were identified by the annotator as segment boundaries. The average segment length is three sentences, and a typical document contains around 20 segments.</Paragraph>
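Agreement figures of this kind can be computed from two boundary annotations with the standard two-class Cohen's Kappa; the sketch below is generic, not code from the paper:

```python
def cohens_kappa(a, b):
    """Cohen's Kappa for two binary boundary annotations, given as
    equal-length lists of 0/1 over the same candidate boundaries."""
    assert len(a) == len(b) and len(a) > 0
    n = len(a)
    # observed agreement
    p_o = sum(1 for x, y in zip(a, b) if x == y) / n
    # chance agreement from each annotator's marginal boundary rate
    p_a1, p_b1 = sum(a) / n, sum(b) / n
    p_e = p_a1 * p_b1 + (1 - p_a1) * (1 - p_b1)
    return (p_o - p_e) / (1 - p_e)
```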
    </Section>
    <Section position="3" start_page="193" end_page="194" type="sub_section">
      <SectionTitle>
6.3 Annotating Temporal Ordering
</SectionTitle>
      <Paragraph position="0"> To assess the inter-annotator agreement, we asked two human annotators to construct TDAGs from five manually segmented summaries. These summaries consist of 97 segments, and their transitive closures contain a total of 1,331 edges. We computed the agreement between human judges by comparing the transitive closure of the TDAGs. The annotators achieved a surprisingly high agreement with a Kappa value of 0.98.</Paragraph>
      <Paragraph position="1"> After verifying human agreement on this task, one of the annotators constructed TDAGs for another 25 summaries. The transitive reduction of a graph contains on average 20.9 nodes and 20.5 edges. The corpus consists of 72% forward, 12% backward, and 16% null segment edges, inclusive of edges induced by transitive closure. At the clause level, the distribution is even more skewed: forward edges account for 74% of edges, equal for 18%, backward for 3%, and null for 5%.</Paragraph>
    </Section>
    <Section position="4" start_page="194" end_page="194" type="sub_section">
      <SectionTitle>
6.4 Evaluation Measures
</SectionTitle>
      <Paragraph position="0"> We evaluate temporal segmentation by considering the ratio of correctly predicted boundaries.</Paragraph>
      <Paragraph position="1"> We quantify the performance using F-measure, a commonly used binary classification metric. We opt not to use the Pk measure, a standard topical segmentation measure, because the temporal segments are short and we are only interested in the identification of the exact boundaries.</Paragraph>
      <Paragraph position="2"> Our second evaluation task is concerned with ordering manually annotated segments. In these experiments, we compare an automatically generated TDAG against the annotated reference graph. In essence, we compare edge assignment in the transitive closure of two TDAGs, where each edge can be classified into one of the three types: forward, backward, or null.</Paragraph>
      <Paragraph position="3"> Our final evaluation is performed at the clausal level. In this case, each edge can be classified into one of the four classes: forward, backward, equal, or null. Note that the clause-level analysis allows us to compare TDAGs based on the automatically derived segmentation.</Paragraph>
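Comparing edge assignments in two transitive closures can be sketched as a simple pairwise accuracy over segment pairs (the paper's exact scoring script is not specified; `ref` and `hyp` are sets of precedence pairs):

```python
def edge_accuracy(ref, hyp, nodes):
    """Fraction of segment pairs whose relation (forward, backward,
    or null) matches between a reference and a hypothesis TDAG
    closure. `ref`/`hyp` are sets of (a, b) pairs meaning 'a precedes
    b', over the segment ids in `nodes`."""
    def rel(g, i, j):
        if (i, j) in g:
            return "forward"
        if (j, i) in g:
            return "backward"
        return "null"

    pairs = [(i, j) for i in nodes for j in nodes if i < j]
    agree = sum(1 for i, j in pairs if rel(ref, i, j) == rel(hyp, i, j))
    return agree / len(pairs)
```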
    </Section>
  </Section>
</Paper>