<?xml version="1.0" standalone="yes"?>
<Paper uid="W06-3408">
  <Title>ChAT: A Time-Linked System for Conversational Analysis</Title>
  <Section position="3" start_page="50" end_page="52" type="metho">
    <SectionTitle>
2 ChAT Architecture
</SectionTitle>
    <Paragraph position="0"> ChAT is a text processing tool designed to aid in the analysis of any kind of threaded dialogue, including meeting transcripts, telephone transcripts, usenet groups, chat room, email or blogs. Figure 1 illustrates the data processing flow in ChAT.</Paragraph>
    <Paragraph position="1">  Data is ingested via an ingest engine, then the central processing engine normalizes the format (time stamp, speaker ID, utterance; one utterance per line). Processing components are called by the central processing engine which provides the input to each component, and collects the output to send to the UI.</Paragraph>
    <Paragraph position="2"> We designed the system to be general enough to handle multiple data types. Thus, with the exception of the ingest engine, the processing components are domain and source independent. For example, we did not want the topic segmentation to rely on features specific to a dataset, such as acoustic information from transcripts. Additionally, all processing components have been built as independent plug-ins to the processing engine: The input of one does not rely on the output of the others. This allows for a great deal of flexibility in that a user can choose to include or exclude various processes to suit their needs, or even exchange the components with new tools. We describe each of the processing components in the next section.</Paragraph>
    <Section position="1" start_page="50" end_page="51" type="sub_section">
      <SectionTitle>
2.1 Ingest Engine
</SectionTitle>
      <Paragraph position="0"> The ingest engine is designed to input multiple data sources and transform them into a uniform structure which includes one utterance per line, including time stamp and participant information.</Paragraph>
      <Paragraph position="1"> So far, we have ingested three data sources. The ICSI meeting corpus (Janin et al., 2003) is a corpus of text transcripts of research meetings. There are  75 meetings in the corpus, lasting 30 minutes to 1.5 hours in duration, with 5-8 participants in each meeting. A subset of these meetings were hand coded for topic segments (Galley, et al., 2003). We also used telephone transcripts from the August 14, 2003 power grid failure that resulted in a regional blackout  . These data consist of files containing transcripts of multiple telephone conversations between multiple parties. Lastly, we employed a chat room dataset that was built in-house by summer interns who were instructed to play a murder mystery game over chat. Participants took on a character persona as their login and content was based on a predefined scenario, but all interactions were unscripted beyond that.</Paragraph>
      <Paragraph position="2">  0.032121 for two-sample equal variance t-test.</Paragraph>
    </Section>
    <Section position="2" start_page="51" end_page="52" type="sub_section">
      <SectionTitle>
2.2 Topic Segmentation
</SectionTitle>
      <Paragraph position="0"> The output of the ingest process is a list of utterance that include a time (or sequence) stamp, a participant name, and an utterance. Topic segmentation is then performed on the utterances to chunk them into topically cohesive units. Traditionally, algorithms for segmentation have relied on textual cues (Hearst, 1997; Miller et al. 1998; Beeferman et al, 1999; Choi, 2000). These techniques have proved useful in segmenting single authored documents that are rich in content and where there is a great deal of topic continuity. Topic segmentation of conversational data is much more difficult due to often sparse content, intertwined topics, and lack of topic continuity.</Paragraph>
      <Paragraph position="1"> Topic segmentation algorithms generally rely on a lexical cohesion signal that requires smoothing in order to eliminate noise from changes of word choices in adjoining statements that do not indicate topic shifts (Hearst, 1997; Barzilay and Elhadad, 1997). Many state of the art techniques use a sliding window for smoothing (Hearst, 1997; Miller et al. 1998; Galley et al., 2003). We employ a windowless method (WLM) for calculating a suitable cohesion signal which does not rely on a sliding window to achieve the requisite smoothing for an effective segmentation. Instead, WLM employs a constrained minimal-spanning tree (MST) algorithm to find and join pairs of elements in a sequence. In most applications, the nearest-neighbor search used by an MST involves an exhaustive, O(N2), search throughout all pairs of elements.</Paragraph>
      <Paragraph position="2"> However since WLM only requires information on the distance between adjoining elements in the sequence the search space for finding the two closest adjoining elements is linear, O(N), where N is the number of elements in the sequence. We can therefore take advantage of the hierarchical summary structure that an MST algorithm affords while not incurring the performance penalty.</Paragraph>
      <Paragraph position="3"> Of particular interest for our research was the success of WLM on threaded dialogue. We evaluated WLM's performance on the ICSI meeting corpus (Janin et al, 2003) by comparing our segmentation results to the results obtained by implementing LCSeg (Galley et al., 2003). Using the 25 hand segmented meetings, our algorithm achieved a significantly better segmentation for 20 out of 25 documents. Figure 2 shows the hypothesized segments from the two algorithms on the ICSI Meeting Corpus.</Paragraph>
      <Paragraph position="4"> Topic segmentation of conversational data can be aided by employing features of the discourse or speech environment, such as acoustic cues, etc.</Paragraph>
      <Paragraph position="5"> (Stolcke et al., 1999; Galley et al., 2003). In this work, we have avoided using data dependent (the  integration of acoustic cues for speech transcripts, for example) features to aid in segmentation because we wanted our system to be as versatile as possible. This approach provides the best segmentation possible for a variety of data sources, regardless of data type.</Paragraph>
    </Section>
    <Section position="3" start_page="52" end_page="52" type="sub_section">
      <SectionTitle>
2.3 Named Entity Extraction
</SectionTitle>
      <Paragraph position="0"> In addition to topics, ChAT also has integrated software to extract the named entities. We use Cicero Lite from the Language Computer Corporation (LCC) for our entity detection (for a product description and evaluation, see Harabagiu et al., 2003). Using a combination of semantic representations and statistical approaches, Cicero Lite isolates approximately 80 entity types. ChAT currently makes use of only a handful of these categories, but can easily be modified to include more. Because named entity extraction relies on cross-utterance dependencies, the main processing engine sends all utterance from a conversation at once rather than an utterance at a time.</Paragraph>
    </Section>
    <Section position="4" start_page="52" end_page="52" type="sub_section">
      <SectionTitle>
2.4 Sentiment Analysis
</SectionTitle>
      <Paragraph position="0"> In addition to topic and entity extraction, conversations can also be analyzed by who participated in them and their relationship to one another and their attitude toward topics they discuss. In an initial attempt to capture participant attitude, we have included a sentiment analysis, or affect, component. Sentiment analysis is conducted via a lexical approach. The lexicon we employed is the General Inquirer (GI) lexicon developed for content analyses of textual data (Stone, 1977). It includes an extensive lexicon of over 11,000 hand coded word stems, and 182 categories, but our implementation is limited to positive (POS) and negative (NEG) axes. In ChAT, every utterance is scored for the number of positive and negative words it contains.</Paragraph>
      <Paragraph position="1"> We make use of this data by keeping track of the affect of topics in general, as well as the general mood of the participants.</Paragraph>
    </Section>
    <Section position="5" start_page="52" end_page="52" type="sub_section">
      <SectionTitle>
2.5 Participant Roles
</SectionTitle>
      <Paragraph position="0"> Analyzing conversations consists of more than analyzing the topics within them. Inherent to the nature of conversational data are the participants.</Paragraph>
      <Paragraph position="1"> Using textual cues, one can gain insight into the relationships of participants to each other and the topics. In ChAT we have integrated several simple metrics as indicators of social dynamics amongst the participants. Using simple speaker statistics, such as number of utterances, number of words, etc., we can gain insight to the level of engagement of participants in the conversation. Features we use include:  (ones not preceded by a question mark) Additionally, we use the same lexical resources as we use for sentiment analysis for indications of personality type. We make use of the lexical categories of strong, weak, power cooperative, and power conflict as indicators of participant roles in the conversational setting. Thus far, we have not conducted any formal evaluation on the sentiment analysis with this data, but our initial studies of our pos and neg categories show a 73% agreement with hand tagged positive and negative segments on a different data set.</Paragraph>
    </Section>
  </Section>
  <Section position="4" start_page="52" end_page="55" type="metho">
    <SectionTitle>
3 User Interface
</SectionTitle>
    <Paragraph position="0"> As described in Section 2 on ChAT architecture, the processing components are independent of the UI, but we do have a built-in UI that incorporates the processing components that is designed to aid in analyzing conversations.</Paragraph>
    <Paragraph position="1">  the time axis are based on either the time stamp or sequential position of the utterance. The default time range is the whole conversation or chat room session, but a narrower range can be selected by dragging in the interval panel at the top of the UI. Note that all of the values for each of the components are recalculated based on the selected time interval. Figure 4 shows that a time selection results in a finer grained subset of the data, allowing one to drill down to specific topics of inter- null The number of utterance for a given time frame is indicated by the number inside the box corresponding to the time frame. The number is recalculated as different time frames are selected.  The central organizing unit in the UI is topics. The topic panel, detailed in Figure 5, consists of three parts: the color key, affect scores, and topic labels. Once a data file is imported into the UI, topic segmentation is performed on the dataset according to the processes outline in Section 3.2. Topic labels are assigned to each topic chunk. Currently, we use the most prevalent word tokens as the label, and the user can control the number of words per label. Each topic segment is assigned a color, which is indicated by the color key. The persistence of a color throughout the time axis indicates which topic is being discussed at any given time. This allows a user to quickly see the distribution of topics of a meeting, for example. It also allows a user to quickly see the participants who discussed a given topic.</Paragraph>
    <Paragraph position="2">  Affect scores are computed for each topic by counting the number of POS or NEG affect words in each utterance that comprises a topic within the selected time interval. Affect is measured by the proportion of POS to NEG words in the selected time frame. If the proportion is greater than 0, the score is POS (represented by a +), if it is less than 0, it is NEG (-). The degree of sentiment is indicated by varying shades of color on the + or symbol. null Note that affect is computed for both topics and participants. An affect score on the topic panel indicates overall affect contained in the utterances present in a given time frame, whereas the affect score in the participant panel indicates overall affect in a given participant's utterances for that time frame.</Paragraph>
    <Section position="1" start_page="54" end_page="55" type="sub_section">
      <SectionTitle>
3.1.3 Participants
</SectionTitle>
      <Paragraph position="0"> The participant panel (Figure 6) consists of three parts: speaker labels, speaker contribution bar, and affect score. The speaker label is displayed in alphabetical order and is grayed out if there are no utterances containing the topic in the selected time frame. The speaker contribution bar, displayed as a horizontal histogram, shows the speaker's proportion of utterances during the time frame. Non question utterances are displayed in red while utterances containing questions are displayed in green as seen. For example, in Figure 6, we can see that speaker me011 did most of the talking (and was generally negative), but speaker me018 had a  In the current implementation, the named entity panel consists of only list of entity labels present in a given time frame. We do not list each named entity because of space constraints, but plan to integrate a scroll bar so that we can display individual entities as opposed to the category labels.</Paragraph>
    </Section>
    <Section position="2" start_page="55" end_page="55" type="sub_section">
      <SectionTitle>
3.2 Semantic Graph
</SectionTitle>
      <Paragraph position="0"> Data that is viewed in the main UI can be sent to a semantic graph for further analysis. The graph allows a user to choose to highlight the relationships associated with a topic, participant, or individual named entity. The user selects objects of interest from a list (see Figure 7), then the graph function organizes a graph according to the chosen object, see Figure 8, that extracts the information from the time-linked view and represent it in a more abstract view that denotes relationships via links and nodes.</Paragraph>
      <Paragraph position="1">  The semantic graph can help highlight relationships that might be hard to view in the main UI. For example, Figure 8 represents a subset of the Blackout data in which three participants, indicated by blue, all talked about the same named entity, indicated by green, but never talked to each other, indicated by the red conversation nodes.</Paragraph>
    </Section>
  </Section>
class="xml-element"></Paper>
Download Original XML