<?xml version="1.0" standalone="yes"?> <Paper uid="W05-0404"> <Title>Using Semantic and Syntactic Graphs for Call Classification</Title> <Section position="3" start_page="24" end_page="26" type="intro"> <SectionTitle> 2 Semantic and Syntactic Graphs </SectionTitle> <Paragraph position="0"> Consider the typical case, where only lexical information, i.e., word n-grams, is used for call classification. This is equivalent to representing the words in an utterance as a directed acyclic graph where the words are the labels of the transitions, and then extracting the transition n-grams from it. Figure 1 shows the graph for the example sentence I paid six dollars, where ⟨bos⟩ and ⟨eos⟩ denote the beginning and end of the sentence, respectively.</Paragraph> <Paragraph position="1"> Syntactic and semantic graphs are also directed acyclic graphs, formed by adding transitions encoding syntactic and semantic categories of words or word sequences to the word graph. The first additional information is the part-of-speech tags of the words. In the graph, as a parallel transition for each word of the utterance, the part-of-speech category of the word is added, as shown in Figure 2 for the example sentence. Note that the word is prefixed by the token WORD: and the part-of-speech tag is prefixed by the token POS:, in order to distinguish between different types of transitions in the graph.</Paragraph> <Paragraph position="2"> The other type of information that is encoded in these graphs is the syntactic parse of each utterance, namely the syntactic phrases with their head words.</Paragraph> <Paragraph position="3"> For example, in the sentence I paid six dollars, six dollars is a noun phrase with the head word dollars.</Paragraph> <Paragraph position="4"> In Figure 2, the labels of the transitions for syntactic phrases are prefixed by the token PHRASE:. Therefore, six dollars is also represented by the transition labeled PHRASE:NP dollars. 
As an alternative, one may drop the head word of the phrase from the representation, or insert an epsilon transition parallel to the transitions of the modifiers of the head word to eliminate them from some n-grams.</Paragraph> <Paragraph position="5"> Generic named entity tags, such as person, location, and organization names, and task-dependent named entity tags, such as drug names in a medical domain, are also incorporated into the graph, where applicable. For instance, for the example sentence, six dollars is a monetary amount, so the arc NE:m is inserted parallel to that sequence.</Paragraph> <Paragraph position="6"> As another source of semantic information, semantic role labels of the utterance components are incorporated into the SSGs. The semantic role labels represent the predicate/argument structure of each sentence: given a predicate, the goal is to identify all of its arguments and their semantic roles. For example, in the example sentence the predicate is pay, the agent of this predicate is I, and the amount is six dollars. In the graph, the labels of the transitions for semantic roles are prefixed by the token SRL: and the corresponding predicate. For example, the sequence six dollars is the amount of the predicate pay, and this is shown by the transition with label SRL:pay.A1, following the PropBank notation (Kingsbury et al., 2002).</Paragraph> <Paragraph position="7"> In this work, we were only able to incorporate part-of-speech tags, syntactic parses, named entity tags, and semantic role labels in the syntactic and semantic graphs. Insertion of further information such as supertags (Bangalore and Joshi, 1999) or word stems can also be beneficial for further processing.</Paragraph> <Paragraph position="8"> 3 Using SSGs for Call Classification In this paper we propose extracting all n-grams from the SSGs and using them for call classification. The n-grams in an utterance SSG can be extracted by converting it to a finite state transducer (FST), F_S. 
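As an illustrative aside, the graph construction described above can be sketched in a few lines of Python. This is a hypothetical encoding (the word-boundary states, arc tuples, and exact label spellings such as PHRASE:NP_dollars are our own, not the authors' representation), and the simple path-walking function stands in for the FST-based extraction that the text describes next.

```python
# Hypothetical SSG for "I paid six dollars", encoded as labeled arcs
# between word-boundary states 0..4 (this tuple encoding is ours,
# not the paper's FST representation).
arcs = [
    (0, 1, "WORD:I"),       (0, 1, "POS:PRP"),
    (1, 2, "WORD:paid"),    (1, 2, "POS:VBD"),
    (2, 3, "WORD:six"),     (2, 3, "POS:CD"),
    (3, 4, "WORD:dollars"), (3, 4, "POS:NNS"),
    (2, 4, "PHRASE:NP_dollars"),  # noun phrase with head word dollars
    (2, 4, "NE:m"),               # monetary-amount named entity
    (2, 4, "SRL:pay.A1"),         # amount argument of the predicate pay
]

def all_ngrams(arcs, n):
    """Enumerate every label n-gram along the paths of the DAG."""
    out = set()
    def walk(state, prefix):
        if len(prefix) == n:
            out.add(prefix)
            return
        for src, dst, label in arcs:
            if src == state:
                walk(dst, prefix + (label,))
    for state in {s for a in arcs for s in a[:2]}:
        walk(state, ())
    return out
```

Mixed n-grams such as POS:CD WORD:dollars or WORD:paid NE:m fall out of the enumeration alongside the plain word n-grams, which is the generalization effect discussed in Section 3.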
Each transition of F_S has the labels of the arcs on the SSG as input and output symbols. Composing this FST with another FST, F_N, representing all the possible n-grams, forms the FST F_X, which includes all n-grams in the SSG: F_X = F_S ∘ F_N. Then, extracting the n-grams in the SSG is equivalent to enumerating all paths of F_X. For n = 3, F_N is shown in Figure 3. The alphabet Σ contains all the symbols in F_S.</Paragraph> <Paragraph position="9"> We expect the SSGs to help call classification for the following reasons: • First of all, the additional information is expected to provide some generalization, by allowing new n-grams to be encoded in the utterance graph, since SSGs provide syntactic and semantic groupings. For example, the words a and the both have the part-of-speech tag DT (determiner), and all the numbers are mapped to the cardinal number category (CD), like six in the example sentence. So the bigrams WORD:six WORD:dollars and POS:CD WORD:dollars will both be in the SSG. Similarly, the sentences I paid six dollars and I paid seventy five dollars and sixty five cents will both have the trigram WORD:I WORD:paid NE:m in their SSGs.</Paragraph> <Paragraph position="10"> • The head words of the syntactic phrases and the predicates of the arguments are included in the SSGs. This enables the classifier to handle long-distance dependencies better than other simpler methods, such as extracting all gappy n-grams. For example, consider the following two utterances: I need a copy of my bill and I need a copy of a past due bill. 
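To make the long-distance point concrete, here is a small hypothetical sketch (our own encoding and label spellings, not the authors' system): once the noun phrases headed by bill are collapsed into a single PHRASE: arc, both utterances contribute the same mixed trigram feature to the classifier.

```python
from collections import Counter

# Hypothetical trigram sets for the two utterances, with "my bill" and
# "a past due bill" each collapsed into one PHRASE:NP_bill arc
# (illustrative labels, not the paper's exact spellings).
u1 = {("WORD:copy", "WORD:of", "PHRASE:NP_bill")}           # my bill
u2 = {("WORD:copy", "WORD:of", "PHRASE:NP_bill"),           # a past due bill
      ("WORD:past", "WORD:due", "WORD:bill")}

def ngram_features(ngrams):
    """Join each n-gram into one sparse feature key with a count."""
    return Counter(" ".join(g) for g in ngrams)

f1, f2 = ngram_features(u1), ngram_features(u2)
# The shared feature survives even though the word sequences diverge
# after "of".
shared = set(f1).intersection(f2)
```

A plain word-n-gram representation would give these two utterances no common trigram after "of", whereas the phrase arc yields one shared feature.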
As shown in Figures 4 and 5, the n-gram WORD:copy WORD:of PHRASE:NP bill appears for both utterances, since both subsequences my bill and a past due bill are noun phrases with the head word bill.</Paragraph> <Paragraph position="11"> • Another motivation is that, when simply using the word n-grams in an utterance, the classifier is given only lexical information. Now the classifier is provided with more and different information through these extra syntactic and semantic features. For example, a named entity of type monetary amount may be strongly associated with some call-type.</Paragraph> <Paragraph position="12"> • Furthermore, there is a close relationship between the call-types and semantic roles. For example, if the predicate is order, this is most probably the call-type Order(Item) in a retail domain application. The simple n-gram approach would consider all the appearances of the unigram order as equal. However, consider the utterance I'd like to check an order, of a different call-type, where order is not a predicate but an object. Word n-gram features will fail to capture this distinction.</Paragraph> <Paragraph position="13"> Once the SSG of an utterance is formed, all the n-grams are extracted as features, and the decision of which ones to select and use is left to the classifier.</Paragraph> </Section> </Paper>