<?xml version="1.0" standalone="yes"?> <Paper uid="C02-1041"> <Title>Automatic Semantic Grouping in a Spoken Language User Interface Toolkit</Title> <Section position="4" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 SLUITK </SectionTitle> <Paragraph position="0"> As mentioned in the previous section, the Spoken Language User Interface Toolkit (SLUITK) allows programmers with no linguistic knowledge to rapidly develop a spoken language user interface for their applications. The toolkit should incorporate the major components of an NLP front end, such as a spell checker, a parser and a semantic representation generator. Using the toolkit, a programmer will be able to create a system that incorporates complex NLP techniques such as syntactic parsing and semantic understanding.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.1 The Work Flow </SectionTitle> <Paragraph position="0"> Using an Automatic Speech Recognition (ASR) system, the SLUITK connects user input to the application, allowing spoken language control of the application. The SLUITK generates a semantic representation of each input sentence. We refer to each of these semantic representations as a frame, which is essentially a predicate-argument representation of a sentence.</Paragraph> <Paragraph position="1"> The SLUITK operates in the following steps: 1. The SLUITK begins creating a SLUI by generating semantic representations of sample input sentences provided by the programmer.</Paragraph> <Paragraph position="2"> 2. These representations are expanded using synonym sets and other linguistic devices, and stored in a Semantic Frame Table (SFT). The SFT becomes a comprehensive database of all the possible commands a user could ask the system to perform. It has the same function as the database of parallel translations in an example-based machine translation system (Sumita and Iida, 1991).</Paragraph> <Paragraph position="3"> 3. 
The toolkit then creates methods for attaching the SLUI to the back-end applications.</Paragraph> <Paragraph position="4"> 4. When the SLUI-enabled system is released, a user may enter an NL sentence, which is translated into a semantic frame by the system. The SFT is then searched for an equivalent frame. If a match is found, the action or command linked to this frame is executed.</Paragraph> <Paragraph position="5"> In a real application, a large number of frames might be generated from a domain corpus. The semantic grouper takes the set of frames as input and outputs the same frames organized in a logical manner.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 2.2 The Corpus </SectionTitle> <Paragraph position="0"> We use a corpus of email messages from our customers for developing and testing the system. These email messages contain questions, comments and general inquiries regarding our document-conversion products.</Paragraph> <Paragraph position="1"> We modified the raw email programmatically to delete the attachments, HTML and other tags, headers and sender information. In addition, we manually deleted salutations, greetings and any information that was not directly related to customer support. The corpus contains around 34,640 lines and 170,000 words. We constantly update it with new email from our customers.</Paragraph> <Paragraph position="2"> We randomly selected 150 sentential inquiries to motivate and test the semantic grouping methods discussed in this paper.</Paragraph> </Section> </Section> <Section position="5" start_page="0" end_page="0" type="metho"> <SectionTitle> 3 Semantic Grouping </SectionTitle> <Paragraph position="0"> We mentioned in Section 1 that grouping semantic frames is domain-dependent.</Paragraph> <Paragraph position="1"> Grouping depends on the nature of the application and on the needs of the domain programmer. 
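The run-time lookup in step 4 of the workflow in Section 2.1 can be sketched as a table lookup from frames to actions. The frame encoding, the table contents and the action labels below are illustrative, not the toolkit's actual data model:

```python
# Minimal sketch of the Semantic Frame Table (SFT) lookup of step 4.
# make_frame, the table contents and the action labels are hypothetical.

def make_frame(predicate, **args):
    """A frame as a hashable predicate-argument tuple."""
    return (predicate, tuple(sorted(args.items())))

# The SFT maps every expanded frame to the back-end action linked to it.
sft = {
    make_frame("buy", object="software"): "launch-purchase-dialog",
    make_frame("download", object="software"): "start-download",
}

def handle_utterance(frame):
    """Search the SFT for an equivalent frame; return the linked action,
    or None when no match is found."""
    return sft.get(frame)
```

In the released system the returned action would be executed against the back-end application; here it is just a label.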
Since this is a real-world problem, we have to consider the efficiency of grouping. It is not acceptable to let the programmer wait for hours to group one set of semantic forms. The grouping should be fairly fast, even on thousands of frames.</Paragraph> <Paragraph position="2"> These different considerations motivate several grouping methods: similarity-based grouping, verb-based grouping and category-based grouping. In this section, we describe each of these methods in detail.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.1 Similarity-based Grouping </SectionTitle> <Paragraph position="0"> Similarity-based grouping gathers sentences with similar meanings together, e.g., sentences (1.1) to (1.4). This method has a wide range of applications. For example, in open-domain question-answering systems, questions need to be reformulated so that they will match previously posted questions, allowing cached answers to be reused to speed up the process (Harabagiu et al., 2000).</Paragraph> <Paragraph position="1"> The question reformulation algorithm of Harabagiu et al. tries to capture the similarity of the meanings expressed by two sentences.</Paragraph> <Paragraph position="2"> For a given set of questions, the algorithm formulates a similarity matrix from which reformulation classes can be built. Each class represents a set of equivalent questions.</Paragraph> <Paragraph position="3"> The algorithm measures the similarity between two questions by finding lexical relationships between the words of each pair of questions, excluding stop words. 
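The construction of reformulation classes from pairwise similarities can be sketched as follows. The similarity function here is a deliberately simple word-overlap stand-in; the actual algorithm of Harabagiu et al. uses lexical relations drawn from WordNet:

```python
# Sketch of building reformulation classes from pairwise similarities.
# similarity() is a word-overlap stand-in for the WordNet-based measure.

def similarity(s1, s2):
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    return len(w1 & w2) / max(len(w1 | w2), 1)

def reformulation_classes(sentences, threshold=0.5):
    """Group sentences whose pairwise similarity exceeds the threshold,
    merging groups transitively."""
    groups = []  # list of lists of sentence indices
    for i, s in enumerate(sentences):
        # Find every existing group containing a sentence similar to s.
        merged = [g for g in groups
                  if any(similarity(s, sentences[j]) >= threshold for j in g)]
        for g in merged:
            groups.remove(g)
        # Fuse the similar groups and add the new sentence.
        groups.append([j for g in merged for j in g] + [i])
    return [[sentences[j] for j in g] for g in groups]
```

Raising or lowering `threshold` controls how close two questions must be before they fall into the same class.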
The algorithm makes use of the WordNet concept hierarchy (Fellbaum, 1998) to find synonym and hypernym relations between words.</Paragraph> <Paragraph position="4"> This algorithm does not infer information about the meanings of the questions, but rather uses a similarity measure to approximate their commonality in meaning.</Paragraph> <Paragraph position="5"> This is a simplified approach: by varying the threshold, it can capture different degrees of similarity, from almost identical to very different.</Paragraph> <Paragraph position="6"> This method can be used for similarity-based grouping to capture the similarity in meanings expressed by different sentences.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.2 Verb-based Grouping </SectionTitle> <Paragraph position="0"> Among the sentences normally used in the e-business domain, imperative sentences often appear in sub-domains dominated by command-and-control requests. In such an application, the verb expresses the command that the user wants to execute and therefore plays the most important role in the sentence.</Paragraph> <Paragraph position="1"> Based on this observation, a grouping can be based on the verb or verb class only. 
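Grouping by verb class can be sketched as binning frames on the class of the main verb. The verb classes below are illustrative; in the toolkit they are derived from verb synonym information:

```python
# Minimal sketch of verb-based grouping: frames are binned by the
# synonym class of their main verb.  VERB_CLASSES is an illustrative
# stand-in for WordNet-derived verb synonym sets.

VERB_CLASSES = {
    "buy": "purchase", "purchase": "purchase", "order": "purchase",
    "download": "download", "get": "download",
}

def verb_based_grouping(frames):
    """Group (verb, arguments) frames by the class of the main verb."""
    groups = {}
    for frame in frames:
        verb = frame[0]
        key = VERB_CLASSES.get(verb, verb)  # unknown verbs form their own group
        groups.setdefault(key, []).append(frame)
    return groups
```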
</Paragraph> <Paragraph position="2"> For example, sentences with buy or purchase as the main verb are classified into one group, whereas those with download as the main verb are classified into a different group, even when the arguments of the verbs are the same.</Paragraph> <Paragraph position="3"> This is similar to sorting frames by the verb, taking simple verb-synonym information into account.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 3.3 Category-based Grouping </SectionTitle> <Paragraph position="0"> Since the SLUITK is a generic toolkit whereas the motivation for grouping is application-dependent, we need to know how the programmer wants the groups to be organized.</Paragraph> <Paragraph position="1"> We randomly selected 100 sentences from our corpus and asked two software engineers to group them in a logical order. They came up with very different groups, but the reasoning behind their groups was more or less the same.</Paragraph> <Paragraph position="2"> This motivates the category-based grouping.</Paragraph> <Paragraph position="3"> This grouping method puts less emphasis on each individual sentence, but tries to capture the general characteristics of a given corpus. For example, we may want to group by the commands (e.g., buy) or the objects (e.g., a software product) the corpus is concerned with. If a keyword of a category appears in a given sentence, we infer that the sentence belongs to that category. 
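This keyword test can be sketched as follows. The categories and keyword sets are hypothetical; the toolkit derives them from corpus frequency counts, as described in Section 4.2:

```python
# Sketch of category-based grouping: a sentence joins the group of the
# first category whose keyword it contains; sentences matching no
# category land in an "unclassified" group.  Categories are illustrative.

def category_based_grouping(sentences, categories):
    """categories: dict mapping category name -> set of keywords."""
    groups = {name: [] for name in categories}
    groups["unclassified"] = []
    for sentence in sentences:
        words = set(sentence.lower().split())
        for name, keywords in categories.items():
            if words & keywords:
                groups[name].append(sentence)
                break
        else:  # no category keyword found in the sentence
            groups["unclassified"].append(sentence)
    return groups
```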
For example, sentences (1.6) to (1.8) will be grouped together because they all contain the keyword price.</Paragraph> <Paragraph position="4"> These sentences will not be grouped together by the similarity-based method because their similarity is not high enough, nor by the verb-based method because the verbs are all different.</Paragraph> </Section> </Section> <Section position="6" start_page="0" end_page="0" type="metho"> <SectionTitle> 4 Grouping in SLUITK </SectionTitle> <Paragraph position="0"> Because we cannot foresee the domain needs of the programmer, we implemented all three methods in the SLUITK so that programmers can view their data in several different ways.</Paragraph> <Paragraph position="1"> The programmer is able to choose which type of grouping scheme to apply.</Paragraph> <Paragraph position="2"> In the question reformulation algorithm of Harabagiu et al. (2000), all words are treated identically in the question similarity measurement. However, our intuition from observing the corpus is that the verb and the object are more important than other components of the sentence and therefore should be given more weight when measuring similarity. In Section 4.1, we describe our experiment with the grouping parameters to test this intuition.</Paragraph> <Section position="1" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.1 Experimenting with Parameters </SectionTitle> <Paragraph position="0"> We think that there are two main parameters affecting the grouping result: the weights of the syntactic components and the threshold for the similarity measurement in the similarity-based method. Using 100 sentences from our corpus, we tried four different weighting schemes and three thresholds with the similarity-based method. Human judgment on the generated groups confirmed our intuition that the object plays the most important role in grouping and the verb is the second most important. 
The differences in threshold did not seem to have a significant effect on the similarity-based grouping. This is probably due to the strict similarity measurement.</Paragraph> <Paragraph position="1"> This experiment gives us a relatively optimal weighting scheme and threshold for the similarity-based grouping.</Paragraph> <Paragraph position="2"> One relevant issue concerns the simplification of the semantic frames. For a sentence with multiple verbs, we can simplify the frame based on the verbs used in the sentence. The idea is that some verbs, such as action verbs, are more interesting in the e-business domain than others, e.g., be and have. If we can identify such differences in verb usage, we can simplify the semantic frames by keeping only the interesting verb frames. For example, in the following sentences, the verb buy is more interesting than be and want, and the generated semantic frames should contain only the frame for buy.</Paragraph> <Paragraph position="3"> (4.1) Is it possible to buy this software online? (4.2) I want to buy this software online.</Paragraph> <Paragraph position="4"> We make use of a list of stop-words from (Frakes, 1992) in order to distinguish between interesting and uninteresting verbs. We look for frames headed by stop-words and follow heuristics to remove the sub-frames of the stop-words. For example, if there is at least one verb that is not a stop-word, we remove all stop-word sub-frames from the frame. In the sentence [Is it possible to] buy the software in Germany?, be is a stop-word, so only the frame for buy is kept. 
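The simplification heuristic above can be sketched as follows. The frame layout and the abridged stop-verb list are illustrative:

```python
# Sketch of the frame-simplification heuristic: when a sentence yields
# frames for several verbs and at least one verb is not a stop-word,
# the frames headed by stop-word verbs are dropped.

STOP_VERBS = {"be", "have", "want", "do"}  # abridged, illustrative list

def simplify(frames):
    """frames: list of (verb, arguments) tuples for one sentence."""
    content = [f for f in frames if f[0] not in STOP_VERBS]
    # Keep only the interesting verb frames when any exist; otherwise
    # leave the frame list untouched.
    return content if content else frames
```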
This process removes the redundant part of a frame so that the grouping algorithm only considers the most important part of a frame.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="sub_section"> <SectionTitle> 4.2 Implementation in SLUITK </SectionTitle> <Paragraph position="0"> Figure 1 shows a screenshot of the SLUITK interface, displaying several grouped semantic frames. In this section, we give more detail about the implementation of the three grouping methods used in the SLUITK.</Paragraph> <Paragraph position="1"> Similarity-based grouping Similar to Harabagiu et al. (2000), our similarity-based grouping algorithm calculates the similarity between every pair of frames in the input collection. If the similarity is above a certain threshold, the two frames are considered similar and therefore should be grouped together. If two frames in two different groups are similar, then the two groups should be combined into a single group.</Paragraph> <Paragraph position="2"> The central issue here is how to measure the similarity between two frames.</Paragraph> <Paragraph position="3"> Since we have found that some syntactic components are more important to grouping than others, we use a weighted scheme to measure similarity. For each frame, all words (except for stop-words) are extracted and used for the similarity calculation. We give different weights to different sentence components.</Paragraph> <Paragraph position="4"> In an e-business domain, the verb and the object of a sentence are usually more important than other components because they express the actions that the programmers want to execute, or the objects for which they want to get more information; the similarity of these components is therefore emphasized through the weighting scheme. 
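One way to realize such a weighting scheme is sketched below. The numeric weights and the exact-match test are illustrative stand-ins; the toolkit also counts WordNet-related words as matches:

```python
# Sketch of weighted frame similarity: matched words contribute the
# weight of their syntactic role, with verb and object weighted highest.
# The weights are hypothetical values, not the toolkit's tuned ones.

WEIGHTS = {"verb": 3.0, "object": 2.0, "subject": 1.0, "modifier": 1.0}

def frame_similarity(frame_a, frame_b):
    """frames: dicts mapping syntactic role -> word.  Two words 'match'
    here when identical; lexically related words could also match."""
    score = 0.0
    for role, word in frame_a.items():
        if frame_b.get(role) == word:
            score += WEIGHTS.get(role, 1.0)
    return score
```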
The similarity score of two frames is the summation of the weights of the matched words.</Paragraph> <Paragraph position="5"> There is a match between two words when we find a lexical relationship between them.</Paragraph> <Paragraph position="6"> We extend the method of Harabagiu et al. (2000) and define a lexical relationship between two words W1 and W2 as follows:</Paragraph> <Paragraph position="7"> 1. If W1 and W2 have a common morphological root. Various stemming packages can be used for this purpose, for example, the Porter stemmer (Porter, 1997).</Paragraph> <Paragraph position="8"> 2. If W1 and W2 are synonyms, i.e., W2 is in the WordNet synset of W1.</Paragraph> <Paragraph position="9"> 3. If the more abstract word is a WordNet hypernym of the other.</Paragraph> <Paragraph position="10"> 4. If one word is the WordNet holonym of the other (signaling part-of, member-of and substance-of relations). 5. If W1 is the WordNet antonym of W2.</Paragraph> <Paragraph position="11"> Domain-specific heuristics can also be used to connect words. For example, in the e-business domain, you and I can be treated as antonyms in the following sentences: (4.3) Can I buy this software? (4.4) Do you sell this software? When none of the above holds, there is no lexical relation between the two given words.</Paragraph> <Paragraph position="12"> Because the similarity-based grouping needs to consult WordNet frequently for lexical relations, it becomes very slow for even a few hundred frames. We therefore modified the algorithm to make it fast enough for real-world applications.</Paragraph> <Paragraph position="13"> Instead of comparing every pair of frames, we pool all the words from an existing group together. When a new frame is introduced, we compare the words in this new frame with the word collection of each group. The similarity scores are added up as before, but the total is normalized over the number of words in the collection. 
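The five-way lexical-relationship test defined above can be illustrated with a toy lexicon. The tiny hand-built relation sets stand in for WordNet (Fellbaum, 1998), and `crude_stem` stands in for a real stemmer such as the Porter stemmer:

```python
# Toy illustration of the five lexical-relationship tests.  The relation
# sets below are illustrative stand-ins for WordNet lookups.

SYNONYMS  = {("buy", "purchase")}
HYPERNYMS = {("dog", "animal")}    # animal is a hypernym of dog
HOLONYMS  = {("wheel", "car")}     # a wheel is part of a car
ANTONYMS  = {("you", "i")}         # includes the domain-specific heuristic

def _related(pairs, a, b):
    return (a, b) in pairs or (b, a) in pairs

def crude_stem(word):
    # Crude stand-in for stemming: strip a final plural/3sg "s".
    return word[:-1] if word.endswith("s") else word

def lexically_related(w1, w2):
    """True when any of relations 1-5 holds between w1 and w2."""
    if crude_stem(w1) == crude_stem(w2):       # 1. common morphological root
        return True
    return (_related(SYNONYMS, w1, w2)         # 2. synonyms
            or _related(HYPERNYMS, w1, w2)     # 3. hypernym
            or _related(HOLONYMS, w1, w2)      # 4. holonym
            or _related(ANTONYMS, w1, w2))     # 5. antonym
```

A production version would replace the toy sets with WordNet queries, which is exactly the per-word cost that makes the grouping slow and motivates the pooled-comparison optimization described here.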
When the similarity is above a certain threshold, the new frame is classified as a member of the group. This significantly reduces the number of comparisons needed for classifying a frame, and therefore reduces the number of times WordNet needs to be consulted.</Paragraph> <Paragraph position="14"> We compared this improved algorithm with the original one on 30 handcrafted examples; the generated groups are very similar.</Paragraph> <Paragraph position="15"> Verb-based grouping The verb-based grouping implementation is fairly straightforward and has been described in Section 3.2.</Paragraph> <Paragraph position="16"> Category-based grouping For the category-based method, we first count all the non-stop-words in a given corpus and retrieve a set of the most frequent words and their corresponding word classes from the corpus.</Paragraph> <Paragraph position="17"> This process also makes use of the WordNet synonym, hypernym, holonym and antonym information. These word classes form the categories of the groups. We then check the verbs and objects of each sentence to see if they match these words. That is, if a category word or a lexically related word appears as the verb or the object of a sentence, the sentence is classified as a member of that group. For example, we can pick the 20 most frequent words and divide the corpus into 21 groups, where the extra group contains all sentences that cannot be classified. The programmer can decide the number of groups they want. This gives the programmer more control over the grouping result.</Paragraph> </Section> </Section> <Section position="7" start_page="0" end_page="0" type="metho"> <SectionTitle> 5 Discussion </SectionTitle> <Paragraph position="0"> We tested the three methods on 100 sentences from our corpus. We had five people evaluate the generated groups. They all thought that grouping was a very useful feature of the toolkit. 
Based on their comments, we summarize the pros and cons of each method in Table 1.
Table 1: Pros and cons of the three grouping methods.
                  Similarity-based      Verb-based                Category-based
Group Size        small                 small                     large
Number of Groups  large                 large                     variable
Speed             slow on large corpus  fast                      slow on large corpus
Application       general               command-and-control only  general</Paragraph> <Paragraph position="1"> The similarity-based grouping produces a large number of groups, most of which contain only one sentence. This is because there are usually several unrelated words in each sentence, which decreases the similarity scores. In addition, using WordNet we sometimes miss the connections between lexical items. The verb-based grouping produces slightly larger groups, but also produces many single-sentence groups.</Paragraph> <Paragraph position="2"> Another problem is that when sentences contain only stop-word verbs, e.g., be, the group will look rather arbitrary. For example, a group of sentences with be as the main verb can express completely different meanings. The small group size is a disadvantage of both methods. The number of groups produced by the category-based grouping can change according to the user's specification. In general it produces fewer groups than the other methods and the group size is much larger, but the size becomes smaller for less frequent category words.</Paragraph> <Paragraph position="3"> Both the similarity-based and category-based grouping methods are slow because they frequently need to use WordNet to identify lexical relationships. The verb-based method is much faster, which is its primary advantage.</Paragraph> <Paragraph position="4"> The verb-based method should be used in a command-and-control domain because it requires at least one non-stop-word verb in the sentence. However, it will have a hard time in a domain that needs to handle questions. From the point of view of assigning a domain-specific action to a group, this grouping is the best because each verb can be mapped to an action. 
Therefore, the programmer can link an action to each group rather than to each individual frame. When the group size is relatively large, this can greatly reduce the workload of the programmer.</Paragraph> <Paragraph position="5"> The category-based method produces a better view of the data because the sentences in each group seem to be consistent with the keywords of the category. The disadvantage is that it is difficult to link a group to a single action, and the programmer might have to reorganize the groups during action assignment. The similarity-based method did not perform well on the testing corpus, but it might work better on a corpus containing several different expressions of the same semantic information.</Paragraph> <Paragraph position="6"> In summary, each method has its advantages and disadvantages. The decision of which one to choose depends mainly on the needs of the domain programmer and the composition of the input corpus.</Paragraph> </Section> </Paper>