<?xml version="1.0" standalone="yes"?>
<Paper uid="C02-1041">
  <Title>Automatic Semantic Grouping in a Spoken Language User Interface Toolkit</Title>
  <Section position="3" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction and Motivation
</SectionTitle>
    <Paragraph position="0"> With the improvement of natural language processing (NLP) and speech recognition techniques, spoken language will become the input of choice for software user interfaces, as it is the most natural way of communication. In the meantime, the domains for NLP systems, especially those handling speech input, have grown rapidly in recent years. However, most computer programmers do not have enough linguistic knowledge to develop an NLP system to handle speech input. There is a genuine demand for a general toolkit from which programmers with no linguistic knowledge can rapidly build speech-based NLP systems to handle their domain-specific problems more accurately (Alam, 2000). The toolkit will allow programmers to generate Spoken Language User Interface (SLUI) front ends for new and existing applications using, for example, a program-through-example method. In this methodology, the programmer will specify a set of sample input sentences or a domain corpus for each task. The toolkit will then organize the sentences by meaning and even generate a large set of syntactic variations for a given sentence. It will also generate the code that takes a user's spoken request and executes a command on an application. This methodology is similar to using a GUI toolkit to develop a graphical user interface, so that programmers can develop GUIs without learning graphics programming. Currently this is an active research area, and the present work is funded by the Advanced Technology Program (ATP) of the National Institute of Standards and Technology (NIST).</Paragraph>
    <Paragraph position="1"> In the program-through-example approach, the toolkit should provide an interface for the programmers to input domain-specific corpora and then process the sentences into semantic representations that capture their meanings. In a real-world application, this process results in a large number of semantic forms. Since the programmers have to manually build the links between these forms and their specific domain actions, they are likely to be overwhelmed by the workload imposed by the large number of individual semantic forms. To significantly reduce this workload, we can organize these forms in such a way that the programmers can manipulate them as groups rather than as individual items. This will speed up the generation of the domain-specific SLUI system. We call this process the semantic grouping process.</Paragraph>
    <Paragraph position="2"> One straightforward way to group is to organize different syntactic forms expressing the same meaning together. For example,  (1.1) I want to buy this book online.</Paragraph>
    <Paragraph position="3"> (1.2) Can I order this book online? (1.3) How can I purchase this book online? (1.4) What do I need to do to buy this book online? The semantic forms of the above sentences may not be the same, but the action the programmer has in mind in an e-business domain is more or less the same: to actually buy the book online. In addition to the above sentences, there are many variations that an end-user might use. The embedded NLP system should be able to recognize the similarity among the variations so that the SLUI system can execute the same command upon receiving the different queries. This requires a group to contain only sentences with the same meaning. However, in real applications this might be difficult to achieve because user requests often have slight differences in meaning.</Paragraph>
    <Paragraph position="4"> This difficulty motivates a different style of semantic grouping: organizing the semantic forms into groups so that those in the same group can be mapped roughly to the same action. The action can either be a command, e.g., buy something, or concern an object, e.g., different ways of gathering information about an object. For example, sentence (1.5) would be grouped together with the above example sentences because it poses the same request: buy books; and sentences (1.6) to (1.8) would be in one group because they are all about price information.</Paragraph>
    <Paragraph position="5">  (1.5) I want to buy the latest book about e-business. (1.6) Please send me a price quote.</Paragraph>
    <Paragraph position="6"> (1.7) What is the reseller price? (1.8) Do you have any package pricing for purchasing multiple products at once? This type of grouping is the focus of this paper. We propose three grouping methods: similarity-based grouping, verb-based grouping and category-based grouping. The process of grouping semantic forms is domain dependent, and it is difficult to come up with a generally applicable standard to judge whether a grouping is appropriate or not. Different grouping techniques can give programmers different views of their data in order to satisfy different goals.</Paragraph>
    <Paragraph position="7"> This paper is organized into 6 sections. In Section 2, we briefly describe the system for which the grouping algorithms are proposed and implemented. Section 3 presents the three grouping methods in detail. In Section 4, we describe how the algorithms are implemented in our system. We test the methods using a set of sentences from our corpus and discuss the pros and cons of each method in Section 5.</Paragraph>
    <Paragraph position="8"> Finally, in Section 6, we draw conclusions and propose some future work.</Paragraph>
  </Section>
</Paper>