File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/metho/03/w03-0707_metho.xml
Size: 5,254 bytes
Last Modified: 2025-10-06 14:08:28
<?xml version="1.0" standalone="yes"?> <Paper uid="W03-0707"> <Title>Flexible and Personalizable Mixed-Initiative Dialogue Systems</Title> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> 2 Underlying Technologies </SectionTitle> <Paragraph position="0"> Over the past several years, we have been making advances on several fronts, directed toward the larger goal of the vision outlined above. In this section, we will highlight some of these, with pointers to the literature for an in-depth description.</Paragraph> <Paragraph position="1"> SpeechBuilder: Over the past few years, we have been developing a set of utilities that would enable research results to be migrated directly into application development (Glass and Weinstein, 2001). Our goal is to enable natural, mixed-initiative interfaces similar to those now created manually by a relatively small group of expert developers. We make no distinction between the technology components of SpeechBuilder and those of our most sophisticated dialogue systems, such as the Mercury flight reservation domain (Seneff and Polifroni, 2000). SpeechBuilder employs a Web-based interface where developers type in the specifics of their domain, guided by forms and pull-down menus. Components such as recognition vocabulary, parse rules, and semantic mappings are created automatically from example sentences entered by the developer. In several recent short courses, naive developers have been able to implement a new domain and converse with it on the telephone in a matter of hours.</Paragraph> <Paragraph position="2"> Language Modelling: Patchwork Grammars A serious obstacle to the immediate deployment of a new system with today's technology is the chicken-and-egg problem of the language model. System performance is critically tied to the quality of the statistical language model, which typically depends on large domain-dependent corpora that don't exist until the domain is actually deployed and widely used. 
We have initiated an effort to automatically induce a grammar for a new domain from related content of existing speech corpora for other domains, combined with knowledge derived from the content provider for the new domain. For instance, our hotel domain can draw on an existing auto classified domain to extract patterns for referring to prices, can induce a grammar for dates from a flight domain, and can make use of statistics of hotel counts to determine city probabilities. Parse rules for general sub-domains such as dates, times, and prices are organized into sub-grammars that are easily embedded into any application, along with libraries for converting the resulting meaning representations into a canonical format, such as &quot;27SEP2003.&quot; Flexible Vocabulary: We have recently realized our goal of enabling users to automatically add a new word to an existing system through natural interaction with the system itself (Schalkwyk et al., 2003; Seneff et al., 1998; Chung et al., 2003; Chung and Seneff, 2002; Seneff et al., 2003). We have thus far applied this only to the enrollment of the user's name as part of a personalization phase (Seneff et al., 1998; Chung et al., 2003), through a &quot;speak and spell&quot; mode. After confirmation, the system reconfigures itself to fully support the word, so that it can be understood in subsequent dialogue. A high-quality sound-to-letter framework (Chung et al., 2003) and a new ability to automatically derive a class n-gram from an NL grammar have facilitated this process (Seneff et al., 2003). 
The recognizer update is currently implemented via full recompilation, which can take up to a minute of elapsed time, but efforts to support incremental recognizer updates (Schalkwyk et al., 2003) hold promise for essentially instantaneous new word addition.</Paragraph> <Paragraph position="3"> Managing the Dialogue: One of the most time-consuming aspects of dialogue system development today is the implementation of the dialogue manager. To shorten this development phase, we have been creating a set of domain-independent functions that can be specialized to a particular domain through passed parameters.</Paragraph> <Paragraph position="4"> These functions perform such tasks as checking a query for completeness, filtering the database results on user-specified constraints, or making decisions on fuzzy attributes such as &quot;near&quot; (Polifroni and Chung, 2002). One common but important subgoal in dialogue planning is to generate a succinct description of a set of retrieved entries. Our recent research in this area has focused on organizing database retrievals into a summary meaning representation, by automatically clustering sets into natural groupings. In parallel, we are developing generation tools that will translate these summaries into fluent English. For instance, in the hotel domain, the result set is automatically partitioned into &quot;cheap&quot; and &quot;expensive&quot; groups differently depending on the city. By basing such subjective categories on a content provider, we ease the burden on the system developer, while at the same time producing a more intelligent system.</Paragraph> </Section> </Paper>