<?xml version="1.0" standalone="yes"?> <Paper uid="H93-1095"> <Title>Spoken Language Recognition and Understanding</Title> <Section position="1" start_page="0" end_page="0" type="metho"> <SectionTitle> 1. PROJECT GOALS </SectionTitle> <Paragraph position="0"> The goal of this research is to demonstrate spoken language systems in support of interactive problem solving.</Paragraph> <Paragraph position="1"> The MIT spoken language system combines SUMMIT, a segment-based speech recognition system, and TINA, a probabilistic natural language system, to achieve speech understanding. The system accepts continuous speech input and handles multiple speakers without explicit speaker enrollment. It engages in interactive dialogue with the user, providing output in the form of tabular and graphical displays, as well as spoken and written responses. We have demonstrated the system on several applications, including travel planning and direction assistance; it has also been ported to several languages, including Japanese and French.</Paragraph> </Section> <Section position="2" start_page="0" end_page="0" type="metho"> <SectionTitle> 2. 
RECENT RESULTS * Improved recognition and understanding: </SectionTitle> <Paragraph position="0"> Reduced the word error rate by over 30% through the use of improved phonetic modeling and more powerful N-gram language models; improved language understanding by 35% by making use of a stable corpus of annotated data; other improvements include the ability to generate a word lattice.</Paragraph> <Paragraph position="1"> * Real-time, software-only SLS system: Developed a near-real-time (1.5 times real time), software-only version of SUMMIT, using MFCCs and a fast match in the mixture-Gaussian computation, running on a DEC Alpha or an HP735 workstation.</Paragraph> <Paragraph position="2"> * Evaluation of interactive dialogue: Continued the study of interactive dialogue, focusing on error detection and recovery issues; supported multi-site logfile evaluation through distribution of portable logfile evaluation software and instructions.</Paragraph> </Section> <Section position="3" start_page="0" end_page="0" type="metho"> <SectionTitle> * On-line ATIS: Applied spoken language technology </SectionTitle> <Paragraph position="0"> to access an on-line dynamic air travel system via Compuserve; the demonstration system, extending the MIT ATIS system, provides an interactive language-based interface to find flights, make reservations, and show seating assignments.</Paragraph> <Paragraph position="1"> * Multi-lingual VOYAGER: Ported SUMMIT and TINA to Japanese, to create a speaker-independent bilingual VOYAGER; English and Japanese use the same semantic frame representation, and the generation mechanism is modular and language-independent, supporting a system with independently toggled input and output languages.</Paragraph> <Paragraph position="2"> * Support to DARPA SLS community: Chaired the ISAT Study Group on Multi-Modal Language-Based Systems; continued to chair MADCOW, co-ordinating multi-site data collection, including the introduction of experimental end-to-end evaluation; chaired the first Spoken Language 
Technology Workshop at MIT, Jan. 20-22, 1993.</Paragraph> </Section> <Section position="4" start_page="0" end_page="401" type="metho"> <SectionTitle> 3. FUTURE PLANS </SectionTitle> <Paragraph position="0"> * Large vocabulary spoken language systems: Explore realistic large-vocabulary spoken language applications (e.g., on-line air travel planning), including issues of system portability and language-based interface design.</Paragraph> <Paragraph position="1"> * Multilingual knowledge-base access: Use a uniform language-independent semantic frame to support extensions of VOYAGER and ATIS to other (more inflected) languages, e.g., French, German, Italian, and Spanish.</Paragraph> <Paragraph position="2"> * Interfacing speech and language: Investigate loosely and tightly coupled integration, using word lattices and TINA-2's layered bigram model.</Paragraph> <Paragraph position="3"> * Dialogue modeling: Incorporate dialogue state-specific language models to improve recognition in interactive dialogue, collect and study data on human-human interactive problem solving, and explore alternative generation and partial-understanding strategies.</Paragraph> <Paragraph position="4"> * Language modeling: Investigate low-perplexity language models and the capture of higher-level information, e.g., semantic class, phrase-level information, and automatic grammar acquisition.</Paragraph> </Section> </Paper>