<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-0406">
  <Title>Dealing with Multilinguality in a Spoken Language Query Translator</Title>
  <Section position="3" start_page="0" end_page="40" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> In the past few decades, automatic speech recognition (ASR) and machine translation (MT) have both undergone rapid technical progress. Spoken language translation has emerged as a new field combining the advances in ASR and MT (Levin et al., 1995; Mayfield et al., 1995; Lavie et al., 1995; Vilar et al., 1996). Robustness is a critical issue that must be addressed for this technology to be useful in real applications. Several robustness issues arising from the multilingual characteristics of many spoken language translation systems have not been studied by the speech recognition community, since that community tends to focus on monolingual recognition systems.</Paragraph>
    <Paragraph position="1"> One problem in a multilingual system is accent variability. It is frequently assumed that the speakers using a system are native speakers belonging to the same accent group. However, this is not generally true. For example, in Hong Kong, although many people can speak English, one encounters a wide variety of accents: in addition to Hong Kong's large population of Cantonese speakers, there are also many Mandarin speakers and many Indian, British, American and Australian Hong Kong residents.</Paragraph>
    <Paragraph position="2"> Another problem with multilinguality is mixed language recognition. Although the official languages of Hong Kong are English, spoken Cantonese and written Mandarin, most Hong Kongers speak a hybrid of English and Cantonese. In fact, since many native Cantonese speakers do not know the Chinese translations of many English terms, forcing them to speak in pure Cantonese is impractical and unrealistic.</Paragraph>
    <Paragraph position="3"> A third problem is the complexity of the design of recognizers for multiple languages. Many large multilingual spoken language translation systems such as JANUS (Lavie et al., 1995) and the C-STAR Consortium decouple the development of speech recognition interfaces for different languages. However, for developers of a multilingual system at one single site, it would be more efficient if the speech interfaces for the different languages shared a common engine with one set of features, one set of parameters, one recognition algorithm and one system architecture, but differed in the parameter values used.</Paragraph>
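The single-site design described above can be sketched as one shared engine configuration in which the feature set, parameter set, and recognition algorithm are fixed once, while each language supplies only its own parameter values. All names and values in this sketch are illustrative assumptions, not the paper's actual configuration:

```python
# Hypothetical sketch of a common recognition engine shared across languages.
# The engine fixes the features, the parameter set, and the algorithm;
# only the parameter *values* differ per language. Names are invented.

SHARED_ENGINE = {
    "features": "MFCC",                      # one set of features
    "algorithm": "hmm_isolated_word",        # one recognition algorithm
    "parameters": ["num_states", "num_mixtures"],  # one set of parameters
}

# Per-language parameter values (illustrative only).
LANGUAGE_VALUES = {
    "english":   {"num_states": 5, "num_mixtures": 8},
    "cantonese": {"num_states": 3, "num_mixtures": 8},
}

def build_recognizer(lang):
    """Instantiate the shared engine with one language's parameter values."""
    cfg = dict(SHARED_ENGINE)
    cfg["values"] = LANGUAGE_VALUES[lang]
    return cfg
```

Under this scheme, adding a language means supplying a new entry of parameter values rather than building a new recognizer from scratch.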
    <Paragraph position="4"> We are studying the issues raised above in the domain of a traveling business-person's query translation system (Figure 1). This translator is a symmetrical query/response system. Both ends of the system recognize input speech from the user through a common recognition engine consisting of either a concatenated or a mixed-language recognizer. After the speech is decoded into text, the translator converts the text from one language to the other. Both ends of the system have a speech synthesizer for output speech.</Paragraph>
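The query/response pipeline above can be sketched as three stages chained in sequence: recognition, translation, and synthesis. The function names and the toy one-word lexicon below are hypothetical placeholders, not the paper's implementation:

```python
# Minimal sketch of one direction of the symmetrical query/response pipeline.
# All function bodies are toy stand-ins; only the stage ordering
# (recognize -> translate -> synthesize) reflects the text above.

def recognize(audio):
    # Stand-in for the common recognition engine (concatenated or
    # mixed-language recognizer); returns decoded text.
    return audio["transcript"]  # toy: pretend the audio carries its transcript

def translate(text, src, tgt):
    # Stand-in for the statistical translation engine.
    lexicon = {("en", "yue"): {"hotel": "\u9152\u5e97"}}  # "hotel" -> Cantonese
    table = lexicon.get((src, tgt), {})
    return " ".join(table.get(word, word) for word in text.split())

def synthesize(text):
    # Stand-in for the output speech synthesizer.
    return {"speech_for": text}

def query_translator(audio, src="en", tgt="yue"):
    """One end of the system: decode speech, translate, synthesize output."""
    text = recognize(audio)
    translated = translate(text, src, tgt)
    return synthesize(translated)
```

The response direction would run the same three stages with `src` and `tgt` swapped, which is what makes the system symmetrical.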
    <Paragraph position="5"> The domain of our system is restricted to points of interest to a traveling business-person, such as the names of and directions to business districts, conference centers, hotels, money exchanges, and restaurants. We are currently implementing such a system with Cantonese and English as the main languages. We use an HMM-based, isolated-word recognition system as the recognition engine, and a statistical translator as the translation engine.</Paragraph>
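As a rough illustration of HMM-based isolated-word recognition, one can train one small HMM per vocabulary word, score an observation sequence against each model with the forward algorithm, and pick the best-scoring word. The discrete-observation models below are invented for illustration and are far simpler than a real acoustic model:

```python
import math

def forward_log_prob(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM
    (pi: initial probabilities, A: transition matrix, B: emission matrix),
    computed with the forward algorithm."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[j] * A[j][i] for j in range(n)) * B[i][o]
                 for i in range(n)]
    return math.log(sum(alpha))

def recognize_isolated_word(obs, word_models):
    """Return the vocabulary word whose HMM assigns obs the highest likelihood."""
    return max(word_models,
               key=lambda w: forward_log_prob(obs, *word_models[w]))

# Two toy word models: 2-state left-to-right HMMs over 2 observation
# symbols, with mirrored emission tables (parameters are invented).
word_models = {
    "hello": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.9, 0.1], [0.1, 0.9]]),
    "bye":   ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.1, 0.9], [0.9, 0.1]]),
}
```

In a real isolated-word recognizer the observations would be acoustic feature vectors (with continuous emission densities) rather than discrete symbols, but the per-word scoring-and-argmax structure is the same.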
  </Section>
</Paper>