<?xml version="1.0" standalone="yes"?> <Paper uid="H01-1071"> <Title>Towards Automatic Sign Translation</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 1. INTRODUCTION </SectionTitle> <Paragraph position="0"> Languages play an important role in human communication.</Paragraph> <Paragraph position="1"> We communicate with people and information systems through diverse media in increasingly varied environments.</Paragraph> <Paragraph position="2"> One of those media is a sign. A sign is something that suggests the presence of a fact, condition, or quality. Signs are everywhere in our lives. They make our lives easier when we are familiar with them. But sometimes they also pose problems. For example, a tourist might not be able to understand signs in a foreign country. An unfamiliar language and environment make it difficult for international tourists to read signs, take a taxi, order food, and understand the comments of passersby.</Paragraph> <Paragraph position="3"> At the Interactive Systems Lab of Carnegie Mellon University, we are developing technologies for tourist applications [12]. The systems are equipped with a unique combination of sensors and software. The hardware includes computers, GPS receivers, lapel microphones and earphones, video cameras and head-mounted displays. This combination enables a multimodal interface to take advantage of speech and gesture inputs to provide assistance for tourists. The software supports natural language processing, speech recognition, machine translation, handwriting recognition and multimodal fusion. A vision module is trained to locate and read written language; it can adapt to new environments and interpret intentions offered by the user, such as a spoken clarification or a pointing gesture.</Paragraph> <Paragraph position="4"> In this paper, we present our efforts towards automatic sign translation. 
A system capable of sign detection and translation would benefit three groups of users: tourists, the visually impaired, and military intelligence personnel. Sign translation, in conjunction with spoken language translation, can help international tourists overcome language barriers. Automatic sign recognition can increase environmental awareness by effectively extending our field of vision, and it can help blind people extract information from their surroundings. A successful sign translation system relies on three key technologies: sign extraction, optical character recognition (OCR), and language translation. Although much research has been directed at automatic speech recognition, handwriting recognition, OCR, and speech and text translation, little attention has been paid to automatic sign recognition and translation. Our current research focuses on automatic sign detection and translation while taking advantage of available OCR technology. We have developed robust automatic sign detection algorithms and have applied Example Based Machine Translation (EBMT) technology [1] to sign translation.</Paragraph> <Paragraph position="5"> Fully automatic extraction of signs from the environment is a challenging problem because signs are usually embedded in their surroundings. Sign translation also poses special problems compared to a traditional language translation task: signs can be location dependent, and the same text on different signs may need to be treated differently. For example, in most cases it is not necessary to translate names, such as street or company names. In developing the system, we use a user-centered approach that takes advantage of human intelligence in selecting the area of interest and, if needed, the domain for translation. For example, a user can determine which sign is to be translated when multiple signs have been detected within the image. 
The selected part of the image is then processed, recognized, and translated, with the translation displayed on a hand-held or head-mounted display or synthesized as a voice message over the earphones. By focusing only on the information of interest and providing domain knowledge, the approach offers a flexible method for sign translation: it enhances the robustness of sign recognition and translation and speeds up the recognition and translation process. We have developed a prototype system that can recognize Chinese signs captured by a video camera, a common gadget for tourists, and translate them into English text or speech.</Paragraph> <Paragraph position="6"> The organization of this paper is as follows: Section 2 describes challenges in sign recognition and translation.</Paragraph> <Paragraph position="7"> Section 3 discusses methods for sign detection. Section 4 addresses the application of EBMT technology to sign translation. Section 5 introduces a prototype system for Chinese sign translation. Section 6 gives experimental results. Section 7 concludes the paper.</Paragraph> </Section></Paper>
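The user-centered pipeline described in the introduction (the user selects a detected sign, its text is recognized, and an example-based lookup produces the translation, with names left untranslated) can be illustrated with a minimal sketch. All function names, the toy example base, and the stand-in OCR stage below are our own assumptions for illustration, not the paper's implementation.

```python
# Hypothetical sketch of the user-centered sign-translation pipeline:
# region selection -> recognition (OCR stand-in) -> EBMT-style lookup -> output.

def select_region(detected_signs, user_choice):
    """User-centered step: the user picks which detected sign to translate."""
    return detected_signs[user_choice]

def recognize_text(region):
    """Stand-in for the OCR stage; a real system would run an OCR engine here."""
    return region  # toy representation: each region already holds its text

def translate_ebmt(text, example_base):
    """Toy example-based translation: look up a stored example.
    Falling back to the original text mimics leaving names untranslated."""
    return example_base.get(text, text)

def translate_sign(detected_signs, user_choice, example_base):
    region = select_region(detected_signs, user_choice)
    text = recognize_text(region)
    return translate_ebmt(text, example_base)

# Toy data: detected sign regions represented by their (already-read) text.
signs = {"sign_1": "出口", "sign_2": "长城饭店"}
examples = {"出口": "Exit"}  # the hotel name has no example and stays as-is
print(translate_sign(signs, "sign_1", examples))  # -> Exit
print(translate_sign(signs, "sign_2", examples))  # -> 长城饭店
```

The fallback in `translate_ebmt` reflects the location-dependence point above: text recognized as a name (here, anything without a stored example) is passed through rather than translated.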