File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/96/c96-1075_concl.xml
Size: 2,515 bytes
Last Modified: 2025-10-06 13:57:33
<?xml version="1.0" standalone="yes"?> <Paper uid="C96-1075"> <Title>Multi-lingual Translation of Spontaneously Spoken Language in a Limited Domain</Title> <Section position="8" start_page="445" end_page="445" type="concl"> <SectionTitle> 7 Conclusions and Future Work </SectionTitle> <Paragraph position="0"> In this paper we described the design of the two translation modules used in the JANUS system, outlined their strengths and weaknesses, and described our efforts to combine the two approaches.</Paragraph> <Paragraph position="1"> A newly developed end-to-end evaluation procedure allows us to assess the overall performance of the system using each of the translation methods separately or both combined.</Paragraph> <Paragraph position="2"> Our evaluations have confirmed that the GLR approach provides more accurate translations, while the Phoenix approach is more robust. Combining the two approaches using the parse quality judgement of the GLR* parser results in improved performance. We are currently investigating other methods for combining the two translation approaches. Since GLR* performs much better when long utterances are broken into sentences or sub-utterances which are parsed separately, we are looking into the possibility of using Phoenix to detect such boundaries. We are also developing a parse quality heuristic for the Phoenix parser using statistical and other methods.</Paragraph> <Paragraph position="3"> Another active research topic is the automatic detection of out-of-domain segments and utterances. Our experience has indicated that a large proportion of bad translations arise from the translation of small parsable fragments within out-of-domain phrases. Several approaches are under consideration. For the Phoenix parser, we have implemented a simple method that looks for small islands of parsed words among non-parsed words and rejects them. On a recent test set, we achieved a 33% detection rate of out-of-domain parses with no false alarms. 
Another approach we are pursuing is to use word salience measures to identify and reject out-of-domain segments.</Paragraph> <Paragraph position="4"> We are also working on tightening the coupling of the speech recognition and translation modules of our system. We are developing lattice parsing versions of both the GLR* and Phoenix parsers, so that multiple speech hypotheses can be efficiently analyzed in parallel, in search of an interpretation that is most likely to be correct.</Paragraph> </Section></Paper>