<?xml version="1.0" standalone="yes"?>
<Paper uid="E06-1046">
<Title>Edit Machines for Robust Multimodal Language Processing</Title>
<Section position="9" start_page="366" end_page="367" type="concl">
<SectionTitle> 8 Conclusions </SectionTitle>
<Paragraph position="0"> Robust understanding is a crucial feature of a practical conversational system, whether spoken or multimodal. There have been two main approaches to addressing this issue for speech-only dialog systems. In this paper, we present an alternative approach based on edit machines that is better suited to multimodal systems, where generally very little training data is available and data is costly to collect and annotate. We have shown how edit machines enable the integration of stochastic speech recognition with hand-crafted multimodal understanding grammars. The resulting multimodal understanding system is significantly more robust, achieving a 62% relative improvement in performance over the 38.9% concept accuracy obtained without edits. We have also presented an approach to learning the edit operations, as well as a classification-based approach. The Learned edit approach provides a substantial improvement over the baseline, performing similarly to the Basic edit machine, but it does not perform as well as the application-tuned Smart edit machine. Given the small size of the corpus, the classification-based approach performs less well. This leads us to conclude that, given the lack of data for multimodal applications, a combined strategy may be most effective. Multimodal grammars coupled with edit machines derived from the underlying application database can provide sufficiently robust understanding performance to bootstrap a multimodal service; as more data become available, data-driven techniques such as the Learned edit and classification-based approaches can be brought into play.</Paragraph>
</Section>
</Paper>