<?xml version="1.0" standalone="yes"?>
<Paper uid="E06-1046">
  <Title>Edit Machines for Robust Multimodal Language Processing</Title>
  <Section position="8" start_page="366" end_page="366" type="evalu">
    <SectionTitle>
7 Experiments and Results
</SectionTitle>
    <Paragraph position="0"> To evaluate the approach, we collected a corpus of multimodal utterances for the MATCH domain in a laboratory setting from a set of sixteen first time users (8 male, 8 female). A total of 833 user interactions (218 multimodal / 491 speech-only / 124 pen-only) resulting from six sample task scenarios were collected and annotated for speech transcription, gesture, and meaning (Ehlen et al., 2002). These scenarios involved finding restaurants of various types and getting their names, phone numbers, addresses, or reviews, and getting subway directions between locations. The data collected was conversational speech where the users gestured and spoke freely.</Paragraph>
    <Paragraph position="1"> Since we are concerned here with editing errors out of disfluent, misrecognized or unexpected speech, we report results on the 709 inputs that involve speech (491 unimodal speech and 218 multimodal). Since there are only a small number of scenarios performed by all users, we partitioned the data six ways by scenario. This ensures that the specific tasks in the test data for each partition are not also found in the training data for that partition. For each scenario we built a class-based trigram language model using the other five scenarios as training data. Averaging over the six partitions, ASR sentence accuracy was 49% and word accuracy was 73.4%.</Paragraph>
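The leave-one-scenario-out partitioning described above can be sketched as follows. This is a minimal illustration, not the authors' code; the utterance fields and scenario identifiers are hypothetical.

```python
from collections import defaultdict

def partition_by_scenario(utterances):
    """Group utterances by scenario, then yield one (held_out, train, test)
    split per scenario, so no task in a test set also appears in the
    corresponding training set."""
    by_scenario = defaultdict(list)
    for utt in utterances:
        by_scenario[utt["scenario"]].append(utt)
    for held_out in by_scenario:
        test = by_scenario[held_out]
        train = [u for s, utts in by_scenario.items() if s != held_out
                 for u in utts]
        yield held_out, train, test

# Toy corpus: 12 utterances spread over 6 hypothetical scenarios.
corpus = [{"scenario": f"s{i % 6}", "text": f"utterance {i}"}
          for i in range(12)]
splits = list(partition_by_scenario(corpus))
print(len(splits))  # one split per scenario -> 6
```

A per-scenario language model (here, the class-based trigram model) would then be trained on each split's `train` portion and evaluated on its `test` portion.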
    <Paragraph position="2"> In order to evaluate the understanding performance of the different edit machines, for each partition of the data we first composed the output from speech recognition with the edit machine and the multimodal grammar, flattened the meaning representation (as described in Section 3.1), and computed the exact string match accuracy between the flattened meaning representation and the reference meaning representation. We then averaged this concept sentence accuracy measure over all six partitions.</Paragraph>
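The concept sentence accuracy metric reduces to exact string match over flattened meaning representations, averaged across partitions. A minimal sketch (the example meanings are illustrative, not drawn from the MATCH corpus):

```python
def concept_sentence_accuracy(hyps, refs):
    """Fraction of utterances whose flattened meaning representation
    exactly matches the reference meaning (whole-string equality)."""
    assert len(hyps) == len(refs)
    return sum(h == r for h, r in zip(hyps, refs)) / len(refs)

# Illustrative flattened meanings (hypothetical, not from the corpus):
refs = ["show cheap italian restaurant",
        "phone for restaurant r1",
        "route from s1 to s2"]
hyps = ["show cheap italian restaurant",
        "review for restaurant r1",   # understanding error: no match
        "route from s1 to s2"]

per_partition = [concept_sentence_accuracy(hyps, refs)]  # one toy partition
overall = sum(per_partition) / len(per_partition)
print(round(overall, 3))  # 2 of 3 exact matches -> 0.667
```

Note that exact match is a strict criterion: a single wrong concept in the flattened string counts the whole utterance as incorrect.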
    <Paragraph position="3">  The results are tabulated in Figure 10. The columns show the concept sentence accuracy (ConSentAcc) and the relative improvement over the baseline of no edits. Compared to the baseline of 38.9% concept sentence accuracy without edits (No Edits), Basic Edit gave a relative improvement of 32%, yielding 51.5% concept sentence accuracy. 4-edit further improved concept sentence accuracy (53%) compared to Basic Edit.</Paragraph>
    <Paragraph position="4"> The heuristics in Smart Edit brought the concept sentence accuracy to 60.2%, a 55% improvement over the baseline. Applying Smart Edit to lattice input improved performance from 60.2% to 63.2%.</Paragraph>
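The relative improvements reported here follow directly from the accuracy figures; a quick check of the arithmetic, using the numbers stated in the text:

```python
def relative_improvement(acc, baseline):
    """Relative improvement (in percent) of an accuracy over a baseline."""
    return 100.0 * (acc - baseline) / baseline

BASELINE = 38.9  # No Edits concept sentence accuracy (%)
print(round(relative_improvement(51.5, BASELINE)))  # Basic Edit -> 32
print(round(relative_improvement(60.2, BASELINE)))  # Smart Edit -> 55
```

Small discrepancies against the reported figures (e.g. 31.8% for the MT-based model) would come from rounding in the published accuracies.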
    <Paragraph position="5"> The MT-based edit model yielded a concept sentence accuracy of 51.3%, a 31.8% improvement over the baseline with no edits, but still substantially less than the edit model derived from the application database. We believe that, given the lack of data for multimodal applications, an approach that combines the two methods may be most effective.</Paragraph>
    <Paragraph position="6"> The Classification approach yielded only 34.0% concept sentence accuracy. Unlike the MT-based edit model, this approach does not have the benefit of composition with the grammar to guide the understanding process. The low performance of the classifier is most likely due to the small size of the corpus. Also, since the training/test split was by scenario, the specifics of the commands differed between training and test. In future work we will explore the use of other classification techniques and try combining the annotated data with the grammar for training the classifier model.</Paragraph>
  </Section>
</Paper>