<?xml version="1.0" standalone="yes"?>
<Paper uid="W02-0708">
  <Title>Balancing Expressiveness and Simplicity in an Interlingua for Task Based Dialogue</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Several speech translation projects have chosen interlingua-based approaches because of their convenience in multi-lingual projects, especially when adding new languages. However, interlingua design is notoriously difficult and inexact. The main challenge is deciding on the grain size of meaning to represent and which facets of meaning to include. These choices may depend on the domain and the contexts in which the translation system is used. For projects that take place at multiple research sites, another factor becomes important in interlingua design: if the interlingua is too complex, it cannot be used reliably by researchers at remote sites. Furthermore, the interlingua should not be biased toward one family of languages. Finally, an interlingua should clearly distinguish general and domain-specific components for easy scalability and portability between domains.</Paragraph>
    <Paragraph position="1"> Sections 2 and 3 describe how we balanced the factors of grain size, language independence, and simplicity in two interlinguas for speech translation projects: the C-star II Interchange Format (Levin et al., 1998) and the Nespole Interchange Format. Both interlinguas are based on the framework of domain actions described in (Levin et al., 1998). We will show that the Nespole interlingua has a finer grain size of meaning, but is still simple enough for collaboration across multiple research sites and still maintains language independence.</Paragraph>
    <Paragraph position="2"> Section 4 will address the issue of scalability of interlinguas based on domain actions to larger domains. The basis of Section 4 is a distributional analysis of the C-star II and Nespole databases tagged with interlingua representations. The C-star II database has been partially re-tagged with the Nespole interlingua, which enables us to make comparisons on the same data with two types of interlinguas and on two types of data (C-star II and Nespole) with the same type of interlingua.</Paragraph>
  </Section>
</Paper>