File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/00/c00-2145_intro.xml
Size: 10,224 bytes
Last Modified: 2025-10-06 14:00:51
<?xml version="1.0" standalone="yes"?> <Paper uid="C00-2145"> <Title>A Model of Competence for Corpus-Based Machine Translation</Title> <Section position="2" start_page="0" end_page="998" type="intro"> <SectionTitle> 1 Introduction </SectionTitle>
<Paragraph position="0"> In the machine translation (MT) literature, it has often been argued that translations of natural language texts are valid if and only if the source language text and the target language text have the same meaning, cf. e.g. (Nagao, 1989). If we assume that MT systems produce meaningful translations to a certain extent, we must assume that such systems have a notion of the source text meaning to a similar extent. Hence, the translation algorithm together with the data it uses encodes a formal model of meaning. Despite 50 years of intense research, there is no existing system that can map arbitrary input texts onto meaning-equivalent output texts. How is that possible? According to (Dummett, 1975), a theory of meaning is a theory of understanding: having a theory of meaning means that one has a theory of understanding. In linguistic research, texts are described on a number of levels and dimensions, each contributing to their understanding and hence to their meaning. Traditionally, the main focus has been on semantic aspects. In this research it is assumed that knowing the propositional structure of a text means to understand it. Under the same premise, research in MT has focused on semantic aspects, assuming that texts have the same meaning if they are semantically equivalent.</Paragraph>
<Paragraph position="1"> Recent research in corpus-based MT has different premises. Corpus-Based Machine Translation (CBMT) systems make use of a set of reference translations on which the translation of a new text is based. In CBMT systems, it is assumed that the reference translations given to the system in a training phase have equivalent meanings. Depending on their intelligence, these systems try to figure out what the meaning invariance in the reference text consists of and learn an appropriate source language/target language mapping mechanism. A translation can only be generated if an appropriate example translation is available in the reference text. An interesting question for CBMT systems is thus: what theory of meaning should the learning process implement in order to generate an appropriate understanding of the source text such that it can be mapped into a meaning-equivalent target text? Dummett (Dummett, 1975) suggests a distinction of theories of meaning along the following lines:

* In a rich theory of meaning, the knowledge of concepts is achieved by knowing the features of these concepts. An austere theory merely relies upon simple recognition of the shape of the concepts. A rich theory can justify the use of a concept by means of the characteristic features of that concept, whereas an austere theory can justify the use of a concept merely by enumerating all occurrences of the use of that concept.

* A molecular theory of meaning derives the understanding of an expression from a finite number of axioms. A holistic theory, in contrast, derives the understanding of an expression through its distinction from all other expressions in that language. A molecular theory, therefore, provides criteria to associate a certain meaning with a sentence and can explain the concepts used in the language. In a holistic theory, nothing is specified about the knowledge of the language other than in global constraints related to the language as a whole.</Paragraph>
<Paragraph position="2"> In addition, the granularity of concepts seems crucial for CBMT implementations.</Paragraph>
<Paragraph position="3"> * A fine-grained theory of meaning derives concepts from single morphemes or separable words of the language, whereas in a coarse-grained theory of meaning, concepts are obtained from morpheme clusters. In a fine-grained theory of meaning, complex concepts can be created by hierarchical composition of their components, whereas in a coarse-grained theory of meaning, complex meanings can only be achieved through a concatenation of concept sequences.</Paragraph>
<Paragraph position="4"> The next three sections discuss the dichotomies of theories of meaning, rich vs. austere, molecular vs. holistic, and coarse-grained vs. fine-grained, and classify a few CBMT systems according to the terminology introduced. This leads to a model of competence for CBMT. It appears that translation systems can be designed either for broad coverage or for high quality.</Paragraph>
<Paragraph position="5">
2 Rich vs. Austere CBMT
A common characteristic of all CBMT systems is that the understanding of the translation task is derived from the understanding of the reference translations. The inferred translation knowledge is used in the translation phase to generate new translations. Collins (1998) distinguishes between Memory-Based MT, i.e. memory heavy, linguistic light, and Example-Based MT, i.e. memory light and linguistic heavy. While the former systems implement an austere theory of meaning, the latter make use of rich representations.</Paragraph>
<Paragraph position="6"> The most superficial theory of understanding is implemented in purely memory-based MT approaches, where learning takes place only by extending the reference text. No abstraction or generalization of the reference examples takes place. Translation Memories (TMs) are such purely memory-based MT systems. A TM, e.g. TRADOS's Translator's Workbench (Heyn, 1996) or STAR's TRANSIT, calculates the graphemic similarity of the input text and the source side of the reference translations and returns the target string of the most similar translation examples as output. TMs make use of a set of reference translation examples and a (k-NN) retrieval algorithm. They implement an austere theory of meaning because they cannot justify the use of a word other than by looking up all contexts in which the word occurs. They can, however, enumerate all occurrences of a word in the reference text.</Paragraph>
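As a minimal sketch of this retrieval step, consider the following Python fragment (illustrative only, with invented example data; it stands in for no particular product's implementation): the input is compared graphemically against the source side of every reference pair, and the target side of the nearest example is returned.

```python
import difflib

# Toy reference "memory": (source, target) pairs; invented example data.
REFERENCE = [
    ("the contract ends today", "der Vertrag endet heute"),
    ("the meeting starts tomorrow", "das Treffen beginnt morgen"),
]

def graphemic_similarity(a: str, b: str) -> float:
    # Surface-string similarity; a real TM might use an edit-distance
    # or fuzzy-match score instead.
    return difflib.SequenceMatcher(None, a, b).ratio()

def translate(input_text: str) -> str:
    # 1-NN retrieval: return the target side of the reference pair
    # whose source side is graphemically most similar to the input.
    best_source, best_target = max(
        REFERENCE,
        key=lambda pair: graphemic_similarity(input_text, pair[0]),
    )
    return best_target

print(translate("the contract ends tomorrow"))  # -> "der Vertrag endet heute"
```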
<Paragraph position="7"> The TM distributed by ZERES (Zer, 1997) follows a richer approach. The reference translations and the input sentence to be translated are lemmatized and part-of-speech tagged. The source language sentence is mapped against the reference translations on a surface string level, on a lemma level and on a part-of-speech level. Those example translations which show the greatest similarity to the input sentence with respect to the three levels of description are returned as the best available translation.</Paragraph>
<Paragraph position="8"> Example-Based Machine Translation (EBMT) systems (Sato and Nagao, 1990; Collins, 1998; Güvenir and Cicekli, 1998; Carl, 1999; Brown, 1997) are richer systems. Translation examples are stored as feature and tree structures. Translation templates are generated which contain - sometimes weighted - connections in those positions where the source language and the target language equivalences are strong. In the translation phase, a multi-layered mapping from the source language into the target language takes place on the level of templates and on the level of fillers.</Paragraph>
<Paragraph position="9"> The ReVerb EBMT system (Collins, 1998) performs sub-sentential chunking and seeks to link constituents with the same function in the source and the target language. A source language subject is translated as a target language subject and a source language object as a target language object. In case there is no appropriate translation template available, single words can be replaced as well, at the expense of translation quality.</Paragraph>
<Paragraph position="10"> The EBMT approach described in (Güvenir and Cicekli, 1998) makes use of morphological knowledge and relies on word stems as a basis for translation. Translation templates are generalized from aligned sentences by substituting the differences in sentence pairs with variables and leaving the identical substrings unsubstituted. An iterative application of this method generates translation examples and translation templates which serve as the basis for an example-based MT system. An understanding consists of the extraction of compositionally translatable substrings and the generation of translation templates.</Paragraph>
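The following Python fragment is a much-simplified sketch of this generalization step (invented example data; it substitutes a single variable on the source side only, whereas the cited method also substitutes the corresponding target-side differences and applies the procedure iteratively):

```python
def generalize(sent_a: str, sent_b: str) -> str:
    """Generalize two aligned source sentences into a template:
    keep the longest common token prefix and suffix and replace
    the differing middle segment with a variable X."""
    a, b = sent_a.split(), sent_b.split()
    # Longest common token prefix.
    i = 0
    while i < min(len(a), len(b)) and a[i] == b[i]:
        i += 1
    # Longest common token suffix that does not overlap the prefix.
    j = 0
    while j < min(len(a), len(b)) - i and a[-1 - j] == b[-1 - j]:
        j += 1
    return " ".join(a[:i] + ["X"] + a[len(a) - j:])

# Invented example pair; the difference becomes the template variable.
print(generalize("I will drink water", "I will drink coffee"))
# -> "I will drink X"
```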
<Paragraph position="11"> A similar approach is followed in EDGAR (Carl, 1999). Sentences are morphologically analyzed and translation templates are decorated with features. Fillers in translation template slots are constrained to unify with these features. In addition to this, a shallow linguistic formalism is used to percolate features in derivation trees.</Paragraph>
<Paragraph position="12"> Sato and Nagao (1990) proposed still richer representations, where syntactically analyzed phrases and sentences are stored in a database. In the translation phase, the most similar derivation trees are retrieved from the database and a target language derivation tree is composed from the translated parts. By means of a thesaurus, semantically similar lexical items may be exchanged in the derivation trees.</Paragraph>
<Paragraph position="13"> Statistics-based MT (SBMT) approaches implement austere theories of meaning. For instance, in Brown et al. (1990) a number of models are presented, starting with simple stochastic translation models and becoming incrementally more complex and rich through the introduction of more random variables. No linguistic analyses are taken into account in these approaches.</Paragraph>
<Paragraph position="14"> However, in further research the authors plan to integrate linguistic knowledge such as inflectional analysis of verbs, nouns and adjectives.</Paragraph>
<Paragraph position="15"> McLean (McLean, 1992) has proposed an austere approach where he uses neural networks (NNs) to translate surface strings from English to French. His approach functions similarly to a TM, where the NN is used to classify sequences of surface word forms according to the examples given in the reference translations. On a small set of examples he shows that NNs can successfully be applied to MT.</Paragraph> </Section> </Paper>