File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/00/c00-2171_intro.xml
Size: 3,212 bytes
Last Modified: 2025-10-06 14:00:53
<?xml version="1.0" standalone="yes"?>
<Paper uid="C00-2171">
<Title>Incorporating Metaphonemes in a Multilingual Lexicon</Title>
<Section position="2" start_page="0" end_page="1126" type="intro">
<SectionTitle> 1 Introduction </SectionTitle>
<Paragraph position="0"> This paper describes a framework for multilingual inheritance-based lexical representation which allows sharing of information across (related) languages at all levels of linguistic description. Most work on multilingual lexicons up to now has assumed monolingual lexicons linked only at the level of semantics (MULTILEX 1993; Copestake et al. 1992).</Paragraph>
<Paragraph position="1"> Cahill and Gazdar (1999) show that this approach might be appropriate for unrelated languages, such as English and Japanese, but that it makes it impossible to capture useful generalisations about related languages, such as English and German. Related languages share many linguistic characteristics at all levels of description - syntax, morphology, phonology, etc. - not just semantics. For instance, words which come from a single root have very similar orthographic and phonological forms. Compare English, Dutch, and German¹:</Paragraph>
<Paragraph position="2"> ¹ The transcriptions are taken from CELEX (Baayen et al. 1995) and use the SAMPA phonetic alphabet (Wells 1989).</Paragraph>
<Paragraph position="3"> Most differences can be attributed to different orthographic conventions and regular phonological changes (e.g. final devoicing in Dutch and German).</Paragraph>
<Paragraph position="4"> The English /{/, the Dutch /A/, and the German /a/ in the last two examples are even virtually the same.</Paragraph>
<Paragraph position="5"> They have slightly different realisations but they are phonologically non-distinctive, i.e. if the Dutch /A/ were replaced by the English /{/ in Dutch, the result would not be a different word; it would simply sound like a different accent.</Paragraph>
<Paragraph position="6"> Cahill and Gazdar (1999) describe an architecture for multilingual lexicons which aims to encode and exploit lexical similarities between closely related languages. This architecture has been successfully applied in the PolyLex project² to define a trilingual lexicon for Dutch, English, and German sharing morphological, phonological, and morphophonological information between these languages. In this paper, we will take the PolyLex framework as our basis. We will focus on the phonological similarities between related languages and we will extend the PolyLex approach by capturing cross-linguistic phoneme correspondences, such as the /{/-/A/-/a/ correspondence mentioned above³.</Paragraph>
<Paragraph position="7"> First, we will discuss how a phoneme inventory can be defined for a group of languages - Dutch, English, and German. Then, we will explain the multilingual architecture used in PolyLex. Finally, we will explore how these cross-linguistic phoneme correspondences can be integrated into the multilingual framework.</Paragraph>
<Paragraph position="8"> ³ ... extended to a featural level, but for the present purposes we confine ourselves to the segmental level.</Paragraph>
</Section>
</Paper>