<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-2117">
  <Title>Language Resources for a Network-based Dictionary</Title>
  <Section position="2" start_page="0" end_page="0" type="intro">
    <SectionTitle>
1 Introduction
</SectionTitle>
    <Paragraph position="0"> Traditional dictionaries are mainly used for language reception, even though they were also developed for language production. However, their form-based structure, which follows orthographic conventions and could be called &quot;one-dimensional&quot;, makes it difficult to access information by meaning. The usefulness of a traditional dictionary for text production is therefore quite limited compared with, for example, a thesaurus.</Paragraph>
    <Paragraph position="1"> The main advantage of a thesaurus is that each entry is structured by the semantic relations between its words. This makes a different type of information available.</Paragraph>
    <Paragraph position="2"> Our proposal is therefore to construct an electronic dictionary which has a network-like structure and whose content is drawn from various existing lexical resources. The dictionary will represent both paradigmatic information (information about the various ways in which words are similar) and syntagmatic information (information about the relationships among words that appear together). Additionally, information from other types of resources, such as morphology and phonology, will be integrated, as these are also relevant in models of the mental lexicon. In these models, &quot;associations&quot; between words are based not only on meaning but also on phonological or morphological properties of the connected words. According to Brown and McNeill (1966) and subsequent research, people in the so-called &quot;tip-of-the-tongue&quot; state (TOT state) are able to clearly recall properties of the missing word, such as the number of syllables or the meaning, and can easily identify the target word when it is presented to them.</Paragraph>
    <Paragraph position="3">  node with related information from two LRs. Here a user would be able to find the term &quot;shortcake&quot; even if s/he knows only one part, namely strawberries (footnote 2). A click on a neighbouring node should, for example, re-center the structure and hence allow the user to &quot;explore&quot; the network. Footnote 2: ... &quot;shortcake&quot; however lists specifically &quot;strawberry shortcake&quot;.</Paragraph>
    <Paragraph position="4"> As mentioned above, the most obvious usage seems to be in language production, where information can be provided not only for words already activated in the mind of the language producer but also for alternatives, specifications, or words not directly accessible because of a TOT state. This seems reasonable in light of the fact that speakers' passive vocabularies are known to be larger than their active vocabularies. The range of information available of course depends on the material integrated into the dictionary from the various resources, which are explored more closely below.</Paragraph>
    <Paragraph position="5"> A second area of application for such a dictionary is language learning. Apart from paradigmatic information, which is usually also part of the definition of a lemma, syntagmatic information representing collocations and co-occurrences is an important resource for language learners. Knowledge about collocations is a kind of linguistic knowledge that is language-specific and not systematically derivable, which makes collocations especially difficult to learn.</Paragraph>
    <Paragraph position="6"> Even though some studies compare the results of statistically computed association measures with word association norms from psycholinguistic experiments (Landauer et al., 1998; Rapp, 2002), to our knowledge there has been no research on the use of a digital, network-based dictionary that reflects the organisation of the mental lexicon. Apart from studies that use so-called Mind Maps or Concept Maps to visualize &quot;world knowledge&quot; (Novak, 1998), nothing is known about the psycholinguistic aspects that need to be considered in constructing a network-based dictionary. In the following section we summarize the information made available by the various LRs we plan to integrate into our system. The ideas presented here were developed in preparation for a project at the University of Osnabrück.</Paragraph>
  </Section>
</Paper>