<?xml version="1.0" standalone="yes"?> <Paper uid="P06-1036"> <Title>Enhancing electronic dictionaries with an index based on associations</Title> <Section position="3" start_page="0" end_page="281" type="intro"> <SectionTitle> 1 Introduction </SectionTitle>
<Paragraph position="0"> A dictionary user typically pursues one of two goals (Humble, 2001): as a decoder (reading, listening), he may look for the definition or the translation of a specific target word, while as an encoder (speaker, writer) he may want to find a word that not only expresses a given concept well, but is also appropriate in a given context.</Paragraph>
<Paragraph position="1"> Obviously, readers and writers come to the dictionary with different mindsets, information and expectations concerning input and output.</Paragraph>
<Paragraph position="2"> While the decoder can provide the word he wants additional information for, the encoder (language producer) provides the meaning of a word for which he lacks the corresponding form. In sum, users with different goals need access to different indexes: one based on form, in alphabetical order (decoding), the other based on meaning or meaning relations (encoding).</Paragraph>
<Paragraph position="3"> Our concern here is more with the encoder, i.e. lexical access in language production, a feature largely neglected in lexicographical work. Yet, a good dictionary contains not only many entries and a lot of information concerning each one of them, but also efficient means to reveal the stored information. After all, what is a huge dictionary good for if one cannot access the information it contains?

2 Lexical access on the basis of what: concepts (i.e. meanings) or words?

Broadly speaking, there are two views concerning lexicalization: the process is conceptually-driven (meaning, or parts of it, is the starting point) or lexically-driven: the target word is accessed via a source word. 
This is typically the case when we are looking for a synonym, antonym, hypernym (paradigmatic associations), or any of its syntagmatic associates (red-rose, coffee-black), the kind of association we will be concerned with here.</Paragraph>
<Paragraph position="4"> Yet, besides conceptual knowledge, people seem also to know a lot of things concerning the lexical form (Brown and McNeill, 1966): number of syllables, beginning/ending of the target word, part of speech (noun, verb, adjective, etc.), origin (Greek or Latin), or gender (Vigliocco et al., 1997). Of course, the input can also be hybrid, that is, composed of a conceptual and a linguistic component. For example, in order to express the notion of intensity, MAGN in Mel'čuk's theory (Mel'čuk et al., 1995), a speaker or writer has to use different words (very, seriously, high) depending on the form of the argument (ill, wounded, price), as he says very ill, seriously wounded, high price. In each case he expresses the very same notion, but by using a different word. While he could use the adverb very to qualify the state of somebody's health (he is ill), he cannot do so when qualifying the words injury or price. Likewise, he cannot use this specific adverb to qualify the noun illness.</Paragraph>
<Paragraph position="5"> While in principle all this information could be used to constrain the search space, we will deal here only with one aspect: the words' relations to other concepts or words (associative knowledge).</Paragraph>
<Paragraph position="6"> Suppose you were looking for a word expressing the following ideas: domesticated animal, producing milk suitable for making cheese. Suppose further that you knew that the target word was neither cow, buffalo nor sheep. While none of this information is sufficient to guarantee access to the intended word goat, the information at hand (part of the definition) could certainly be used. 
Besides this type of information, people often have other kinds of knowledge concerning the target word. In particular, they know how the latter relates to other words. For example, they know that goats and sheep are somehow connected, sharing a great number of features, that both are animals (hypernym), that sheep are appreciated for their wool and meat, and that they tend to follow each other blindly, whereas goats manage to survive while hardly eating anything, etc. In sum, people have in their mind a huge lexico-conceptual network, with words, concepts and ideas being highly interconnected.</Paragraph>
<Paragraph position="7"> Hence, any one of them can evoke the others. The likelihood for this to happen depends on such factors as frequency (associative strength), saliency and distance (direct vs. indirect access). As one can see, associations are a very general and powerful mechanism. No matter what we hear, read or say, anything is likely to remind us of something else. This being so, we should make use of it.</Paragraph>
<Paragraph position="8"> Of course, one can question the very assumption that people store words in their mind. Rather than considering the human mind as a wordstore, one might consider it as a wordfactory. Indeed, by looking at some of the work done by psychologists who try to emulate the mental lexicon (Levelt et al., 1999), one gets the impression that words are synthesized rather than located and called up. In this case one might conclude that rather than having words in our mind we have a set of highly distributed, more or less abstract information. 
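The network-based access mechanism just described, where evocation likelihood depends on associative strength and on distance (direct vs. indirect links), can be made concrete with a toy spreading-activation lookup. This is only an illustrative sketch, not the paper's implementation: the association graph, the weights, and the decay factor are all invented for the example.

```python
# Toy spreading activation over a hypothetical word-association network.
# Edge weights stand in for associative strength; activation decays with
# each step, so directly associated words outrank indirect ones.

ASSOC = {  # invented example data, not taken from the paper
    "sheep":  {"wool": 0.8, "goat": 0.7, "animal": 0.6, "meat": 0.5},
    "goat":   {"cheese": 0.8, "milk": 0.7, "animal": 0.6, "sheep": 0.7},
    "milk":   {"cow": 0.8, "cheese": 0.7, "goat": 0.5},
    "cheese": {"milk": 0.7, "goat": 0.5},
}

def spread(cues, decay=0.5, steps=2):
    """Rank words by the activation they accumulate from the cue words."""
    activation = {cue: 1.0 for cue in cues}
    frontier = dict(activation)
    for _ in range(steps):
        nxt = {}
        for word, act in frontier.items():
            for neighbour, weight in ASSOC.get(word, {}).items():
                nxt[neighbour] = nxt.get(neighbour, 0.0) + act * weight * decay
        for word, act in nxt.items():
            activation[word] = activation.get(word, 0.0) + act
        frontier = nxt
    # Exclude the cues themselves: they are the input, not the target.
    return sorted(((w, a) for w, a in activation.items() if w not in cues),
                  key=lambda pair: -pair[1])

print(spread({"milk", "cheese"}))  # 'goat' comes out on top
```

With the cues milk and cheese, goat accumulates the most activation (it is directly linked to both cues), mirroring the goat-lookup scenario above; weakly or indirectly connected words such as animal trail behind.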
Rather than propagating data, we would thus propagate energy: there is no message passing, transformation or accumulation of information, only spreading activation, that is, changes of energy levels (call them weights, electronic impulses, or whatever). These signals ultimately activate certain peripheral organs (larynx, tongue, mouth, lips, hands) in such a way as to produce movements or sounds that, not knowing better, we call words.</Paragraph> </Section> </Paper>