File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/intro/04/w04-2105_intro.xml
Size: 4,640 bytes
Last Modified: 2025-10-06 14:02:39
<?xml version="1.0" standalone="yes"?> <Paper uid="W04-2105"> <Title>Word lookup on the basis of associations: from an idea to a roadmap</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> We all experience now and then the problem of being unable to find the word expressing the idea we have in mind. If we care and have time, we may reach for a dictionary. Yet this kind of resource may be of little help if it expects from us precisely what we are looking for: a perfectly spelled word expressing the idea we are trying to convey. While perfect input may be a reasonable expectation in the case of analysis (comprehension), it certainly is not in the case of synthesis (generation), where the starting point is conceptual in nature: a message, the (partial) definition of a word, a concept, or a word related to the target word. The language producer needs a dictionary allowing for reverse access. A thesaurus does that, but only in a very limited way: the entry points are basically topical.</Paragraph> <Paragraph position="1"> People use various methods to initiate search in their mind: words, concepts, partial descriptions, etc. If we want a computer to mimic these functionalities, we must build the resource accordingly. Let us assume that the text producer is looking for a word that he cannot access. Instead he comes up with another word (or concept)[1] somehow related to the former. [Footnote 1: We will comment below on the difference between concepts and words.]</Paragraph> <Paragraph position="2"> He may not know precisely how the two relate, but he knows that they are related. He may also know to some extent how close their relationship is, and whether a given link is relevant or not, that is, whether it can lead directly (synonym, antonym, hyperonym) or indirectly to the target word.</Paragraph> <Paragraph position="3"> 
Since the relationship between the source word and the target word is often indirect, several lookups may be necessary, each one having the potential to contain either the target word (direct lookup) or a word leading towards it (indirect lookup).</Paragraph> <Paragraph position="4"> 2 How reasonable is it to expect perfect input? The expectation of perfect input is unrealistic even in analysis,[2] but clearly more so in generation. The user may well be unable to provide the required information, either because he cannot access in time the word he is looking for, even though he knows it,[3] or because he does not yet know the word expressing the idea he wants to convey. The latter case typically occurs when using a foreign language or when trying to use a very technical term. Yet not being able to find a word does not imply that one knows nothing about it.</Paragraph> <Paragraph position="5"> Actually, quite often the contrary is the case.</Paragraph> <Paragraph position="6"> Suppose you were looking for a word expressing the following ideas: domesticated animal, producing milk suitable for making cheese. Suppose further that you knew that the target word was neither cow nor sheep. While none of this information is sufficient to guarantee access to the intended word, goat, the information at hand (part of the definition) could certainly be used. For some concrete proposals going in this direction, see (Bilac et al., 2004) or the OneLook reverse dictionary.[4] Besides the definition information, people often have other kinds of knowledge concerning the target word.</Paragraph> <Paragraph position="7"> In particular, they know how the latter relates to other words. For example, they know that goats and sheep are somehow connected, that both of them are animals, that sheep are appreciated for their wool and meat, that sheep tend to follow each other blindly, while goats manage to survive while hardly eating anything, etc. 
In sum, people have lexical networks in their mind: all words, concepts or ideas they express are highly interconnected. As a result, any one of the words or concepts has the potential to evoke any of the others. The likelihood for this to happen depends, among other things, on such factors as frequency (associative strength), saliency and distance (direct vs. indirect access). As one can see, associations are a very general and powerful mechanism. No matter what we hear, read or say, any idea is likely to remind us of something else.[5] This being so, we should make use of it.[6] [Footnote 2: Obviously, looking for &quot;pseudonym&quot; under the letter &quot;S&quot; in a dictionary won't be of great help.]</Paragraph> </Section> </Paper>