<?xml version="1.0" standalone="yes"?> <Paper uid="W97-0805"> <Title>Lexical Discrimination with the Italian Version of WORDNET</Title> <Section position="3" start_page="0" end_page="0" type="intro"> <SectionTitle> 1 Introduction </SectionTitle>
<Paragraph position="0"> WORDNET is a thesaurus for the English language, based on psycholinguistic principles and developed at Princeton University by George Miller [Miller, 1990].</Paragraph>
<Paragraph position="1"> It has been conceived as a computational resource, thereby remedying some of the drawbacks of traditional dictionaries, such as the circularity of definitions and the ambiguity of sense references. Lemmas (about 130,000 for version 1.5) are organized into synonym classes (about 100,000 synsets).</Paragraph>
<Paragraph position="2"> The most evident problem with WORDNET is that it is a lexical knowledge base for English, and so it is not usable for other languages. Here we present the efforts made in the development of the Italian version of WORDNET [Magnini and Strapparava, 1994; Magnini et al., 1994], a project started at IRST about one year ago in the context of ILEX [Delmonte et al., 1996], a more general project aiming at the realization of a computational dictionary for Italian.1</Paragraph>
1 ... Venezia, and the branch of the University of Torino at Vercelli.
<Paragraph position="3"> A second problem with WORDNET is that it needs some important extensions to make it usable for effective parsing. In particular, parsing requires a powerful mechanism for lexical discrimination, in order to select the appropriate lexical readings for each word in the input sentence. In this paper we also explore the integration of "selectional restrictions", a traditional technique used for lexical discrimination, with Italian WORDNET.</Paragraph>
<Paragraph position="4"> Selectional restrictions provide explicit semantic information that the verb supplies about its arguments [Jackendoff, 1990], and should be fully integrated into the verb's argument structure.</Paragraph>
<Paragraph position="5"> Although selectional restrictions differ across domains [Basili et al., 1996], we are interested in finding common invariants across sublanguages. Our intention is to build a very general instrument that can afterwards be tuned to particular domains by identifying more specific uses. The main motivation is to obtain a natural language system that is both robust and computationally efficient. On the one hand, robustness is emphasized because sentences that are syntactically correct, but not successfully analyzed in the specific application domain, can still have a valid linguistic meaning. On the other hand, we are able to filter sentence meanings on a linguistic basis. This phase discards implausible readings, pruning the search space by checking for compatible semantic relations. This kind of discrimination can be realized with computationally efficient algorithms by exploiting the lexical taxonomy of WORDNET, postponing more complex and expensive computations to the domain-specific analysis.</Paragraph>
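<Paragraph> As an illustration of the intended discrimination step, the following is a minimal sketch in Python; the hypernym links, sense labels and function names are hypothetical toy examples, not the actual Italian WORDNET data or the system's implementation. It shows how a verb's selectional restriction can prune implausible noun readings by climbing a WORDNET-style hypernym chain.

    # Minimal sketch of selectional-restriction checking over a WordNet-style
    # taxonomy. The hypernym links and sense labels below are invented toy data.

    # Each sense points to its hypernym (None at the top of the hierarchy).
    HYPERNYMS = {
        "bank#1": "institution#1",        # financial institution
        "bank#2": "land#1",               # sloping land beside a river
        "institution#1": "social_group#1",
        "social_group#1": "entity#1",
        "land#1": "entity#1",
        "entity#1": None,
    }

    def is_a(sense, restriction):
        """True if `sense` is subsumed by `restriction` in the taxonomy."""
        while sense is not None:
            if sense == restriction:
                return True
            sense = HYPERNYMS.get(sense)
        return False

    def discriminate(candidate_senses, restriction):
        """Keep only the readings compatible with the verb's selectional restriction."""
        return [s for s in candidate_senses if is_a(s, restriction)]

    # A verb whose subject is restricted to social groups (e.g. "the bank lends
    # money") keeps the institution reading and discards the river-bank reading.
    print(discriminate(["bank#1", "bank#2"], "social_group#1"))   # ['bank#1']

In this sketch the compatibility test is a simple walk up the hypernym chain, which is cheap; more expensive, domain-specific reasoning can then be postponed to a later stage, as discussed above.</Paragraph>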
<Paragraph position="6"> The paper is structured as follows. Section 2 describes the Italian prototype of WORDNET, while Section 3 shows how selectional restrictions have been added to verb senses. Section 4 shows how Italian WORDNET has been coupled with the parser, both for describing lexical senses and as a repository for selectional restrictions. Section 5 reports a number of experiments that have been performed to identify the design with the best trade-off between disambiguation rate and precision. Finally, Section 6 provides some concluding remarks.</Paragraph> </Section> </Paper>