File Information

File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/abstr/92/c92-2083_abstr.xml

Size: 4,360 bytes

Last Modified: 2025-10-06 13:47:28

<?xml version="1.0" standalone="yes"?>
<Paper uid="C92-2083">
  <Title>Structural Patterns vs. String Patterns for Extracting Semantic Information from Dictionaries</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
1. Introduction
</SectionTitle>
    <Paragraph position="0"> As the research on extracting semantic information from on-line dictionaries proceeds, most progress has been made in the area of extracting the genus terms. Two methods are being used -- pattern matching at the string level and at the structural analysis level -- both of which seem to yield equally promising results.</Paragraph>
    <Paragraph position="1"> Little theoretical work, however, is being done to determine the set of possible differentiae to be identified, and therefore also the set of possible semantic relations that can be extracted from them. In fact, Wilks remarks that as far as identifying the differentiae and organizing that information into a list of properties is concerned, &quot;such demands are beyond the abilities of the best current extraction techniques&quot; (Wilks et al. 1989, p. 227). However, the current state of the art in computational linguistics demands that semantic information beyond genus terms be available now, on a large scale, to push forward the current theories, whether that is knowledge-based parsing or parsing first with a syntactic component, followed by a semantic component.</Paragraph>
    <Paragraph position="2"> In this paper, we will focus on analyzing the definitions not for the genus terms, but for the semantic relations that can be extracted from the differentiae (Calzolari 1984).</Paragraph>
    <Paragraph position="3"> Although many have accepted the use of syntactic analyses for this purpose for some time now (for example Jensen and Binot 1987, Klavans 1990, Ravin 1990, and Vanderwende 1990, all of which use the PLNLP English Parser to provide the structural information), many others still do not. We will demonstrate with examples why only patterns based on syntactic information (henceforth, structural patterns) provide reliable semantic relations for the differentiae. Patterns that match definition text at the string level (henceforth, string patterns) are conceivable, but cannot capture the variations in the differentiae as easily as structural patterns. In addition, although it is possible to parse the definition texts using a grammar designed for one dictionary (e.g. a grammar of &quot;Longmanese,&quot; see Alshawi 1989), we have found that a general, broad-coverage grammar of English or of Italian provides a level of analysis that is as good as, and possibly superior to, a dictionary-specific grammar. Moreover, no extra effort is required to apply a broad-coverage text parser to the definitions of more than one dictionary, as we found for the Longman Dictionary of Contemporary English (henceforth, LDOCE) and Webster's 7th New Collegiate Dictionary (henceforth, W7) for English, and for Il Nuovo Dizionario Garzanti (henceforth, Garzanti) and the Italian DMI Database (henceforth, DMI) for Italian.</Paragraph>
    <Paragraph position="4"> The result of analyzing the differentiae of the definitions is presented in the form of a semantic frame; there is one semantic frame for each word sense of the entry. The contents of the frame will be any number of semantic relations (including the genus term) with, as values, the word(s) extracted from the definition text. Except for a commitment to the theoretical notion that a word has distinguishable senses, the semantic frames are intended to be theory-independent.</Paragraph>
    <Paragraph position="5"> The semantic frames presented in this paper correspond to a description of the semantic frames produced by the lexicon-producer (Wilks, op. cit., p. 217-220) and so can be the input to a knowledge-based parser. Also, these semantic frames represent the appropriate level of semantic information that is needed by a semantic component that has the task of resolving the ambiguities remaining after a syntactic component has assigned an initial analysis (see Jensen and Binot 1987, Vanderwende 1990). More generally, the result of this acquisition process is the construction of a Lexical Knowledge Base to be used as a component for any NLP system.</Paragraph>
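    <Paragraph position="6"> The semantic frame described above (one frame per word sense, holding any number of semantic relations whose values are words taken from the definition text) can be pictured with a minimal sketch. Everything named in it is an assumption made for illustration only: the class name SemanticFrame, its field names, the relation label "purpose", and the sample definition are hypothetical and are not taken from the paper or from any of the dictionaries it names.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SemanticFrame:
    """One frame per word sense; all names here are illustrative assumptions."""
    headword: str
    sense: int
    # Maps a relation name (the genus term included) to the word(s) extracted as its values.
    relations: Dict[str, List[str]] = field(default_factory=dict)

    def add(self, relation: str, value: str) -> None:
        """Record one extracted word as a value of the given relation."""
        self.relations.setdefault(relation, []).append(value)


def build_example_frame() -> SemanticFrame:
    # Hypothetical definition text, invented for this sketch:
    #   "knife (sense 1): a tool used for cutting"
    frame = SemanticFrame(headword="knife", sense=1)
    frame.add("genus", "tool")    # the genus term is stored as just another relation
    frame.add("purpose", "cut")   # "purpose" is an assumed relation label
    return frame
```

Keeping the relations in an open-ended mapping, instead of a fixed set of slots, is one way to reflect the statement that a frame may hold any number of relations; how the values are actually extracted from a parsed definition is outside the scope of this sketch.</Paragraph>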
  </Section>
</Paper>