<?xml version="1.0" standalone="yes"?>
<Paper uid="W97-0314">
  <Title>Inducing Terminology for Lexical Acquisition</Title>
  <Section position="8" start_page="131" end_page="132" type="concl">
    <SectionTitle>
6 Conclusions
</SectionTitle>
    <Paragraph position="0"> This paper has presented a method for the automatic extraction of terminological (possibly complex) units of information from corpora. The proposed method combines principles of grammatical correctness with statistical constraints on the distributional properties of the detected domain terms. NPs are first selected as possible candidates for term denotation and then inserted incrementally into a terminological dictionary according to their mutual information value. Experimental evaluation has been difficult because the notion of what counts as a relevant term in a domain is vague and subjective. Tests against a domain-specific, user-oriented dictionary have been carried out, in comparison with large-scale thesauri in the domain. The method shows a significant improvement over these standard sources. It has been applied to several different corpora and has proved easily portable without heavy customization. Because it relies only on simple POS tagging, it is also portable to other languages as soon as NP grammars are available. Feedback from the terminological extraction process to the morphological analysis has also been designed. The improvement that terminological NP recognition brings to the activity of a shallow parser for lexical acquisition (LA) has been measured. The result is an overall improvement: data compression is around 5%, while syntactic ambiguity elimination is about 10%. Recall and precision of the syntactic analysis are consequently higher. (Footnote 7: Precision is the number of detected correct esl's over the total number of detected esl's, while recall is the number of detected correct esl's over the number of correct esl's.)</Paragraph>
    <Paragraph position="1"> The main result of this method is to support finer lexicalization, in the form of complex nominals, for lexical acquisition. Lexical acquisition based on collocations between terms (rather than simple lemmas) provides more granular information on lexical senses as well as on (syntactic or semantic) selectional constraints. The success of this method allows the design of automatic methods for taxonomic (thesaurus-like) knowledge generation. Distributional, as well as syntactic, knowledge is a crucial source of information for large-scale similarity estimation among detected terms.</Paragraph>
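    <Paragraph position="2"> The ranking step summarized in this section, in which candidate NPs are ordered by their mutual information value, can be illustrated with a minimal sketch. It is not the implementation used in the paper: it assumes two-word candidates, pointwise mutual information estimated from raw corpus counts, and a frequency threshold. The function name rank_candidates_by_mi and the parameter min_count are hypothetical, and the candidate list is assumed to come from an NP grammar applied to POS-tagged text.

```python
import math
from collections import Counter


def rank_candidates_by_mi(candidates, tokens, min_count=2):
    """Rank two-word candidate terms by pointwise mutual information.

    Assumptions (not taken from the paper): candidates are (word, word)
    pairs already accepted by an NP grammar, and probabilities are plain
    relative frequencies over the token sequence.
    """
    unigram_counts = Counter(tokens)
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    n_unigrams = len(tokens)
    n_bigrams = max(len(tokens) - 1, 1)

    scored = []
    for first, second in candidates:
        joint = bigram_counts[(first, second)]
        # Skip rare pairs: their estimates are too unreliable to rank.
        if joint >= min_count:
            p_joint = joint / n_bigrams
            p_first = unigram_counts[first] / n_unigrams
            p_second = unigram_counts[second] / n_unigrams
            score = math.log2(p_joint / (p_first * p_second))
            scored.append(((first, second), score))

    # Highest mutual information first, as for dictionary insertion order.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored
```

Under these assumptions, a pair whose words co-occur more often than their individual frequencies predict receives a higher score and would be entered in the dictionary earlier; pairs below the hypothetical min_count threshold are left out altogether.</Paragraph>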
  </Section>
</Paper>