<?xml version="1.0" standalone="yes"?> <Paper uid="C92-2099"> <Title>ACQUISITION OF SELECTIONAL PATTERNS</Title> <Section position="8" start_page="0" end_page="0" type="relat"> <SectionTitle> 7 Related Work </SectionTitle> <Paragraph position="0"> At NYU we have long been interested in the possibilities of automatically acquiring sublanguage (semantic) word classes and patterns from text corpora. In 1975 we reported on experiments, using a few hundred manually prepared regularized parses, for clustering words based on their co-occurrence patterns and thus generating the principal sublanguage word classes for a domain \[1\]. In the early 1980s we performed experiments, again with relatively small corpora and machine-generated (but manually selected) parses, for collecting sublanguage patterns, similar to the work reported here \[9\]. By studying the growth curves of size of text sample vs. number of patterns, we attempted to estimate at that time the completeness of the sublanguage patterns we obtained.</Paragraph> <Paragraph position="1"> More recently there has been a surge of interest in such corpus-based studies of lexical co-occurrence patterns (e.g., \[10,11,12,13\]). The recent volume edited by Zernik \[14\] reviews many of these efforts. We mention only two of these here, one seeking a similar range of patterns, the other using several evaluation methods.</Paragraph> <Paragraph position="2"> Velardi et al. \[11\] are using co-occurrence data to build a &quot;semantic lexicon&quot; with information about the conceptual classes of the arguments and modifiers of lexical items. This information is closely related to our selectional patterns, although the functional relations are semantic or conceptual whereas ours are syntactic. They use manually encoded coarse-grained selectional constraints to limit the patterns which are generated. 
No evaluation results have yet been reported.</Paragraph> <Paragraph position="3"> Hindle and Rooth \[10\] have used co-occurrence data to determine whether prepositional phrases should be attached to a preceding noun or verb.</Paragraph> <Paragraph position="4"> Unambiguous cases in the corpus are identified first; co-occurrence statistics based on these are then used iteratively to resolve ambiguous cases.</Paragraph> <Paragraph position="5"> A detailed evaluation of the predictive power of the resulting patterns is provided, comparing the patterns against human judgements over a set of 1909 sentences, and analyzing the error rate in terms of the type of verb and noun association.</Paragraph> </Section> </Paper>