<?xml version="1.0" standalone="yes"?> <Paper uid="P98-2231"> <Title>Structural Disambiguation Based on Reliable Estimation of Strength of Association Haodong Wu Eduardo de Paiva Alves</Title> <Section position="2" start_page="0" end_page="1416" type="intro"> <SectionTitle> 1 Introduction </SectionTitle> <Paragraph position="0"> The strength of association between words provides lexical preferences for ambiguity resolution. It is usually estimated from statistics on word co-occurrences in large corpora (Hindle and Rooth, 1993). A problem with this approach is how to estimate the probability of word co-occurrences that are not observed in the training corpus. There are two main approaches to estimating this probability: smoothing methods (e.g., Church and Gale, 1991; Jelinek and Mercer, 1985; Katz, 1987) and class-based methods (e.g., Brown et al., 1992; Pereira and Tishby, 1992; Resnik, 1992; Yarowsky, 1992).</Paragraph> <Paragraph position="1"> Smoothing methods estimate the probability of unobserved co-occurrences by using the frequencies of the individual words. For example, when eat and bread do not co-occur, the probability of (eat, bread) would be estimated by using the frequencies of (eat) and (bread).</Paragraph> <Paragraph position="2"> A problem with this approach is that it pays no attention to the distributional characteristics of the individual words in question. Using this method, the probabilities of (eat, bread) and (eat, cars) would be the same whenever bread and cars have the same frequency. This is unacceptable from a linguistic point of view.</Paragraph> <Paragraph position="3"> Class-based methods, on the other hand, estimate the probabilities by associating a class with each word and collecting statistics on word class co-occurrences. 
For instance, instead of calculating the probability of (eat, bread) directly, these methods associate eat with the class [ingest] and bread with the class [food] and collect statistics on the classes [ingest] and [food]. The accuracy of the estimation depends on the choice of classes, however. Some class-based methods (e.g., Yarowsky, 1992) associate each word with a single class without considering the other words in the co-occurrence. However, a word may need to be replaced by different classes depending on the co-occurrence. Some classes may not have enough occurrences to allow a reliable estimation, while other classes may be too general and include too many words not relevant to the estimation. An alternative is to obtain various classes associated in a taxonomy with the words in question and select the classes according to certain criteria.</Paragraph> <Paragraph position="4"> There are a number of ways to select the classes used in the estimation. Weischedel et al.</Paragraph> <Paragraph position="5"> (1993) chose the lowest classes in a taxonomy for which the association for the co-occurrence can be estimated. This approach may result in unreliable estimates, since some of the class co-occurrences used may be attributed to chance.</Paragraph> <Paragraph position="6"> Resnik (1993) selected all pairs of classes corresponding to the head of a prepositional phrase and weighted them to bias the computation of the association in favor of higher-frequency co-occurrences, which he considered &quot;more reliable.&quot; Contrary to this assumption, high-frequency co-occurrences are unreliable when the probability that the co-occurrence may be attributed to chance is high.</Paragraph> <Paragraph position="7"> In this paper we propose a class-based method that selects the lowest classes in a taxonomy for which the co-occurrence confidence is above a threshold. 
We then apply the method to resolving structural ambiguities in Japanese dependency structures and English prepositional phrase attachments.</Paragraph> </Section> </Paper>