<?xml version="1.0" standalone="yes"?> <Paper uid="P06-1040"> <Title>Expressing Implicit Semantic Relations without Supervision</Title> <Section position="5" start_page="313" end_page="314" type="relat"> <SectionTitle> 3 Related Work </SectionTitle> <Paragraph position="0"> Hearst (1992) describes a method for finding patterns like &quot;Y such as the X&quot;, but her method requires human judgement. Berland and Charniak (1999) use Hearst's manual procedure.</Paragraph> <Paragraph position="1"> Riloff and Jones (1999) use a mutual bootstrapping technique that can find patterns automatically, but the bootstrapping requires an initial seed of manually chosen examples for each class of words. Miller et al. (2000) propose an approach to relation extraction that was evaluated in the Seventh Message Understanding Conference (MUC7). Their algorithm requires labeled examples of each relation. Similarly, Zelenko et al. (2003) use a supervised kernel method that requires labeled training examples.</Paragraph> <Paragraph position="2"> Agichtein and Gravano (2000) also require training examples for each relation. Brin (1998) uses bootstrapping from seed examples of author:title pairs to discover patterns for mining further pairs.</Paragraph> <Paragraph position="3"> Yangarber et al. (2000) and Yangarber (2003) present an algorithm that can find patterns automatically, but it requires an initial seed of manually designed patterns for each semantic relation. Stevenson (2004) uses WordNet to extract relations from text, but also requires initial seed patterns for each relation.</Paragraph> <Paragraph position="4"> Lapata (2002) examines the task of expressing the implicit relations in nominalizations, which are noun compounds whose head noun is derived from a verb and whose modifier can be interpreted as an argument of the verb. In contrast with this work, our algorithm is not restricted to nominalizations. Section 6 shows that our algorithm works with arbitrary noun compounds and the SAT questions in Section 5 include all nine possible pairings of nouns, verbs, and adjectives.</Paragraph> <Paragraph position="5"> As far as we know, our algorithm is the first unsupervised learning algorithm that can find patterns for semantic relations, given only a large corpus (e.g., in our experiments, about 10105 x words) and a moderately sized set of word pairs (e.g., 600 or more pairs in the experiments), such that the members of each pair appear together frequently in short phrases in the corpus. These word pairs are not seeds, since the algorithm does not require the pairs to be labeled or grouped; we do not assume they are homogenous.</Paragraph> <Paragraph position="6"> The word pairs that we need could be generated automatically, by searching for word pairs that co-occur frequently in the corpus. However, our evaluation methods (Sections 5 and 6) both involve a predetermined list of word pairs. If our algorithm were allowed to generate its own word pairs, the overlap with the predetermined lists would likely be small. This is a limitation of our evaluation methods rather than the algorithm.</Paragraph> <Paragraph position="7"> Since any two word pairs may have some relations in common and some that are not shared, our algorithm generates a unique list of patterns for each input word pair. For example, mason:stone and carpenter:wood share the pattern &quot;X carves Y&quot;, but the patterns &quot;X nails Y&quot; and &quot;X bends Y&quot; are unique to carpenter:wood. 
<Paragraph position="8"> Turney (2005) gives an algorithm for measuring the relational similarity between two pairs of words, called Latent Relational Analysis (LRA).</Paragraph>
<Paragraph position="9"> This algorithm can be used to solve multiple-choice word analogy questions and to classify noun-modifier pairs (Turney, 2005), but it does not attempt to express the implicit semantic relations. Turney (2005) maps each pair X:Y to a high-dimensional vector v. The value of each element v_i in v is based on the frequency, for the pair X:Y, of a corresponding pattern P_i.</Paragraph>
<Paragraph position="10"> The relational similarity between two pairs, X_1:Y_1 and X_2:Y_2, is derived from the cosine of the angle between their two vectors. A limitation of this approach is that the semantic content of the vectors is difficult to interpret; the magnitude of an element v_i is not a good indicator of how well the corresponding pattern P_i expresses a relation of X:Y. This claim is supported by the experiments in Sections 5 and 6.</Paragraph>
<Paragraph position="11"> Pertinence (as defined in Section 2) builds on the measure of relational similarity in Turney (2005), but it has the advantage that the semantic content can be interpreted; we can point to specific patterns and say that they express the implicit relations. Furthermore, we can use the patterns to find other pairs with the same relations. Hearst (1992) processed her text with a part-of-speech tagger and a unification-based constituent analyzer. This makes it possible to use more general patterns. For example, instead of the literal string pattern &quot;Y such as the X&quot;, where X and Y are words, Hearst (1992) used the more abstract pattern &quot;NP_0 such as NP_1&quot;, where NP_i represents a noun phrase. For the sake of simplicity, we have avoided part-of-speech tagging, which limits us to literal patterns. We plan to experiment with tagging in future work.</Paragraph>
</Section>
</Paper>