<?xml version="1.0" standalone="yes"?>
<Paper uid="P99-1041">
<Title>Automatic Identification of Non-compositional Phrases</Title>
<Section position="8" start_page="319" end_page="320" type="evalu">
<SectionTitle>6 Evaluation</SectionTitle>
<Paragraph position="0"> There is not yet a well-established methodology for evaluating automatically acquired lexical knowledge. One possibility is to compare the automatically identified relationships with relationships listed in a manually compiled dictionary. For example, (Lin, 1998) compared an automatically created thesaurus with WordNet (Miller et al., 1990) and Roget's Thesaurus. However, since the lexicon used in our parser is based on WordNet, phrasal words in WordNet are treated as single words.</Paragraph>
<Paragraph position="1"> For example, "take advantage of" is treated as a transitive verb by the parser. As a result, the extracted non-compositional phrases do not usually overlap with phrasal entries in WordNet.</Paragraph>
<Paragraph position="2"> Therefore, we conducted the evaluation by manually examining sample results. This method was also used to evaluate automatically identified hyponyms (Hearst, 1998), word similarity (Richardson, 1997), and translations of collocations (Smadja et al., 1996).</Paragraph>
<Paragraph position="3"> Our evaluation sample consists of the 5 most frequent open-class words in our parsed corpus: {have, company, make, do, take} and 5 words whose frequencies are ranked from 2000 to 2004: {path, lock, resort, column, gulf}. We examined three types of dependency relationships: verb-object, noun-noun, and adjective-noun. A total of 216 collocations were extracted, shown in Appendix A.</Paragraph>
<Paragraph position="4"> We compared the collocations in Appendix A with the entries for the above 10 words in NTC's English Idioms Dictionary (henceforth NTC-EID) (Spears and Kirkpatrick, 1993), which contains approximately 6000 definitions of idioms. For our evaluation purposes, we selected the idioms in NTC-EID that satisfy both of the following two conditions: (4) a. the head word of the idiom is one of the above 10 words.</Paragraph>
<Paragraph position="5"> b. there is a verb-object, noun-noun, or adjective-noun relationship in the idiom and the modifier in the phrase is not a variable. For example, "take a stab at something" is included in the evaluation, whereas "take something at face value" is not.</Paragraph>
<Paragraph position="6"> There are 249 such idioms in NTC-EID, 34 of which are also found in Appendix A (they are marked with the '+' sign in Appendix A). If we treat the 249 entries in NTC-EID as the gold standard, the precision and recall of the phrases in Appendix A are shown in Table 4. To compare the performance with manually compiled dictionaries, we also computed the precision and recall of the entries in the Longman Dictionary of English Idioms (LDOEI) (Long and Summers, 1979) that satisfy the two conditions in (4). It can be seen that the overlap between manually compiled dictionaries is quite low, reflecting the fact that different lexicographers may have quite different opinions about which phrases are non-compositional.</Paragraph>
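To make the Table 4 computation concrete, here is a minimal Python sketch of scoring an extracted phrase list against a gold-standard idiom list. It is not code from the paper; the phrase sets below are invented stand-ins, and only the counts (216 extracted collocations, 249 gold idioms, 34 shared) come from the text.

```python
def precision_recall(extracted, gold):
    """Score an extracted phrase set against a gold-standard idiom set."""
    overlap = len(extracted & gold)
    return overlap / len(extracted), overlap / len(gold)

# Tiny invented sets standing in for Appendix A and the NTC-EID entries:
extracted = {"take advantage of", "take the fifth amendment", "do the trick"}
gold = {"take advantage of", "do the trick", "take a stab at"}
p, r = precision_recall(extracted, gold)
print(f"precision={p:.3f} recall={r:.3f}")  # precision=0.667 recall=0.667

# With the counts reported here (216 extracted, 249 gold, 34 shared), the
# same computation gives precision = 34/216 ~ 0.157 and recall = 34/249 ~ 0.137.
```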
<Paragraph position="7"> The collocations in Appendix A are classified into three categories. The ones marked with the '+' sign are found in NTC-EID. The ones marked with 'x' are parsing errors (we retrieved from the parsed corpus all the sentences that contain the collocations in Appendix A and determined which collocations are parser errors). The unmarked collocations satisfy condition (3) but are not found in NTC-EID.</Paragraph>
<Paragraph position="8"> Many of the unmarked collocations are clearly idioms, such as "take (the) Fifth Amendment" and "take (its) toll", suggesting that even the most comprehensive dictionaries may have many gaps in their coverage. The method proposed in this paper can be used to improve the coverage of manually created lexical resources.</Paragraph>
<Paragraph position="9"> Most of the parser errors are due to the incompleteness of the lexicon used by the parser. For example, "opt" is not listed in the lexicon as a verb. The lexical analyzer guessed it as a noun, causing the erroneous collocation "(to) do opt". The collocation "trig lock" should be "trigger lock". The lexical analyzer in the parser analyzed "trigger" as the -er form of the adjective "trig" (meaning well-groomed).</Paragraph>
<Paragraph position="10"> Duplications in the corpus can amplify the effect of a single mistake. For example, the following disclaimer occurred 212 times in the corpus: "Annualized average rate of return after expenses for the past 30 days: not a forecast of future returns". The parser analyzed "a forecast of future returns" as [S [NP a forecast of future] [VP returns]]. As a result, (return V:subj:N forecast) satisfied condition (3).</Paragraph>
<Paragraph position="11"> Duplications can also skew the mutual information of correct dependency relationships. For example, the verb-object relationship between "take" and "bride" passed the mutual information filter because there are 4 copies of the article containing this phrase. If we were able to throw away the duplicates and record only one count of "take-bride", it would not have passed the mutual information filter (3).</Paragraph>
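The following Python sketch illustrates this duplication effect. It does not reproduce the paper's condition (3), which is defined earlier in the paper; a generic filter requiring a minimum pair count and a minimum pointwise mutual information stands in for it, and all counts below are invented.

```python
import math
from collections import Counter

def passes_filter(pairs, pair, min_count=3, min_pmi=0.0):
    """Generic stand-in filter: minimum pair count plus minimum PMI."""
    pair_counts = Counter(pairs)
    head_counts = Counter(h for h, _ in pairs)
    dep_counts = Counter(d for _, d in pairs)
    n = len(pairs)
    c = pair_counts[pair]
    if c < min_count:
        return False
    pmi = math.log2((c / n) /
                    ((head_counts[pair[0]] / n) * (dep_counts[pair[1]] / n)))
    return pmi >= min_pmi

# Four duplicated copies of one article contribute four "take bride" pairs;
# deduplicating the corpus first would leave only one.
raw = [("take", "bride")] * 4 + [("take", "advantage")] * 5 + [("make", "deal")] * 5
deduped = [("take", "bride")] * 1 + [("take", "advantage")] * 5 + [("make", "deal")] * 5

print(passes_filter(raw, ("take", "bride")))      # True: duplicates inflate the count
print(passes_filter(deduped, ("take", "bride")))  # False: a single count is filtered out
```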
<Paragraph position="12"> The fact that systematic parser errors tend to pass the mutual information filter is both a curse and a blessing. On the negative side, there is no obvious way to separate the parser errors from true non-compositional expressions. On the positive side, the output of the mutual information filter has a much higher concentration of parser errors than the database that contains millions of collocations. By manually sifting through the output, one can construct a list of frequent parser errors, which can then be incorporated into the parser so that it avoids making these mistakes in the future. Manually going through the output is not unreasonable, because each non-compositional expression has to be dealt with individually in a lexicon anyway.</Paragraph>
<Paragraph position="13"> To find out the benefit of using the dependency relationships identified by a parser instead of simple co-occurrence relationships between words, we also created a database of co-occurrence relationships between part-of-speech tagged words. We aggregated all word pairs that occurred within a 4-word window of each other. The same algorithm and similarity measure used for the dependency database were used to construct a thesaurus from the co-occurrence database.</Paragraph>
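As a rough illustration of this baseline construction, the Python sketch below counts all pairs of tagged words within a 4-word window. The tokenization and tagset are invented; the paper does not specify them for this step.

```python
from collections import Counter

def window_pairs(tagged_tokens, window=4):
    """Yield every ordered pair of tagged words within `window` words of each other."""
    for i, left in enumerate(tagged_tokens):
        for right in tagged_tokens[i + 1 : i + 1 + window]:
            yield (left, right)

# Invented example sentence with invented tags:
tagged = [("take", "V"), ("the", "Det"), ("fifth", "A"),
          ("amendment", "N"), ("today", "N")]
counts = Counter(window_pairs(tagged))
print(counts[(("take", "V"), ("amendment", "N"))])  # 1: three words apart, inside the window
```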
<Paragraph position="14"> Appendix B shows all the word pairs that satisfy condition (3) and that involve one of the 10 words {have, company, make, do, take, path, lock, resort, column, gulf}. It is clear that Appendix B contains far fewer true non-compositional phrases than Appendix A.</Paragraph>
</Section>
</Paper>