<?xml version="1.0" standalone="yes"?> <Paper uid="W05-1006"> <Title>Automatic Extraction of Idioms using Graph Analysis and Asymmetric Lexicosyntactic Patterns</Title> <Section position="6" start_page="54" end_page="54" type="concl"> <SectionTitle> 6 Conclusions and Further Work </SectionTitle> <Paragraph position="0"> It is possible to extract asymmetric constructions from text, some of which correspond to indecomposable idioms (those whose meaning cannot be derived by combining the meanings of their constituent words).</Paragraph> <Paragraph position="1"> Many other extracted phrases exhibit a typical directionality that follows from underlying semantic principles. While these are not always classed as 'idioms' (because they are still decomposable), knowledge of their asymmetric behaviour is necessary for a system to generate natural language utterances that sound 'idiomatic' to native speakers.</Paragraph> <Paragraph position="2"> While all of this information is useful for correctly interpreting and generating natural language, further work is necessary to distinguish accurately between these different categories. The first step will be to classify the results manually and to evaluate different classification techniques, to see whether they can reliably identify the different types of idiom and distinguish them from false positives that were mistakenly extracted. Once some of these techniques have been evaluated, we will be in a better position to extend our approach to larger corpora such as the Web.</Paragraph> </Section> </Paper>