<?xml version="1.0" standalone="yes"?> <Paper uid="W05-1011"> <Title>Approximate Searching for Distributional Similarity</Title> <Section position="9" start_page="102" end_page="103" type="concl"> <SectionTitle> 8 Conclusion </SectionTitle> <Paragraph position="0"> We have integrated a nearest-neighbour approximation data structure, the Spatial Approximation Sample Hierarchy (SASH), with a state-of-the-art distributional similarity system. In the process we have extended the original SASH construction algorithms (Houle, 2003b) to deal with the non-uniform distribution of words within semantic space.</Paragraph> <Paragraph position="1"> We intend to test other similarity measures and node ordering strategies, including a more linguistic analysis using WordNet, and to further explore the interaction between the canonical vector heuristic and the SASH. The larger 300-word evaluation set used by Curran (2004) will be used, combined with a more detailed analysis. Finally, we plan to optimise our SASH implementation so that it is comparable with the highly optimised nearest-neighbour code.</Paragraph> <Paragraph position="2"> The result is distributional similarity calculated three times faster than existing systems with only a minor accuracy penalty.</Paragraph> </Section> </Paper>