<?xml version="1.0" standalone="yes"?> <Paper uid="W98-1214"> <Title>CHOOSING A DISTANCE METRIC FOR AUTOMATIC WORD CATEGORIZATION</Title> <Section position="7" start_page="111" end_page="111" type="concl"> <SectionTitle> 6 Discussion And Conclusion </SectionTitle> <Paragraph position="0"> This research has focused on the use of distance functions in an unsupervised, bottom-up algorithm for automatic word categorization. The results suggest that natural language implicitly preserves the information needed to acquire its linguistic categories. A convergence of linguistic categories could be obtained with the algorithm we have presented. This result motivates further studies on the acquisition of structures preserved in natural language at various abstraction levels. [Running header: Korkmaz and Göktürk Üçoluk, Choosing A Distance Metric for Word Categorization. Figure: word clusters (e.g. world, earth, streets, fire, forest, city, floor, horses, room, garden, light, darkness, path, house, door, river, gate, village, sun, country, windows).]</Paragraph> <Paragraph position="1"> Different distance metrics were used with the algorithm. The results obtained with the Combined Metric show that specialized distance metrics combining different properties of linguistic elements could be developed for linguistic categorization.</Paragraph> <Paragraph position="2"> Based on the results of the experiments, the following remarks can be made about the linguistic clusters formed in the study.</Paragraph> <Paragraph position="3"> The success rate obtained in the initial clusters is satisfactory. 
Though it was not possible to combine these initial clusters into exact linguistic categories, the cluster hierarchy obtained with the Combined Metric is encouraging. The faulty placements are mainly due to the very complex structure of natural language. The fact that many words can be used with different linguistic roles in natural language sentences produces deviations in the information given by the bigrams. Using fuzzy logic and a suitable distance metric is a way to decrease these deviations; however, it was not possible to remove them totally.</Paragraph> </Section> </Paper>