<?xml version="1.0" standalone="yes"?> <Paper uid="W05-0606"> <Title>Computing Word Similarity and Identifying Cognates with Pair Hidden Markov Models</Title> <Section position="9" start_page="45" end_page="46" type="concl"> <SectionTitle> 7 Conclusion </SectionTitle> <Paragraph position="0"> We created a system that learns to recognize word pairs that are similar according to criteria provided during training, and to separate such pairs from those that exhibit no such similarity or whose similarity is due solely to chance. The system is based on Pair Hidden Markov Models, a technique adapted from bioinformatics. We tested a number of training algorithms and model variations on the task of identifying cognates. However, because it does not rely on domain-specific knowledge, our system can be applied to any task that requires computing word similarity, provided that examples of words considered similar in the given context are available.</Paragraph> <Paragraph position="1"> In the future, we would like to extend our system by removing the one-to-one constraint that requires alignments to consist of single symbols. It would also be interesting to test the system in other applications, such as the detection of confusable drug names or word alignment in bitexts.</Paragraph> </Section> </Paper>