<?xml version="1.0" standalone="yes"?>
<Paper uid="W99-0701">
<Title>Unsupervised Learning of Word Boundary with Description Length Gain</Title>
<Section position="6" start_page="4" end_page="4" type="concl">
<SectionTitle> 5. Conclusions and Future Work </SectionTitle>
<Paragraph position="0"> We have presented an unsupervised learning algorithm for lexical acquisition based on the goodness measure of description length gain (DLG), formulated in information-theoretic terms. The learning algorithm follows the essence of the MDL principle, searching for the segmentation of an utterance that has the maximal description length gain (and therefore approaches the minimum description length of the utterance). Experiments on word boundary prediction with large-scale corpora have shown the effectiveness of the learning algorithm.</Paragraph>
<Paragraph position="1"> For the time being, however, we are unable to compare the learning performance with other researchers' previous work, simply because they do not report the performance of their learning algorithms in terms of both precision and recall. Moreover, our algorithm is significantly simpler, in that it rests on n-gram counts alone rather than on more complex statistical data or a more sophisticated training procedure.</Paragraph>
<Paragraph position="2"> Our future work will focus on two aspects of lexical learning with the DLG measure. First, we will incorporate the expectation-maximization (EM) algorithm [Dempster et al. 1977] into our lexical learning to see how much the performance can be improved.</Paragraph>
<Paragraph position="3"> Usually, a more sophisticated learning algorithm leads to a better learning result. Second, we will explore hierarchical chunking with the DLG measure. We are particularly interested in how much further compression can be squeezed out of a text corpus (e.g., the Brown corpus) by hierarchical chunking, and how much improvement in recall can be achieved.</Paragraph>
</Section>
</Paper>
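
The search sketched in the conclusion, picking the segmentation of an utterance with maximal summed DLG, can be realised with a Viterbi-style dynamic programme. The Python sketch below is a hypothetical illustration, not the authors' implementation: the scoring function dlg is a placeholder assumption standing in for the corpus-level, n-gram-count-based DLG described in the paper, and the function name, max_chunk parameter, and toy counts are all invented for illustration.

    from typing import Callable, List, Tuple

    def best_segmentation(utterance: str,
                          dlg: Callable[[str], float],
                          max_chunk: int = 10) -> Tuple[List[str], float]:
        """Return the chunk sequence of `utterance` with maximal summed DLG."""
        n = len(utterance)
        # best[i] holds (best total DLG of utterance[:i], start index of its last chunk)
        best = [(float("-inf"), 0) for _ in range(n + 1)]
        best[0] = (0.0, 0)
        for i in range(1, n + 1):
            for j in range(max(0, i - max_chunk), i):
                score = best[j][0] + dlg(utterance[j:i])
                if score > best[i][0]:
                    best[i] = (score, j)
        # Trace back the optimal chunk boundaries.
        chunks, i = [], n
        while i > 0:
            j = best[i][1]
            chunks.append(utterance[j:i])
            i = j
        return list(reversed(chunks)), best[n][0]

    if __name__ == "__main__":
        # Toy stand-in for DLG: reward chunks attested in a tiny "lexicon" and
        # lightly penalise everything else (purely illustrative numbers).
        counts = {"the": 3, "dog": 2, "barks": 2}
        toy_dlg = lambda s: counts.get(s, 0) - 0.5
        print(best_segmentation("thedogbarks", toy_dlg))
        # -> (['the', 'dog', 'barks'], 5.5)

In the paper the DLG of a chunk is computed from token counts over the whole corpus rather than per utterance; the stub above abstracts that away so the dynamic-programming search itself is easy to see.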