File Information
File: 05-lr/acl_arc_1_sum/cleansed_text/xml_by_section/concl/06/p06-1027_concl.xml
Size: 2,052 bytes
Last Modified: 2025-10-06 13:55:13
<?xml version="1.0" standalone="yes"?> <Paper uid="P06-1027"> <Title>Sydney, July 2006. c(c)2006 Association for Computational Linguistics Semi-Supervised Conditional Random Fields for Improved Sequence Segmentation and Labeling</Title> <Section position="7" start_page="214" end_page="214" type="concl"> <SectionTitle> 6 Conclusions and further directions </SectionTitle> <Paragraph position="0"> We have presented a new semi-supervised training algorithm for CRFs, based on extending minimum conditional entropy regularization to the structured prediction case. Our approach is motivated by the information-theoretic argument (Grandvalet and Bengio 2004; Roberts et al. 2000) that unlabeled examples can provide the most benefit when classes have small overlap. An iterative ascent optimization procedure was developed for this new criterion, which exploits a nested dynamic programming approach to efficiently compute the covariance matrix of the features.</Paragraph> <Paragraph position="1"> We applied our new approach to the problem of identifying gene name occurrences in biological text, exploiting the availability of auxiliary unlabeled data to improve the performance of the state of the art supervised CRF approach in this domain. Our semi-supervised CRF approach shares all of the benefits of the standard CRF training, including the ability to exploit arbitrary features of the inputs, while obtaining improved accuracy through the use of unlabeled data. The main drawback is that training time is increased because of the extra nested loop needed to calculate feature covariances. Nevertheless, the algorithm is sufficiently efficient to be trained on unlabeled data sets that yield a notable improvement in classification accuracy over standard supervised training. To further accelerate the training process of our semi-supervised CRFs, we may apply stochastic gradient optimization method with adaptive gain adjustment as proposed by Vishwanathan et al.</Paragraph> <Paragraph position="2"> (2006).</Paragraph> </Section> class="xml-element"></Paper>