<?xml version="1.0" standalone="yes"?>
<Paper uid="N03-2007">
  <Title>Active Learning for Classifying Phone Sequences from Unsupervised Phonotactic Models</Title>
  <Section position="7" start_page="0" end_page="0" type="concl">
    <SectionTitle>6 Discussion</SectionTitle>
    <Paragraph position="0">By actively choosing the examples with the lowest confidence scores first, we can get the same classification results with around 60-70% of the utterances labeled in HMIHY and TTSHD. But we want to optimize labeling effort, which is presumably some combination of a fixed amount of effort per utterance plus a &quot;listening effort&quot; proportional to utterance length. We therefore augmented our active learning selection to include a constraint on the length of the utterances, measured in recognized phones.</Paragraph>
    <Paragraph position="1">If we simply take effort to be proportional to the number of phones in the utterances selected (likely to result in a conservative estimate of savings), the effort reduction at 4,000 utterances is around 30% even for the more complex HMIHY domain. Further investigation is needed into the best way to measure overall labeling effort, and into refinements of the active learning process to optimize that labeling effort.</Paragraph>
  </Section>
</Paper>
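
The selection strategy and effort metric described above can be sketched in a few lines of Python. This is an illustrative sketch only: the paper does not publish code, and the field names (confidence, num_phones), the length threshold, and the batch size are assumptions chosen to mirror the text, not the authors' implementation.

# Sketch of length-constrained, confidence-ranked active learning selection.
# Assumes each utterance carries a classifier confidence score and a count
# of recognized phones; both field names are hypothetical.

def select_for_labeling(utterances, batch_size=4000, max_phones=None):
    """Pick utterances to label next: lowest confidence first,
    optionally skipping utterances longer than max_phones recognized phones."""
    candidates = [u for u in utterances
                  if max_phones is None or u["num_phones"] <= max_phones]
    candidates.sort(key=lambda u: u["confidence"])  # least confident first
    return candidates[:batch_size]

def labeling_effort(selected):
    """Effort taken as proportional to total recognized phones,
    the conservative estimate discussed in the text."""
    return sum(u["num_phones"] for u in selected)

Under this metric, the effort reduction reported in the section would be computed by comparing labeling_effort for the actively selected batch against that of an equally sized randomly selected batch.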