<?xml version="1.0" standalone="yes"?>
<Paper uid="W04-2902">
  <Title>Analysis and Processing of Lecture Audio Data: Preliminary Investigations</Title>
  <Section position="1" start_page="0" end_page="0" type="abstr">
    <SectionTitle>
Abstract
</SectionTitle>
    <Paragraph position="0"> In this paper we report on our recent efforts to collect a corpus of spoken lecture material that will enable research directed towards fast, accurate, and easy access to lecture content. Thus far, we have collected a corpus of 270 hours of speech from a variety of undergraduate courses and seminars. We report on an initial analysis of the spontaneous speech phenomena present in these data and the vocabulary usage patterns across three courses. Finally, we examine the perplexities of language models trained on written and spoken materials, and describe an initial recognition experiment on one course.</Paragraph>
  </Section>
</Paper>