<?xml version="1.0" standalone="yes"?>
<Paper uid="N06-1053">
  <Title>Towards Spoken-Document Retrieval for the Internet: Lattice Indexing For Large-Scale Web-Search Architectures</Title>
  <Section position="7" start_page="419" end_page="421" type="concl">
    <SectionTitle>
6 Conclusion
</SectionTitle>
    <Paragraph position="0"> We targeted the paper to the task of searching audio content from the Internet. Aiming at maximizing reuse of existing web-search engines, we investigated how best to represent important lattice properties - recognition alternates with scores, time boundaries, and phrase-matching constraints - in a form suitable for large-scale web-search engines, while requiring only limited code changes.</Paragraph>
    <Paragraph position="1"> [Figure caption fragment; beginning lost in extraction] ... text from meta-data, the system displays recognition-transcript snippets around the audio hits, e.g. &quot;... bird flu has been a ...&quot; in the first document. Clicking on a word in a snippet starts playing back the video at that position using the embedded video player.</Paragraph>
    <Paragraph position="2"> The proposed method, Time-based Merging for Indexing (TMI), first converts the word lattice to a posterior-probability representation and then merges word hypotheses with similar time boundaries to reduce the index size.</Paragraph>
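The merging step described above can be illustrated with a small sketch. It is not the TMI algorithm as published: it assumes word hypotheses are already available as (word, start, end, posterior) tuples, and it uses a hypothetical fixed time-slot width, `slot`, to decide which boundaries count as similar. Same-word hypotheses whose quantized boundaries coincide are merged by summing their posteriors.

```python
from collections import defaultdict


def merge_hypotheses(hyps, slot=0.1):
    """Merge same-word hypotheses whose boundaries fall into the same time slot.

    hyps: iterable of (word, start_sec, end_sec, posterior) tuples.
    slot: hypothetical merging resolution in seconds (an assumption,
          not a value taken from the paper).
    """
    merged = defaultdict(float)
    for word, start, end, post in hyps:
        # Quantize both boundaries; hypotheses that share a key are merged.
        key = (word, round(start / slot), round(end / slot))
        # Posteriors of merged hypotheses are summed.
        merged[key] += post
    return dict(merged)


# Illustrative input: two near-identical "flu" hypotheses and one alternate.
hyps = [
    ("flu", 0.48, 0.91, 0.5),
    ("flu", 0.52, 0.93, 0.25),
    ("flew", 0.50, 0.92, 0.25),
]
merged = merge_hypotheses(hyps)
```

In this toy input the two "flu" hypotheses collapse into a single index entry, so three lattice entries become two. A larger `slot` merges more aggressively, giving a smaller index at the cost of coarser time boundaries.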
    <Paragraph position="3"> Four approximations were presented, which differ in index size and in the strictness of the phrase-matching constraints.</Paragraph>
    <Paragraph position="4"> Results were presented for three typical types of web audio content - podcasts, video clips, and online lectures - for phrase spotting and relevance ranking. Using TMI indexes that are only five times larger than corresponding linear-text indexes, accuracy was improved over searching top-1 transcripts by 25-35% for word spotting and 14% for relevance ranking, very close to what is gained by a direct search of unindexed lattices.</Paragraph>
    <Paragraph position="5"> Practical feasibility has been demonstrated by a research prototype with 780 hours of indexed audio, which completes searches within 0.5 seconds.</Paragraph>
    <Paragraph position="6"> To our knowledge, this is also the first paper to report speech recognition results for podcasts.</Paragraph>
  </Section>
</Paper>