<?xml version="1.0" standalone="yes"?>
<Paper uid="N06-1053">
<Title>Towards Spoken-Document Retrieval for the Internet: Lattice Indexing For Large-Scale Web-Search Architectures</Title>
<Section position="7" start_page="419" end_page="421" type="concl">
<SectionTitle> 6 Conclusion </SectionTitle>
<Paragraph position="0"> This paper addressed the task of searching audio content from the Internet. Aiming to maximize reuse of existing web-search engines, we investigated how best to represent important lattice properties - recognition alternates with scores, time boundaries, and phrase-matching constraints - in a form suitable for large-scale web-search engines, while requiring only limited code changes.</Paragraph>
<Paragraph position="1"> ... text from meta-data, the system displays recognition-transcript snippets around the audio hits, e.g. &quot;... bird flu has been a ...&quot; in the first document. Clicking on a word in a snippet starts playing back the video at that position using the embedded video player.</Paragraph>
<Paragraph position="2"> The proposed method, Time-based Merging for Indexing (TMI), first converts the word lattice to a posterior-probability representation and then merges word hypotheses with similar time boundaries to reduce the index size.</Paragraph>
<Paragraph position="3"> Four approximations were presented, which differ in index size and in the strictness of their phrase-matching constraints.</Paragraph>
<Paragraph position="4"> Results were presented for three typical types of web audio content - podcasts, video clips, and online lectures - for phrase spotting and relevance ranking.
Using TMI indexes only five times larger than the corresponding linear-text indexes, accuracy improved over searching top-1 transcripts by 25-35% for word spotting and by 14% for relevance ranking, very close to the gain from directly searching unindexed lattices.</Paragraph>
<Paragraph position="5"> Practical feasibility was demonstrated by a research prototype with 780 hours of indexed audio, which completes searches within 0.5 seconds.</Paragraph>
<Paragraph position="6"> To our knowledge, this is also the first paper to report speech recognition results for podcasts.</Paragraph>
</Section>
</Paper>