<?xml version="1.0" standalone="yes"?>
<Paper uid="N06-2028">
<Title>Extracting Salient Keywords from Instructional Videos Using Joint Text, Audio and Visual Cues</Title>
<Section position="5" start_page="111" end_page="111" type="evalu">
<SectionTitle> 4 Experimental Results </SectionTitle>
<Paragraph position="0"> Four DHS videos were used in the experiment, covering diverse topics ranging from the history of bio-terrorism and weapons of mass destruction to school preparation for terrorism. Video length varies widely, from 30 minutes to 2 hours. Each video also contains a variety of sub-topics. Video transcripts were acquired by extracting the closed captions with our own application.</Paragraph>
<Paragraph position="1"> To evaluate system performance, we compare the keywords generated by our system against a human-generated gold standard. Note that for this experiment, we consider only nouns and noun phrases as keywords.</Paragraph>
<Paragraph position="2"> To collect the ground truth, we invited a few human evaluators, showed them the four test videos, and presented them with all candidate keywords extracted by GlossEx.</Paragraph>
<Paragraph position="3"> We then asked them to label all keywords that they considered to be domain-specific, guided by the following question: &quot;would you be satisfied if you get this video when you use this keyword as a search term?&quot;</Paragraph>
<Paragraph position="4"> Table 1 shows the number of candidate keywords and manually labeled salient keywords for all four test videos.</Paragraph>
<Paragraph position="5"> As we can see, approximately 50% of candidate keywords were judged to be domain-specific by humans.</Paragraph>
<Paragraph position="6"> Based on this observation, we selected the top 50% of keywords ranked by adjusted salience and examined their presence in the pool of salient keywords for each video.
As a result, an average of 82% of salient keywords were identified within these top 50% of re-ranked keywords. In addition, the audiovisual cues improve precision and recall by 1.1% and 1.5% respectively.

Table 1:
videos                      v1    v2    v3    v4
no. of candidate keywords   477   934   1303  870
no. of salient keywords     253   370   665   363
ratio of salient keywords   53%   40%   51%   42%
</Paragraph>
</Section>
</Paper>
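<!--
Editorial sketch (not part of the original paper): the arithmetic behind the reported ratios, and the top 50% check described in the section, can be illustrated roughly as follows. The counts are copied from the table above. The function names, the dictionary layout, and the choice to round the cutoff down are assumptions made for illustration only; the paper does not describe its implementation.

```python
# Counts copied from the table of candidate and salient keywords.
CANDIDATES = {"v1": 477, "v2": 934, "v3": 1303, "v4": 870}
SALIENT = {"v1": 253, "v2": 370, "v3": 665, "v4": 363}


def salient_ratio(video):
    """Fraction of a video's candidate keywords that humans labeled salient."""
    return SALIENT[video] / CANDIDATES[video]


def coverage_at_top_half(ranked_keywords, salient_keywords):
    """Share of salient keywords that appear in the top 50% of a ranked list.

    ranked_keywords: candidates sorted best first by some salience score
                     (hypothetical input; the ranking itself is not shown here).
    salient_keywords: human-labeled gold standard for the same video.
    """
    cutoff = len(ranked_keywords) // 2  # assumption: round down for odd lengths
    top_half = set(ranked_keywords[:cutoff])
    gold = set(salient_keywords)
    if not gold:
        return 0.0
    return len(top_half & gold) / len(gold)


# Per-video ratios as whole percentages, matching the last table row.
ratios = {video: round(100 * salient_ratio(video)) for video in CANDIDATES}
print(ratios)
```

Averaging coverage_at_top_half over the four videos would correspond to the kind of figure the section reports as 82%, although the exact averaging used by the authors is not stated.
-->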