<?xml version="1.0" standalone="yes"?>
<Paper uid="N06-2028">
  <Title>Extracting Salient Keywords from Instructional Videos Using Joint Text, Audio and Visual Cues</Title>
  <Section position="5" start_page="111" end_page="111" type="evalu">
    <SectionTitle>
4 Experimental Results
</SectionTitle>
    <Paragraph position="0"> Four DHS videos were used in the experiment. They cover diverse topics, ranging from the history of bio-terrorism and weapons of mass destruction to school preparation for terrorism. Video length varies from 30 minutes to 2 hours, and each video contains a variety of sub-topics. Video transcripts were acquired by extracting the closed captions with our own application.</Paragraph>
    <Paragraph position="1"> To evaluate system performance, we compare the keywords generated by our system against the human-generated gold standard. Note that for this experiment, we only consider nouns and noun phrases as keywords.</Paragraph>
    <Paragraph position="2"> To collect the ground truth, we invited a few human evaluators, showed them the four test videos, and presented them with all candidate keywords extracted by GlossEx.</Paragraph>
    <Paragraph position="3"> We then asked them to label all keywords that they considered to be domain-specific, guided by the following question: &quot;would you be satisfied if you get this video when you use this keyword as a search term?&quot;.</Paragraph>
    <Paragraph position="4"> Table 1 shows the number of candidate keywords and manually labeled salient keywords for all four test videos.</Paragraph>
    <Paragraph position="5"> As we can see, approximately 50% of candidate keywords were judged to be domain-specific by humans.</Paragraph>
    <Paragraph position="6"> Based on this observation, we selected the top 50% of keywords ranked by adjusted salience and examined their presence in the pool of salient keywords for each video. On average, 82% of the salient keywords were identified within these top 50% of re-ranked keywords. In addition, the audiovisual cues improve precision and recall by 1.1% and 1.5%, respectively.
      Table 1
      videos                       v1     v2     v3     v4
      no. of candidate keywords    477    934    1303   870
      no. of salient keywords      253    370    665    363
      ratio of salient keywords    53%    40%    51%    42%
    </Paragraph>
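    <Paragraph position="7"> The evaluation described above can be illustrated with a minimal sketch. It recomputes the salient-keyword ratios from the counts in Table 1 and shows one plausible way to score the top 50% of a ranked keyword list against a gold standard. The function names, the rounding of the cutoff, and the toy keyword lists are assumptions made for illustration; they are not taken from the system described here.

```python
from math import ceil

# Counts copied from Table 1: (candidate keywords, salient keywords) per video.
TABLE_1 = {
    "v1": (477, 253),
    "v2": (934, 370),
    "v3": (1303, 665),
    "v4": (870, 363),
}


def salient_ratio(candidates, salient):
    """Share of candidate keywords that annotators labeled salient."""
    return salient / candidates


def evaluate_top_fraction(ranked, gold, fraction=0.5):
    """Precision and recall of the top `fraction` of a ranked keyword list.

    `ranked` is ordered from most to least salient; `gold` is the set of
    human-labeled salient keywords.
    """
    cutoff = ceil(len(ranked) * fraction)  # rounding up is an assumption
    selected = set(ranked[:cutoff])
    hits = len(selected.intersection(gold))
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    # Reproduce the ratio row of Table 1 from the two count rows.
    for video, (candidates, salient) in TABLE_1.items():
        print(f"{video}: {salient_ratio(candidates, salient):.0%}")

    # Hypothetical toy data, not taken from the paper.
    ranked = ["anthrax", "smallpox", "vaccine", "school", "agenda", "minute"]
    gold = {"anthrax", "smallpox", "school"}
    precision, recall = evaluate_top_fraction(ranked, gold)
    print(f"precision={precision:.2f} recall={recall:.2f}")
```

    Under this reading, the 82% figure reported above corresponds to the recall value averaged over the four videos, with the cutoff fraction fixed at 0.5 because roughly half of the candidates were judged salient.</Paragraph>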
  </Section>
</Paper>