<?xml version="1.0" standalone="yes"?>
<Paper uid="N06-2028">
  <Title>Extracting Salient Keywords from Instructional Videos Using Joint Text, Audio and Visual Cues</Title>
  <Section position="6" start_page="111" end_page="111" type="concl">
    <SectionTitle>
5 Conclusion and Future Work
</SectionTitle>
    <Paragraph position="0"> We described a multimodal feature-based system for extracting salient keywords from instructional videos. The system draws on a richer set of information cues, including not only linguistic and statistical knowledge but also the sound classes and characteristic visual content types available in videos. Experiments conducted on the DHS videos have shown that incorporating multimodal features is useful for extracting salient keywords from videos.</Paragraph>
    <Paragraph position="1"> We are currently running more sophisticated experiments on different ways to exploit additional audio-visual cues. There is also room to improve the calculation of the incentive values of keywords. Next, we plan to conduct an extensive comparison between GlossEx and the proposed scheme.</Paragraph>
  </Section>
</Paper>