A New Multimedia Content Skimming Technique at Arbitrary User-Set Rate Based on Automatic Speech Emphasis Extraction

Bibliographic Details
Published in: International Journal of Human-Computer Interaction, Vol. 23, No. 1-2, pp. 115-129
Main Authors: Hidaka, Kota; Nakajima, Shinya
Format: Journal Article
Language: English
Published: Norwood: Taylor & Francis Group; Lawrence Erlbaum Associates, Inc., 01.01.2007

More Information
Summary: This article proposes a new technique for skimming multimedia content such as video mail, audio/visual data on blog sites, and other consumer-generated media. The proposed method, which is based on the automatic extraction of emphasized speech, locates emphasized portions of speech with high accuracy by using prosodic parameters such as pitch, power, and speaking rate. Because the method does not employ any speech recognition technique, it enables highly robust estimation in noisy environments. To extract emphasized portions of speech, the method introduces a metric, "degree of emphasis," which indicates how strongly each speech segment is emphasized. Given an article, the method computes the degree of emphasis for each speech segment in it. When a user requests a skimmed version of the article's content, the method refers to the user-specified "skimming rate" to collect the emphasized segments. Preference experiments were performed in which participants were asked to choose between the skimmed content created by our method and that created using a fixed-interval approach. The preference rate for our method was about 80%, which suggests that the proposed method generates appropriate content skims.
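
The abstract gives only a high-level account of the selection process, so the following Python sketch is offered purely as an illustration of that account, not as the authors' implementation. The Segment class, the equal feature weights, the z-score normalization, and the greedy duration budget are all assumptions introduced here; the paper itself defines how the degree of emphasis is derived from pitch, power, and speaking rate.

```python
# Illustrative sketch (not the published method): rank speech segments by a
# prosody-based "degree of emphasis" and keep enough top-ranked segments to
# fill a user-specified skimming rate. Prosodic features are assumed to be
# precomputed per segment; the scoring weights are placeholders.

from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    start: float          # segment start time (s)
    end: float            # segment end time (s)
    pitch: float          # mean F0 of the segment (Hz)
    power: float          # mean energy of the segment
    speaking_rate: float  # e.g. syllables per second

    @property
    def duration(self) -> float:
        return self.end - self.start


def _zscore(values: List[float]) -> List[float]:
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = var ** 0.5 or 1.0
    return [(v - mean) / std for v in values]


def degree_of_emphasis(segments: List[Segment]) -> List[float]:
    """Combine normalized prosodic features into one emphasis score per segment.

    Equal weights are a guess; the paper derives its own scoring from
    pitch, power, and speaking rate.
    """
    pitch = _zscore([s.pitch for s in segments])
    power = _zscore([s.power for s in segments])
    # Emphasized speech is often slower, so the rate term is negated here.
    rate = _zscore([-s.speaking_rate for s in segments])
    return [(p + w + r) / 3.0 for p, w, r in zip(pitch, power, rate)]


def skim(segments: List[Segment], skimming_rate: float) -> List[Segment]:
    """Return the most emphasized segments whose total duration is roughly
    `skimming_rate` (0-1) times the whole article, in original time order."""
    total = sum(s.duration for s in segments)
    budget = skimming_rate * total
    scores = degree_of_emphasis(segments)
    ranked = sorted(zip(scores, segments), key=lambda x: x[0], reverse=True)

    picked, used = [], 0.0
    for _, seg in ranked:
        if used + seg.duration > budget and picked:
            break
        picked.append(seg)
        used += seg.duration
    return sorted(picked, key=lambda s: s.start)
```

Under these assumptions, skim(segments, 0.2) would return approximately the most emphasized 20% of the article, played back in its original order.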
ISSN: 1044-7318, 1532-7590
DOI: 10.1080/10447310701363015