News video classification based on multi-modal information fusion

Bibliographic Details
Published in IEEE International Conference on Image Processing 2005, Vol. 1; pp. I-1213
Main Authors Wen-Nung Lie, Chen-Kang Su
Format Conference Proceeding
Language English
Published IEEE 2005

Summary: A multi-modal information fusion technique is presented that integrates closed captions, the anchor's speech, and visual information for TV news video classification. Closed-caption characters are recognized from the video, and single- and double-character phrases are extracted from them for classification. The content of the anchor's speech signal is not recognized; instead, it is labeled with pre-trained cluster means using a level-building DP (dynamic programming) algorithm. Visual information, including color and motion features, is extracted from the news footage for classification. Each of the three information sources is classified individually using either a statistical relevance factor (RF) or an SVM (support vector machine) technique, yielding 7 different classifiers in total. The outputs of these classifiers are then combined by a modified Bayesian technique to produce the fused result. Experiments show that the proposed fusion system increases the classification rate by 14% relative to the best single-modal system. Our Bayesian fusion rule also outperforms the best product rule presented in J. Kittler et al. (1998) by 3%.
ISBN: 9780780391345, 0780391349
ISSN: 1522-4880, 2381-8549
DOI: 10.1109/ICIP.2005.1529975
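
The summary compares the authors' fusion rule against the product rule for combining classifiers. As a rough illustration of that baseline only, the sketch below multiplies per-classifier class posteriors and renormalizes. It is not the paper's modified Bayesian technique, which the summary does not specify. The function names, the three-classifier setup, the class count, and all probability values are hypothetical and chosen purely for the example.

```python
import math


def product_rule(posteriors, priors=None):
    """Fuse class posteriors from several classifiers with the product rule.

    posteriors: list of R lists, each holding K class probabilities.
    priors: optional list of K class priors (uniform if omitted).
    Returns a list of K fused, normalized scores.
    """
    n_clf = len(posteriors)
    n_cls = len(posteriors[0])
    if priors is None:
        priors = [1.0 / n_cls] * n_cls
    eps = 1e-12  # guards against log(0)

    # Work in log space: score_k = -(R-1) * log P(k) + sum_i log P(k | x_i)
    log_scores = []
    for k in range(n_cls):
        score = -(n_clf - 1) * math.log(priors[k] + eps)
        score += sum(math.log(p[k] + eps) for p in posteriors)
        log_scores.append(score)

    # Shift by the maximum before exponentiating, then normalize to sum to 1.
    top = max(log_scores)
    unnorm = [math.exp(s - top) for s in log_scores]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def classify(posteriors, priors=None):
    """Return (index of the winning class, fused scores)."""
    fused = product_rule(posteriors, priors)
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused


# Hypothetical outputs of three single-modal classifiers over three news classes.
caption_probs = [0.6, 0.3, 0.1]
speech_probs = [0.3, 0.5, 0.2]
visual_probs = [0.5, 0.4, 0.1]

label, fused = classify([caption_probs, speech_probs, visual_probs])
print(label, [round(f, 3) for f in fused])
```

With uniform priors the prior term is the same for every class, so the decision reduces to comparing the plain products of the posteriors. Working in log space is a design choice that avoids numerical underflow when many classifiers are combined.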