New Fourier-Statistical Features in RGB Space for Video Text Detection

Bibliographic Details
Published in	IEEE Transactions on Circuits and Systems for Video Technology, Vol. 20, no. 11, pp. 1520-1532
Main Authors	Shivakumara, P.; Trung Quy Phan; Chew Lim Tan
Format	Journal Article
Language	English
Published	New York, NY: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.11.2010
Subjects	Aggregation; Applied sciences; Automatic classification; Background; Classification; Color space; Database; Electronic mail; Equations; Exact sciences and technology; False positive; Feature based; Feature extraction; Fonts; Fourier statistical features; Fourier transformation; Fourier transforms; Frames; Heuristic method; Image color analysis; Image edge detection; Information, signal and communications theory; K means algorithm; K means clustering; Pixel; Pixels; Projection; Robustness; Signal and communications theory; Signal classification; Signal representation. Spectral analysis; Signal, noise; Studies; Telecommunications and information theory; Text detection; Text frames classification; Texts; Visualization
Summary:	In this paper, we propose new Fourier-statistical features (FSF) in RGB space for detecting text in video frames with unconstrained backgrounds and with text of different fonts, scripts, and font sizes. The paper consists of two parts: automatic classification of text frames from a large database of text and non-text frames, and text detection in the classified text frames using FSF in RGB space. For text frame classification, we present novel features based on three visual cues, namely sharpness in filter-edge maps, straightness of the edges, and proximity of the edges, to identify a true text frame. For text detection in video frames, we present new Fourier-transform-based features in RGB space combined with statistical features; the FSF computed from the RGB bands are subjected to K-means clustering to separate text pixels from the background of the frame. Text blocks are then determined from the classified text pixels by analyzing projection profiles. Finally, we introduce a few heuristics to eliminate false positives from the frame. The robustness of the proposed approach is tested by experiments on a variety of frames with low contrast, complex backgrounds, and text of different fonts and sizes. Both our own test dataset and a publicly available dataset are used for the experiments. The experimental results show that the proposed approach is superior to existing approaches in terms of detection rate, false positive rate, and misdetection rate.
ISSN:	1051-8215, 1558-2205
DOI:	10.1109/TCSVT.2010.2077772
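
Illustrative sketch:	The detection stage described in the summary (Fourier-based features per RGB band, K-means clustering, projection profiles) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The block size, the two statistics used (mean and standard deviation of the FFT magnitude), the centroid initialisation, and the function names `block_fourier_stats`, `kmeans_two`, and `detect_text_rows` are all assumptions chosen for illustration; the record does not specify the paper's actual feature set or window layout. The sketch works on blocks rather than individual pixels to stay short.

```python
import numpy as np


def block_fourier_stats(band, block=8):
    """Return per-block (mean, std) of the 2-D FFT magnitude of one colour band.

    Assumption: these two statistics stand in for the paper's statistical
    features, which this record does not enumerate.
    """
    h, w = band.shape
    rows, cols = h // block, w // block
    feats = np.zeros((rows, cols, 2))
    for r in range(rows):
        for c in range(cols):
            patch = band[r * block:(r + 1) * block,
                         c * block:(c + 1) * block].astype(float)
            mag = np.abs(np.fft.fft2(patch))
            mag[0, 0] = 0.0  # drop the DC term so flat regions score zero
            feats[r, c] = (mag.mean(), mag.std())
    return feats


def kmeans_two(x, iters=20):
    """Plain 2-means clustering on the rows of x.

    Assumption: centroids start at the samples with the smallest and the
    largest feature norm, which makes the result deterministic.
    """
    norms = np.linalg.norm(x, axis=1)
    centers = np.stack([x[norms.argmin()], x[norms.argmax()]])
    labels = np.zeros(len(x), dtype=int)
    for _ in range(iters):
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = x[labels == k].mean(axis=0)
    return labels, centers


def detect_text_rows(frame, block=8):
    """Cluster blocks of an H x W x 3 RGB frame into text and background.

    Returns a boolean block mask and its horizontal projection profile
    (number of text blocks in each block row).
    """
    # Concatenate the features of the R, G and B bands for every block.
    feats = np.concatenate(
        [block_fourier_stats(frame[..., ch], block) for ch in range(3)],
        axis=2,
    )
    rows, cols, dim = feats.shape
    labels, centers = kmeans_two(feats.reshape(-1, dim))
    # Assumption: the cluster with the larger centroid norm (more
    # high-frequency energy) is taken to be text.
    text_cluster = np.linalg.norm(centers, axis=1).argmax()
    mask = (labels == text_cluster).reshape(rows, cols)
    profile = mask.sum(axis=1)
    return mask, profile
```

Calling `detect_text_rows(frame)` on an RGB array returns the block mask together with a profile whose non-zero runs mark candidate text lines. A frame with no texture at all would put every block in one cluster, so a real pipeline would still need the frame classification step and the false-positive heuristics that the summary mentions.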