An efficient fully unsupervised video object segmentation scheme using an adaptive neural-network classifier architecture

Bibliographic Details
Published in	IEEE Transactions on Neural Networks, vol. 14, no. 3, pp. 616-630
Main Authors	Doulamis, A., Doulamis, N., Ntalianis, K., Kollias, S.
Format	Journal Article
Language	English
Published	United States: IEEE, 01.05.2003
Subjects	Adaptive systems; Costs; Data mining; Degradation; Face detection; Humans; Object segmentation; Tracking; Video sequences; Videoconference

Summary:	This paper proposes an unsupervised video object (VO) segmentation and tracking algorithm based on an adaptable neural-network architecture. The scheme comprises 1) a VO tracking module and 2) an initial VO estimation module. Object tracking is treated as a classification problem and implemented through an adaptive network classifier, which provides better results than conventional motion-based tracking algorithms. Network adaptation is accomplished through an efficient, cost-effective weight updating algorithm that minimally degrades the previous network knowledge while taking the current content conditions into account. A retraining set, constructed from the initial VO estimation results, is used for this purpose. Two scenarios are investigated. The first concerns the extraction of human entities in videoconferencing applications, while the second exploits depth information to identify generic VOs in stereoscopic video sequences. In the first scenario, human face/body detection is based on Gaussian distributions; in the second, segmentation fusion is obtained using color and depth information. A decision mechanism is also incorporated to detect the time instances at which the weights should be updated. Experimental results and comparisons indicate good performance of the proposed scheme even in sequences with complicated content (object bending, occlusion).
ISSN:	1045-9227, 1941-0093
DOI:	10.1109/TNN.2003.810605
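
Illustration:	The summary describes tracking as classification, with the classifier's weights updated from a retraining set so that previous knowledge degrades as little as possible, and with a decision mechanism that chooses when to update. The sketch below is only a rough illustration of that idea and is not the paper's algorithm, which the summary does not specify. It assumes a single-layer logistic classifier in place of the neural network, swaps in a proximal penalty (strength `lam`) that keeps the new weights close to the old ones, and uses a hypothetical error-threshold rule as the decision mechanism. All function names, parameters, and default values are assumptions made for illustration.

```python
import numpy as np


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def predict(w, X):
    """Label each feature vector as object (1) or background (0)."""
    return (sigmoid(X @ w) >= 0.5).astype(int)


def error_rate(w, X, y):
    """Fraction of retraining samples that the classifier mislabels."""
    return float(np.mean(predict(w, X) != y))


def needs_update(w, X, y, threshold=0.1):
    """Hypothetical decision rule: update when the error exceeds a threshold."""
    return error_rate(w, X, y) > threshold


def adapt_weights(w_prev, X, y, lam=0.1, lr=0.5, steps=200):
    """Fit the retraining set while staying close to the previous weights.

    Assumed stand-in for the paper's weight updating algorithm: gradient
    descent on the mean logistic loss plus (lam / 2) * ||w - w_prev||^2.
    """
    w = w_prev.copy()
    n = len(y)
    for _ in range(steps):
        grad_fit = X.T @ (sigmoid(X @ w) - y) / n   # fit current content
        grad_keep = lam * (w - w_prev)              # limit knowledge loss
        w -= lr * (grad_fit + grad_keep)
    return w
```

In use, `needs_update` would be called with the current weights and a retraining set `(X, y)` built from the initial VO estimate, and `adapt_weights` would run only when it returns true. A larger `lam` keeps the weights nearer to their previous values at the cost of fitting the new content less closely.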