A Bayesian approach to video object segmentation via merging 3-D watershed volumes

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, Vol. 15, No. 1, pp. 175–180
Main Authors: TSAI, Yu-Pao; LAI, Chih-Chuan; HUNG, Yi-Ping; SHIH, Zen-Chung
Format: Journal Article
Language: English
Published: New York, NY: Institute of Electrical and Electronics Engineers (IEEE), 01.01.2005
Summary: In this letter, we propose a Bayesian approach to video object segmentation. Our method consists of two stages. In the first stage, we partition the video data into a set of three-dimensional (3-D) watershed volumes, where each watershed volume is a series of corresponding two-dimensional (2-D) image regions. These 2-D image regions are obtained by applying marker-controlled watershed segmentation to each image frame, where the markers are extracted by first generating a set of initial markers via temporal tracking and then refining them with two shrinking schemes: iterative adaptive erosion and verification against a presimplified watershed segmentation. In the second stage, we use a Markov random field to model the spatio-temporal relationship among the 3-D watershed volumes obtained from the first stage. The desired video objects can then be extracted by merging watershed volumes having similar motion characteristics within a Bayesian framework. A major advantage of this method is that it can take into account the global motion information contained in each watershed volume. Our experiments have shown that the proposed method has potential for extracting moving objects from a video sequence.
ISSN: 1051-8215 (print)
EISSN: 1558-2205 (electronic)
DOI: 10.1109/TCSVT.2004.839973
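
Code Sketch

The summary above outlines a two-stage pipeline: per-frame marker-controlled watershed segmentation with temporally tracked, eroded markers, followed by motion-based merging of the resulting 3-D label volumes. Below is a minimal Python sketch of that general idea using scikit-image and SciPy. All function and parameter names (frame_markers, segment_volume, merge_by_motion, n_erosions, thresh) are hypothetical illustrations, not from the paper; the gradient-minima bootstrap markers and the naive centroid-motion merge stand in for the paper's adaptive-erosion marker refinement and its MRF/Bayesian merging, which are not reproduced here.

import numpy as np
from scipy import ndimage as ndi
from skimage.filters import sobel
from skimage.segmentation import watershed

def frame_markers(prev_labels, n_erosions=3):
    """Shrink each region of the previous frame's labeling to obtain
    conservative markers for the current frame (temporal tracking
    approximated here by reuse plus fixed erosion; the paper uses
    iterative adaptive erosion and verification against a
    presimplified watershed)."""
    markers = np.zeros_like(prev_labels)
    for lab in np.unique(prev_labels):
        if lab == 0:
            continue
        eroded = ndi.binary_erosion(prev_labels == lab, iterations=n_erosions)
        markers[eroded] = lab
    return markers

def segment_volume(frames):
    """Stage 1 sketch: per-frame marker-controlled watershed whose markers
    are tracked from the previous frame, so corresponding 2-D regions share
    one label across time and form 3-D watershed volumes."""
    grad0 = sobel(frames[0])
    # Bootstrap first-frame markers from low-gradient areas (an assumption;
    # any reasonable initial partition would do).
    seeds, _ = ndi.label(grad0 < np.percentile(grad0, 10))
    labels = [watershed(grad0, seeds)]
    for f in frames[1:]:
        labels.append(watershed(sobel(f), frame_markers(labels[-1])))
    return np.stack(labels)  # (T, H, W) label volume

def merge_by_motion(volume, thresh=0.5):
    """Stage 2 (greatly simplified): merge watershed volumes whose mean
    frame-to-frame centroid displacement is similar. The paper instead
    models the spatio-temporal relationship among volumes with a Markov
    random field and merges within a Bayesian framework."""
    motions = {}
    for lab in np.unique(volume):
        cents = [ndi.center_of_mass(volume[t] == lab)
                 for t in range(volume.shape[0]) if (volume[t] == lab).any()]
        motions[lab] = (np.diff(np.array(cents), axis=0).mean(axis=0)
                        if len(cents) > 1 else np.zeros(2))
    merged = {lab: lab for lab in motions}
    labs = sorted(motions)
    for i, a in enumerate(labs):
        for b in labs[i + 1:]:
            if np.linalg.norm(motions[a] - motions[b]) < thresh:
                merged[b] = merged[a]
    return np.vectorize(merged.get)(volume)

# Usage (assuming `frames` is a (T, H, W) float grayscale array):
#   objects = merge_by_motion(segment_volume(frames))

The merge step is where the paper's actual contribution lies; replacing merge_by_motion with an MRF energy over volume adjacencies, optimized in a Bayesian framework, would recover the structure the summary describes, while the sketch only shows where that step plugs into the pipeline.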