Efficient One-Shot Video Object Segmentation

Bibliographic Details
Published in 2020 7th NAFOSTED Conference on Information and Computer Science (NICS), pp. 320-325
Main Authors Hoang-Xuan, Nhat; Nguyen, E-Ro; Pham-Le, Thuy-Dung; Hoang-Nguyen, Khoi
Format Conference Proceeding
Language English
Published IEEE 26.11.2020
Summary: Video object segmentation, the task of labelling the foreground object of interest in a video, has widespread applications. We reevaluate One-Shot Video Object Segmentation (OSVOS), a simple method that adapts VGG to image segmentation using a structure similar to a Fully Convolutional Network. We propose a range of improvements that make OSVOS competitive with newer methods while preserving its simplicity. Specifically, we replace VGG with EfficientNet and adopt the U-Net architecture. We also use Focal Loss and Dice Loss to handle the imbalanced binary classification of pixels into foreground and background, and we remove the boundary snapping module. With these changes, we achieve 82.4% J&F on the DAVIS 2016 validation set, an improvement over the 80.2% of the original OSVOS. We also achieve much faster per-frame inference than OSVOS.
DOI: 10.1109/NICS51282.2020.9335847
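
The summary names Focal Loss and Dice Loss as the remedy for the foreground/background imbalance but does not give formulas or say how the two are combined. The sketch below shows the standard textbook definitions of both losses on flat lists of per-pixel foreground probabilities and 0/1 labels. It is only an illustration: the function names, the default values of `gamma`, `alpha`, and `smooth`, and the weighted sum in `combined_loss` are assumptions, not details taken from the paper.

```python
import math


def focal_loss(probs, labels, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t).

    gamma and alpha defaults are common choices, assumed here for illustration.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)           # clamp to avoid log(0)
        p_t = p if y == 1 else 1.0 - p            # probability of the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # Well-classified pixels (p_t near 1) are down-weighted by (1 - p_t)^gamma.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


def dice_loss(probs, labels, smooth=1.0):
    """Soft Dice loss: 1 - (2 * |P.Y| + smooth) / (|P| + |Y| + smooth)."""
    intersection = sum(p * y for p, y in zip(probs, labels))
    denominator = sum(probs) + sum(labels)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def combined_loss(probs, labels, dice_weight=1.0):
    """Hypothetical combination: focal loss plus weighted Dice loss (an assumption)."""
    return focal_loss(probs, labels) + dice_weight * dice_loss(probs, labels)


if __name__ == "__main__":
    # Toy mask with one foreground pixel among many background pixels.
    labels = [1, 0, 0, 0, 0, 0, 0, 0]
    good = [0.9, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
    bad = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
    print(combined_loss(good, labels) < combined_loss(bad, labels))
```

Focal loss reduces the contribution of the many easy background pixels, while Dice loss measures overlap with the foreground mask directly, so neither is dominated by the sheer number of background pixels. That is the usual motivation for pairing them in imbalanced segmentation.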