Sievenet: An Efficient Model Utilizing H.265 Codec Structure for Video Object Detection

Bibliographic Details
Published in	2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), pp. 1-5
Main Authors	Koyun, Onur Can; Toreyin, Behcet Ugur
Format	Conference Proceeding
Language	English
Published	IEEE 04.06.2023
Subjects	Codecs; Compressed Domain Video Analysis; Computational efficiency; Computational modeling; Deep learning; Feature extraction; H.265; HEVC; Object detection; Signal processing; Task analysis; Video Object Detection

Summary:	Object detection is a crucial task in video content analysis. The coding structures of the High Efficiency Video Coding (H.265, HEVC) standard are strongly correlated with the video content, which creates an opportunity to use these structures for computationally efficient video object detection. To this end, we present a video object detection method that partitions frames into macroblocks based on the H.265 structure. Blocks with spatially high-frequency content are routed through a dynamic-layer approach that analyzes them with more layers, while blocks with spatially low-frequency content pass through fewer layers, lowering the computational load. Results on the ImageNet VID dataset indicate that our approach has the potential to save significant computational resources while maintaining accurate object detection performance.
DOI:	10.1109/ICASSPW59220.2023.10193722
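
Illustrative note: the dynamic-layer idea in the summary can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation. The variance-based quadtree split stands in for the block partitioning that a real system would read from the H.265 bitstream, and the function names (`quadtree_partition`, `layers_for_block`, `relative_cost`), the thresholds, and the cost model (cost proportional to pixels times layers) are all hypothetical.

```python
import numpy as np


def quadtree_partition(frame, y, x, size, min_size, var_threshold):
    """Recursively split a square block while its pixel variance is high.

    ASSUMPTION: this is a stand-in for the codec's own partitioning.
    A real system would reuse the block structure already present in
    the H.265 bitstream instead of recomputing a split from pixels.
    """
    block = frame[y:y + size, x:x + size]
    if size <= min_size or block.var() <= var_threshold:
        return [(y, x, size)]
    half = size // 2
    blocks = []
    for dy in (0, half):
        for dx in (0, half):
            blocks += quadtree_partition(
                frame, y + dy, x + dx, half, min_size, var_threshold
            )
    return blocks


def layers_for_block(size, max_size, max_layers):
    """Assign more layers to smaller blocks (more splits, more detail)."""
    split_depth = (max_size // size).bit_length() - 1  # exact integer log2
    return min(1 + split_depth, max_layers)


def relative_cost(blocks, max_size, max_layers):
    """Cost relative to running max_layers on every block.

    ASSUMPTION: cost is proportional to (pixels in block) x (layers).
    """
    dynamic = sum(
        s * s * layers_for_block(s, max_size, max_layers) for _, _, s in blocks
    )
    full = sum(s * s * max_layers for _, _, s in blocks)
    return dynamic / full


# Toy frame: flat everywhere except a pixel-level checkerboard
# (high spatial frequency) in the top-left quadrant.
frame = np.zeros((64, 64))
frame[:32, :32] = np.indices((32, 32)).sum(axis=0) % 2

blocks = quadtree_partition(frame, 0, 0, 64, min_size=8, var_threshold=0.01)
cost = relative_cost(blocks, max_size=64, max_layers=4)
print(len(blocks), cost)
```

In this toy frame only the textured quadrant is split down to the smallest block size and sent through the full layer count, while the flat quadrants stay as large blocks with fewer layers, so the estimated cost falls below that of uniform full-depth processing.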