MVDet: multi-view multi-class object detection without ground plane assumption
| Published in | Pattern Analysis and Applications (PAA), Vol. 26, No. 3, pp. 1059–1070 |
|---|---|
| Main Authors | , , |
| Format | Journal Article |
| Language | English |
| Published | London: Springer London; Springer Nature B.V., 01.08.2023 |
Summary: Although state-of-the-art single-image object detectors have achieved great success in recent years, they still suffer from false positives in crowded scenes of real-world applications such as automatic checkout. To address the limitations of single-view detection in complex scenes, we propose MVDet, an end-to-end learnable approach that detects and re-identifies multi-class objects across images captured by multiple cameras (multi-view). Our approach is based on the premise that, given multi-view images, incorrect detections in one view can be eliminated using precise cues from other views. Unlike most existing multi-view detection algorithms, which assume that all objects belong to a single class and lie on the ground plane, our approach classifies multi-class objects without such assumptions and is thus more practical. To this end, we propose an integrated architecture for region proposal, re-identification, and classification. Additionally, we exploit the epipolar geometry constraint to devise a novel re-identification algorithm that does not rely on a ground plane assumption. Our model demonstrates competitive performance against several baselines on the challenging MessyTable dataset.
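The epipolar-constraint idea mentioned in the summary can be sketched as follows. This is a minimal illustration, not the paper's actual algorithm: the fundamental matrix `F`, the detection centers, the greedy matching, and the pixel threshold are all assumptions here. The core test is that a detection center in one view should lie close to the epipolar line induced by its matching detection in another view.

```python
import numpy as np

def epipolar_distance(x1, x2, F):
    """Distance (in pixels) from point x2 in view 2 to the epipolar line of
    point x1 from view 1, where l2 = F @ [x1, 1] is that line (a, b, c)."""
    x1_h = np.array([x1[0], x1[1], 1.0])
    x2_h = np.array([x2[0], x2[1], 1.0])
    l2 = F @ x1_h                                # line: a*u + b*v + c = 0
    return abs(l2 @ x2_h) / np.hypot(l2[0], l2[1])

def match_detections(dets1, dets2, F, thresh=3.0):
    """Greedily pair detection centers across two views whose epipolar
    distance falls below `thresh` pixels (hypothetical matching criterion)."""
    pairs, used = [], set()
    for i, p in enumerate(dets1):
        best, best_d = None, thresh
        for j, q in enumerate(dets2):
            if j in used:
                continue
            d = epipolar_distance(p, q, F)
            if d < best_d:
                best, best_d = j, d
        if best is not None:
            pairs.append((i, best))
            used.add(best)
    return pairs
```

For a rectified horizontal stereo pair, `F = [[0,0,0],[0,0,-1],[0,1,0]]` makes the epipolar line of `(u, v)` the horizontal line at the same `v`, so the distance reduces to the difference in vertical coordinates; a real system would estimate `F` from calibration.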
ISSN: 1433-7541; 1433-755X
DOI: 10.1007/s10044-023-01168-6