Generation and evaluation of synthetic models for training people detectors
| Published in | 2017 International Carnahan Conference on Security Technology (ICCST), pp. 1-6 |
|---|---|
| Main Authors | , , , |
| Format | Conference Proceeding |
| Language | English |
| Published | IEEE, 01.10.2017 |
| Summary | There is large demand in the area of video surveillance, especially for people detection, which has led to a large increase in research and resources in this field. As training images and annotations are not always available, it is important to consider the cost involved in creating the detector models. For example, for elderly people detection, the detector must take into account different positions such as standing, sitting, or in a wheelchair. The main objective of this work is therefore to reduce the amount of resources needed to generate the detection model, avoiding the cost of recording new sequences and generating the associated annotations for detector training. To achieve this, three synthetic image datasets have been created to train three different models, evaluating which model is optimal and finally analyzing its feasibility by comparing it with a people detector for wheelchair users trained with real images. Other people detection scenarios to which this technique could be applied include, for example, people riding horses or motorbikes, or people pushing supermarket carts. The synthetic datasets have been generated by combining images of standing people with wheelchair images, combining image patches, and segmenting sections of people (trunk, legs, etc.) to add them to the wheelchair image. As expected, the obtained results show a reduction in performance (between 21% and 25%) in exchange for the enormous saving in human annotation and in resources needed to record real images. |
| ISSN | 2153-0742 |
| DOI | 10.1109/CCST.2017.8167818 |
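The summary describes building synthetic training samples by segmenting regions of standing people and pasting them onto wheelchair images. As a rough illustration of that compositing step only, here is a minimal Python sketch using Pillow; the file names, offset, and scale factor are hypothetical and not taken from the paper.

```python
# Minimal sketch of the compositing idea described in the summary:
# paste a segmented person cut-out onto a wheelchair photo to
# synthesize one training image. All paths and parameters below
# are illustrative assumptions, not values from the paper.
from PIL import Image

def composite_person_on_wheelchair(person_path, wheelchair_path, out_path,
                                   offset=(40, 10), scale=0.9):
    """Paste a person cut-out (RGBA with transparent background)
    onto a wheelchair image, producing one synthetic sample."""
    wheelchair = Image.open(wheelchair_path).convert("RGBA")
    person = Image.open(person_path).convert("RGBA")

    # Resize the person patch so it roughly matches the wheelchair scale.
    w, h = person.size
    person = person.resize((int(w * scale), int(h * scale)))

    # The cut-out's alpha channel acts as the paste mask, so only the
    # segmented person pixels (trunk, legs, etc.) are copied over.
    wheelchair.paste(person, offset, mask=person)
    wheelchair.convert("RGB").save(out_path)

if __name__ == "__main__":
    # Hypothetical input/output file names.
    composite_person_on_wheelchair("person_cutout.png",
                                   "wheelchair.jpg",
                                   "synthetic_sample.jpg")
```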