Design and Implementation of a Pipelined Datapath for High-Speed Face Detection Using FPGA

Bibliographic Details
Published in	IEEE Transactions on Industrial Informatics, Vol. 8, no. 1, pp. 158-167
Main Authors	Jin, Seunghun; Kim, Dongkyun; Nguyen, Thuy Tuong; Kim, Daijin; Kim, Munsang; Jeon, Jae Wook
Format	Journal Article
Language	English
Published	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.02.2012
Subjects	Algorithms; Cascades; Classifiers; Computer architecture; Computer vision; Delay; Face; Face detection; Field programmable gate arrays; field-programmable gate arrays (FPGAs); Hardware; integrated circuit design; Pyramids; Real time systems; Streaming media; Studies
Summary:	This paper presents the design and implementation of a pipelined datapath for real-time face detection using cascades of boosted classifiers. We propose three methods, symmetric image downscaling, classifier sharing, and cascade merging, to achieve the desired processing speed and area efficiency. First, an image pyramid with 16 levels is generated from the input image to detect faces at different scales simultaneously. The downscaled images are then transferred to the first stage of the cascade, which is shared between the corresponding image pairs based on the pixel validity of the symmetric image pyramid. The last method exploits the different hit ratios of the cascade stages: we use a tree-structured cascade of classifiers, since most of the nonface elements are eliminated during the early stages of the classifier. Results from a synthesis tool confirm that the proposed design reduces resource utilization by one-eighth without accuracy loss, compared to the fully parallelized implementation of the same algorithm. We implemented the proposed hardware architecture on a Xilinx Virtex-5 LX330 FPGA. The indicative throughput is 307 frames/s for standard VGA (640 × 480) images at an operating frequency of 125.59 MHz, irrespective of the number of faces in the scene. This fully pipelined datapath for tree-structured cascade classifiers ensures that face detection results are generated at each clock cycle after the initial pipeline delay.
ISSN:	1551-3203, 1941-0050
DOI:	10.1109/TII.2011.2173943