Heart Rate Estimation Based on Facial Image Sequence

Bibliographic Details
Published in: 2020 5th International Conference on Green Technology and Sustainable Development (GTSD), pp. 449 - 453
Main Authors: Le, Dao Q.; Lie, Wen-Nung; Nhu, Quynh Nguyen Quang; Nguyen, Thu T.A.
Format: Conference Proceeding
Language: English
Published: IEEE, 27.11.2020

Summary: Remotely acquiring the photoplethysmogram (PPG) signal to estimate the blood volume pulse or the heart rate of a human subject has recently attracted increasing research attention. In contrast to longstanding contact methods (e.g., the electrocardiogram (ECG)), remote PPG methods tackle the same task with greater convenience and fewer physical constraints. Studies have shown that the PPG signal manifests as subtle changes of color intensity on some body parts, such as the face and wrist. Leveraging this, we propose a new method that uses an Intel RealSense camera to capture RGB facial videos of human subjects and estimate the heart rate over a short time segment. By combining a series of image and signal processing techniques, e.g., face detection, facial segmentation, Independent Component Analysis (ICA), filtering, the Fast Fourier Transform (FFT), and a newly proposed Automatic Component Selection (ACS) algorithm, we can accurately estimate the heart rate from facial video. Our method works well under slight head motion, and the required video length is greatly reduced to about 10 seconds (traditionally 30~60 seconds). In experiments, we achieved a root mean square error of 3.41 beats per minute (bpm) on 10-second RGB videos, demonstrating the robustness provided by our newly defined region of interest (ROI) and the proposed ACS algorithm. In future work, we aim to handle shorter video clips (e.g., less than 5 seconds) and tolerate larger head movements, so that the system can be applied in realistic settings.
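The core idea in the summary (a periodic color-intensity trace whose dominant frequency in the plausible heart-rate band gives the pulse) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single pre-extracted color trace, omits face detection, segmentation, ICA, and the ACS step, and replaces band-pass filtering with a restricted frequency search over a plain DFT. All names (`estimate_hr`, the synthetic `trace`) are hypothetical.

```python
import math

def estimate_hr(signal, fps, lo_hz=0.7, hi_hz=4.0):
    """Estimate heart rate (bpm) from a 1-D color-intensity trace.

    Searches the DFT magnitude spectrum only inside the plausible
    heart-rate band [lo_hz, hi_hz] (0.7-4 Hz, i.e. 42-240 bpm) and
    returns the frequency of the strongest in-band peak, scaled to bpm.
    """
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]  # remove the DC component
    best_freq, best_mag = 0.0, -1.0
    for k in range(1, n // 2):
        f = k * fps / n  # frequency of DFT bin k, in Hz
        if f < lo_hz or f > hi_hz:
            continue  # crude stand-in for band-pass filtering
        re = sum(x[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = sum(-x[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        mag = re * re + im * im
        if mag > best_mag:
            best_mag, best_freq = mag, f
    return best_freq * 60.0  # Hz -> beats per minute

# Synthetic 10-second trace at 30 fps: a 72 bpm (1.2 Hz) pulse plus a
# slow 0.2 Hz drift standing in for illumination/motion artifacts.
fps, seconds, true_bpm = 30, 10, 72
trace = [
    0.5 * math.sin(2 * math.pi * (true_bpm / 60.0) * t / fps)
    + 0.1 * math.sin(2 * math.pi * 0.2 * t / fps)
    for t in range(fps * seconds)
]
print(round(estimate_hr(trace, fps)))  # prints 72
```

Note that with a 10-second window at 30 fps the DFT bin spacing is 0.1 Hz, i.e. 6 bpm; this coarse resolution is one reason short windows are challenging, and interpolating around the peak (or the paper's ACS selection over ICA components) would be needed for accuracy closer to the reported 3.41 bpm RMSE.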
DOI:10.1109/GTSD50082.2020.9303142