Privacy-Safe Action Recognition via Cross-Modality Distillation

Bibliographic Details
Published in: IEEE Access, Vol. 12, pp. 125955–125965
Main Authors: Kim, Yuhyun; Jung, Jinwook; Noh, Hyeoncheol; Ahn, Byungtae; Kwon, Junghye; Choi, Dong-Geol
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2024
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2024.3431227

Summary: Human action recognition systems enhance public safety by detecting abnormal behavior autonomously. The RGB sensors commonly used in such systems capture personal information about subjects and therefore run the risk of privacy leakage. Privacy-safe alternatives, such as depth or thermal sensors, exhibit poorer performance because they lack the semantic context provided by RGB sensors, and far less data is available for them than for RGB sensors. To address these problems, we explore effective cross-modality distillation methods in this paper, aiming to distill the knowledge of context-rich, large-scale pre-trained RGB-based models into privacy-safe depth-based models. Based on extensive experiments on multiple architectures and benchmark datasets, we propose an effective method for training privacy-safe depth-based action recognition models via cross-modality distillation: cross-modality mixing distillation. This approach improves both performance and efficiency by enabling interaction between the depth and RGB modalities through a linear combination of their features. Using the proposed cross-modality mixing distillation, we achieve state-of-the-art accuracy on two depth-based action recognition benchmarks. The code and the pre-trained models will be made available upon publication.
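
The mechanism named in the abstract, mixing RGB and depth features through a linear combination before distilling, can be sketched as follows. This is an illustrative sketch only, assuming a PyTorch setup; the mixing ratio, the feature dimensionality, and the use of an MSE feature-matching loss are assumptions for illustration, not the authors' actual implementation.

```python
# Minimal sketch of cross-modality mixing distillation: the depth student's
# features are pulled toward a linear combination (mix) of the frozen RGB
# teacher's features and its own features. mix_ratio, the feature dimension,
# and the MSE distillation loss are assumptions, not the paper's exact recipe.
import torch
import torch.nn.functional as F


def mixing_distillation_loss(f_rgb: torch.Tensor,
                             f_depth: torch.Tensor,
                             mix_ratio: float = 0.5) -> torch.Tensor:
    # Linear combination of the two modalities' features (the "mixing" step).
    mixed = mix_ratio * f_rgb + (1.0 - mix_ratio) * f_depth
    # Feature-matching distillation term: the depth features regress onto the
    # mixed target; detach() stops gradients from flowing into the target.
    return F.mse_loss(f_depth, mixed.detach())


# Usage sketch with random stand-ins for teacher and student features.
f_rgb = torch.randn(8, 512)                        # frozen RGB teacher features
f_depth = torch.randn(8, 512, requires_grad=True)  # depth student features
loss = mixing_distillation_loss(f_rgb, f_depth)
loss.backward()
```

Under these assumptions the mixed target keeps part of the student's own representation, so the depth model is nudged toward the RGB teacher's context-rich features rather than forced to copy them outright; the actual training objective and mixing schedule in the paper may differ.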