MST-KD: Multiple Specialized Teachers Knowledge Distillation for Fair Face Recognition
| Main Authors | |
| --- | --- |
| Format | Journal Article |
| Language | English |
| Published | 29.08.2024 |
| Subjects | |
| Online Access | Get full text |
| DOI | 10.48550/arxiv.2408.16563 |
Summary: As in school, one teacher covering all subjects is insufficient to distill equally robust information to a student; hence, each subject is taught by a highly specialized teacher. Following a similar philosophy, we propose a multiple-specialized-teacher framework to distill knowledge to a student network. In our approach, directed at face recognition use cases, we train each of four teachers on one specific ethnicity, leading to four highly specialized and biased teachers. Our strategy learns a projection of these four teachers into a common space and distills that information to a student network. Our results show increased performance and reduced bias across all our experiments. In addition, we show that having biased/specialized teachers is crucial: our approach achieves better results than when knowledge is distilled from four teachers trained on balanced datasets. Our approach represents a step toward understanding the importance of ethnicity-specific features.
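To make the summary concrete, below is a minimal sketch in PyTorch of how several frozen, specialized teachers could be projected into a common embedding space and used as a distillation target for a student. The per-teacher linear adapters, the averaging fusion, and the cosine-based loss are illustrative assumptions for exposition, not the exact design described in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiTeacherTarget(nn.Module):
    """Illustrative multi-teacher distillation target (assumptions, not the paper's exact method).

    Each frozen, ethnicity-specialized teacher produces its own embedding; a
    learned per-teacher linear adapter maps it into a shared space, and the
    fused (here: averaged) projection serves as the student's target.
    """

    def __init__(self, teachers, teacher_dim=512, common_dim=512):
        super().__init__()
        self.teachers = nn.ModuleList(teachers)
        for t in self.teachers:
            t.requires_grad_(False)  # teachers stay fixed during distillation
        self.adapters = nn.ModuleList(
            [nn.Linear(teacher_dim, common_dim) for _ in teachers]
        )

    def forward(self, images):
        # Teacher embeddings are computed without gradients ...
        with torch.no_grad():
            teacher_feats = [t(images) for t in self.teachers]
        # ... then projected into the common space by trainable adapters and averaged.
        projected = [adapter(f) for adapter, f in zip(self.adapters, teacher_feats)]
        return torch.stack(projected, dim=0).mean(dim=0)


def distillation_loss(student_emb, target_emb):
    """Cosine matching in the common space (loss choice is an assumption)."""
    return 1.0 - F.cosine_similarity(
        F.normalize(student_emb, dim=-1),
        F.normalize(target_emb, dim=-1),
    ).mean()
```

In a training loop, the student backbone's embedding for a batch of face images would be compared against `MultiTeacherTarget(images)` via `distillation_loss`, optimizing the student and the adapters jointly while the four teachers remain frozen.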