MST-KD: Multiple Specialized Teachers Knowledge Distillation for Fair Face Recognition
| Main Authors | |
| --- | --- |
| Format | Journal Article |
| Language | English |
| Published | 29.08.2024 |
| Subjects | |
| Online Access | Get full text |
| DOI | 10.48550/arxiv.2408.16563 |
Summary: As in school, one teacher covering all subjects is insufficient to distill equally robust information to a student; hence, each subject is taught by a highly specialized teacher. Following a similar philosophy, we propose a multiple-specialized-teacher framework to distill knowledge to a student network. In our approach, directed at face recognition use cases, we train each of four teachers on one specific ethnicity, leading to four highly specialized and biased teachers. Our strategy learns a projection of these four teachers into a common space and distills that information to a student network. Our results show increased performance and reduced bias across all our experiments. In addition, we show that having biased/specialized teachers is crucial: our approach achieves better results than when knowledge is distilled from four teachers trained on balanced datasets. Our approach represents a step toward understanding the importance of ethnicity-specific features.
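To make the summary concrete, below is a minimal sketch in PyTorch of how several frozen, specialized teachers could be projected into a common embedding space and used as a distillation target for a student. The per-teacher linear adapters, the averaging fusion, and the cosine-based loss are illustrative assumptions for exposition, not the exact design described in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiTeacherTarget(nn.Module):
    """Illustrative multi-teacher distillation target (assumptions, not the paper's exact method).

    Each frozen, ethnicity-specialized teacher produces its own embedding; a
    learned per-teacher linear adapter maps it into a shared space, and the
    fused (here: averaged) projection serves as the student's target.
    """

    def __init__(self, teachers, teacher_dim=512, common_dim=512):
        super().__init__()
        self.teachers = nn.ModuleList(teachers)
        for t in self.teachers:
            t.requires_grad_(False)  # teachers stay fixed during distillation
        self.adapters = nn.ModuleList(
            [nn.Linear(teacher_dim, common_dim) for _ in teachers]
        )

    def forward(self, images):
        # Teacher embeddings are computed without gradients ...
        with torch.no_grad():
            teacher_feats = [t(images) for t in self.teachers]
        # ... then projected into the common space by trainable adapters and averaged.
        projected = [adapter(f) for adapter, f in zip(self.adapters, teacher_feats)]
        return torch.stack(projected, dim=0).mean(dim=0)


def distillation_loss(student_emb, target_emb):
    """Cosine matching in the common space (loss choice is an assumption)."""
    return 1.0 - F.cosine_similarity(
        F.normalize(student_emb, dim=-1),
        F.normalize(target_emb, dim=-1),
    ).mean()
```

In a training loop, the student backbone's embedding for a batch of face images would be compared against `MultiTeacherTarget(images)` via `distillation_loss`, optimizing the student and the adapters jointly while the four teachers remain frozen.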