Achieving User-Side Fairness in Contextual Bandits
Format | Journal Article
---|---
Language | English
Published | 22.10.2020
Summary: Personalized recommendation based on multi-armed bandit (MAB) algorithms has been shown to achieve high utility and efficiency, as it can dynamically adapt the recommendation strategy based on feedback. However, unfairness can arise in personalized recommendation. In this paper, we study how to achieve user-side fairness in personalized recommendation. We formulate fair personalized recommendation as a modified contextual bandit and focus on achieving fairness for the individual to whom an item is recommended, as opposed to fairness over the items being recommended. We introduce and define a metric that captures fairness in terms of the rewards received by both the privileged and the protected groups. We develop a fair contextual bandit algorithm, Fair-LinUCB, which improves upon the traditional LinUCB algorithm to achieve group-level fairness of users. Our algorithm detects and monitors unfairness while it learns to recommend personalized videos to students with high efficiency. We provide a theoretical regret analysis and show that our algorithm has a slightly higher regret bound than LinUCB. We conduct extensive experimental evaluations comparing the performance of our fair contextual bandit to that of LinUCB, and show that our approach achieves group-level fairness while maintaining high utility.
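The abstract describes the fairness metric only at a high level. A minimal sketch of one plausible reading, assuming the metric compares the average reward observed by the privileged group against that of the protected group (the paper's exact definition may differ), could look like:

```python
import numpy as np

def group_reward_gap(rewards, groups, privileged=1, protected=0):
    """Difference in mean observed reward between the privileged and
    the protected user groups; a value near zero indicates group-level
    (user-side) fairness under this reading of the metric."""
    rewards = np.asarray(rewards, dtype=float)
    groups = np.asarray(groups)
    return rewards[groups == privileged].mean() - rewards[groups == protected].mean()
```

Under this reading, a recommender is user-side fair at the group level when the gap stays close to zero as rewards accumulate.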
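For reference, Fair-LinUCB builds on the standard disjoint LinUCB algorithm (Li et al., 2010). A minimal sketch of that baseline is below; the fairness adjustment to the arm-selection rule is the paper's contribution and is only marked as a comment, not reproduced:

```python
import numpy as np

class LinUCB:
    """Disjoint LinUCB: one ridge-regression model per arm, with an
    upper-confidence exploration bonus."""

    def __init__(self, n_arms, dim, alpha=1.0):
        self.alpha = alpha                                # exploration strength
        self.A = [np.eye(dim) for _ in range(n_arms)]     # per-arm Gram matrix (ridge prior)
        self.b = [np.zeros(dim) for _ in range(n_arms)]   # per-arm reward-weighted context sum

    def select(self, x):
        """Return the arm with the highest UCB score for context x.
        Fair-LinUCB would additionally adjust these scores based on the
        observed group-level reward discrepancy (not shown here)."""
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b                             # ridge estimate of arm parameters
            scores.append(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))
        return int(np.argmax(scores))

    def update(self, arm, x, r):
        """Rank-one update of the chosen arm's statistics."""
        self.A[arm] += np.outer(x, x)
        self.b[arm] += r * x
```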
DOI: 10.48550/arxiv.2010.12102