A Differentially Private Approach for Budgeted Combinatorial Multi-Armed Bandits
Published in | IEEE transactions on dependable and secure computing Vol. 22; no. 1; pp. 424 - 439 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | Washington IEEE 01.01.2025 IEEE Computer Society |
Summary: | As a fundamental tool for sequential decision-making, the Combinatorial Multi-Armed Bandits (CMAB) model has been extensively analyzed and applied in various online applications. However, privacy concerns in budgeted CMAB have rarely been investigated, and few bandit algorithms adequately address the privacy-preserving budgeted CMAB setting. Motivated by this, we study this setting using differential privacy as the formal measure of privacy. In this setting, playing an arm yields both a random reward and a random cost, and these values are kept private. In addition, multiple arms can be played in each round. The objective of the decision-maker is to minimize regret subject to a budget constraint on the cumulative cost of all played arms. We present a bandit policy that balances exploration and exploitation and preserves the privacy of both rewards and costs under budgeted CMAB settings. This policy is proven to be differentially private and achieves an upper bound on regret. Furthermore, to incentivize truthful cost reporting under the differentially private bandit policy, we introduce the concept of truthfulness and incorporate a payment mechanism that is proven to be $\sigma$-truthful. Numerical simulations based on multiple real-world datasets validate the theoretical findings and demonstrate the effectiveness of our policy compared to state-of-the-art policies. |
---|---|
ISSN: | 1545-5971 1941-0018 |
DOI: | 10.1109/TDSC.2024.3401836 |