Optimizing Long-Term Efficiency and Fairness in Ride-Hailing Under Budget Constraint via Joint Order Dispatching and Driver Repositioning
Published in: IEEE Transactions on Knowledge and Data Engineering, Vol. 36, No. 7, pp. 3348-3362
Format: Journal Article
Language: English
Published: New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.07.2024
Summary: Ride-hailing platforms (e.g., Uber and Didi Chuxing) have become increasingly popular in recent years. Efficiency has always been an important metric for such platforms, but focusing on efficiency alone inevitably ignores the fairness of driver incomes, which can impair the sustainability of ride-hailing systems. Order dispatching and driver repositioning play a key role in optimizing these two essential objectives, as they affect not only drivers' immediate order-serving outcomes but also their future ones. In practice, the platform offers drivers monetary incentives for completing repositioning tasks and sets a budget for this repositioning cost. In this paper, we therefore exploit joint order dispatching and driver repositioning to optimize both long-term efficiency and fairness in ride-hailing under a budget constraint. To this end, we propose JDRCL, a novel multi-agent reinforcement learning framework that integrates a group-based action representation to cope with the variable action space and a primal-dual iterative training algorithm to learn a constraint-satisfying policy that maximizes both the worst-case and the overall incomes of drivers. Furthermore, we prove the asymptotic convergence rate of our training algorithm. Extensive experiments on three real-world ride-hailing order datasets show that JDRCL outperforms state-of-the-art baselines on both efficiency and fairness.
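The abstract describes a primal-dual iterative scheme for learning a policy under a budget constraint. As a rough illustration of that general idea (not the paper's actual algorithm), the sketch below shows one Lagrangian primal-dual step for maximizing an objective J(theta) subject to an expected cost C(theta) staying within a budget. All names here (theta, lam, the gradient estimates, the learning rates) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def primal_dual_step(theta, lam, grad_objective, grad_cost, avg_cost,
                     budget, lr_theta=1e-3, lr_lam=1e-2):
    """One illustrative primal-dual iteration for
        max_theta J(theta)  subject to  C(theta) <= budget,
    using the Lagrangian L(theta, lam) = J(theta) - lam * (C(theta) - budget).
    """
    # Primal step: gradient ascent on the Lagrangian in the policy
    # parameters theta, trading off objective gain against weighted cost.
    theta = theta + lr_theta * (grad_objective - lam * grad_cost)
    # Dual step: move the multiplier toward constraint satisfaction and
    # project onto lam >= 0; lam grows while the repositioning spend
    # exceeds the budget and decays toward zero once it fits within it.
    lam = max(0.0, lam + lr_lam * (avg_cost - budget))
    return theta, lam

# Toy usage with made-up gradient and cost estimates, for shape only.
theta, lam = np.zeros(4), 0.0
g_obj = np.ones(4)         # stand-in for an estimate of grad J(theta)
g_cost = 0.5 * np.ones(4)  # stand-in for an estimate of grad C(theta)
theta, lam = primal_dual_step(theta, lam, g_obj, g_cost,
                              avg_cost=1.2, budget=1.0)
```

In the paper's setting, J would combine the overall and the worst-case driver incomes and C would be the expected repositioning payout; JDRCL's actual training algorithm, for which the authors prove an asymptotic convergence rate, operates over a multi-agent policy with a group-based action representation rather than this single-parameter form.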
ISSN: 1041-4347, 1558-2191
DOI: 10.1109/TKDE.2023.3348491