Markov decision processes in practice

This book presents classical Markov decision processes (MDPs) for real-life applications and optimization. MDPs allow users to develop and formally support approximate and simple decision rules, and this book showcases state-of-the-art applications in which MDPs were key to the solution approach. The b...


Bibliographic Details
Other Authors: Boucherie, R. J., 1964- (Editor); Dijk, N. M. van (Editor)
Format: Electronic eBook
Language: English
Published: Cham, Switzerland : Springer, [2017]
Series: International series in operations research & management science ; v. 248
Subjects:
Online Access: Full text

Table of Contents:
  • Foreword; Preface; Part I: General Theory; Part II: Healthcare; Part III: Transportation; Part IV: Production; Part V: Communications; Part VI: Financial Modeling; Summarizing; Acknowledgments; Contents; List of Contributors
  • Part I General Theory
  • 1 One-Step Improvement Ideas and Computational Aspects; 1.1 Introduction; 1.2 The Average-Cost Markov Decision Model; 1.2.1 The Concept of Relative Values; 1.2.2 The Policy-Improvement Step; 1.2.3 The Odoni Bounds for Value Iteration; 1.3 Tailor-Made Policy-Iteration Algorithm; 1.3.1 A Queueing Control Problem with a Variable Service Rate; 1.4 One-Step Policy Improvement for Suboptimal Policies; 1.4.1 Dynamic Routing of Customers to Parallel Queues; 1.5 One-Stage-Look-Ahead Rule in Optimal Stopping; 1.5.1 Devil's Penny Problem; 1.5.2 A Game of Dropping Balls into Bins; 1.5.3 The Chow-Robbins Game; References
  • 2 Value Function Approximation in Complex Queueing Systems; 2.1 Introduction; 2.2 Difference Calculus for Markovian Birth-Death Systems; 2.3 Value Functions for Queueing Systems; 2.3.1 The M/Cox(r)/1 Queue; 2.3.2 Special Cases of the M/Cox(r)/1 Queue; 2.3.3 The M/M/s Queue; 2.3.4 The Blocking Costs in an M/M/s/s Queue; 2.3.5 Priority Queues; 2.4 Application: Routing to Parallel Queues; 2.5 Application: Dynamic Routing in Multiskill Call Centers; 2.6 Application: A Controlled Polling System; References
  • 3 Approximate Dynamic Programming by Practical Examples; 3.1 Introduction; 3.2 The Nomadic Trucker Example; 3.2.1 Problem Introduction; 3.2.2 MDP Model; 3.2.2.1 State; 3.2.2.2 Decision; 3.2.2.3 Costs; 3.2.2.4 New Information and Transition Function; 3.2.2.5 Solution; 3.2.3 Approximate Dynamic Programming; 3.2.3.1 Post-decision State; 3.2.3.2 Forward Dynamic Programming; 3.2.3.3 Value Function Approximation; 3.3 A Freight Consolidation Example; 3.3.1 Problem Introduction; 3.3.2 MDP Model; 3.3.2.1 State; 3.3.2.2 Decision; 3.3.2.3 Costs; 3.3.2.4 New Information and Transition Function; 3.3.2.5 Solution; 3.3.3 Approximate Dynamic Programming; 3.3.3.1 Post-decision State; 3.3.3.2 Forward Dynamic Programming; 3.3.3.3 Value Function Approximation; 3.4 A Healthcare Example; 3.4.1 Problem Introduction; 3.4.2 MDP Model; 3.4.2.1 State; 3.4.2.2 Decision; 3.4.2.3 Costs; 3.4.2.4 New Information and Transition Function; 3.4.2.5 Solution; 3.4.3 Approximate Dynamic Programming; 3.4.3.1 Post-decision State; 3.4.3.2 Forward Dynamic Programming; 3.4.3.3 Value Function Approximation; 3.5 What's More; 3.5.1 Policies; 3.5.2 Value Function Approximations; 3.5.3 Exploration vs Exploitation; Appendix; References
  • 4 Server Optimization of Infinite Queueing Systems; 4.1 Introduction; 4.2 Basic Definition and Notations; 4.3 Motivating Examples; 4.3.1 Optimization of a Queueing System with Two Different Servers; 4.3.2 Optimization of a Computational System with Power Saving Mode; 4.3.3 Structural Properties of These Motivating Examples; 4.4 Theoretical Background; 4.4.1 Subset Measures in Markov Chains