Bisimulation Metrics are Optimal Transport Distances, and Can be Computed Efficiently
Main Authors | , , , , |
---|---|
Format | Journal Article |
Language | English |
Published | 06.06.2024 |
Subjects | |
Summary: | We propose a new framework for formulating optimal transport distances
between Markov chains. Previously known formulations studied couplings between
the entire joint distribution induced by the chains, and derived solutions via
a reduction to dynamic programming (DP) in an appropriately defined Markov
decision process. This formulation has, however, not led to particularly
efficient algorithms so far, since computing the associated DP operators
requires fully solving a static optimal transport problem, and these operators
need to be applied numerous times during the overall optimization process. In
this work, we develop an alternative perspective by considering couplings
between a flattened version of the joint distributions that we call discounted
occupancy couplings, and show that calculating optimal transport distances in
the full space of joint distributions can be equivalently formulated as solving
a linear program (LP) in this reduced space. This LP formulation allows us to
port several algorithmic ideas from other areas of optimal transport theory. In
particular, our formulation makes it possible to introduce an appropriate
notion of entropy regularization into the optimization problem, which in turn
enables us to directly calculate optimal transport distances via a
Sinkhorn-like method we call Sinkhorn Value Iteration (SVI). We show both
theoretically and empirically that this method converges quickly to an optimal
coupling, essentially at the same computational cost as running vanilla
Sinkhorn in each pair of states. Along the way, we point out that our optimal
transport distance exactly matches the common notion of bisimulation metrics
between Markov chains, and thus our results also apply to computing such
metrics. In fact, our algorithm turns out to be significantly more efficient
than the best known methods developed so far for this purpose. |
---|---|
DOI: | 10.48550/arxiv.2406.04056 |
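
The summary measures the cost of the proposed method against "vanilla Sinkhorn" on a static optimal transport problem. For orientation, here is a minimal sketch of that baseline only: the standard Sinkhorn iteration for entropy-regularized transport between two discrete distributions. It is not the paper's Sinkhorn Value Iteration, and the function name `sinkhorn` and the parameters `eta` and `n_iters` are illustrative assumptions rather than names taken from the paper.

```python
import math


def sinkhorn(mu, nu, cost, eta=0.5, n_iters=500):
    """Vanilla Sinkhorn for a static, entropy-regularized OT problem.

    mu, nu : lists of probabilities (the two marginals).
    cost   : nested list, cost[i][j] is the cost of moving mass from i to j.
    eta    : regularization strength (hypothetical name); smaller values
             approximate unregularized OT more closely but converge slower.
    Returns (coupling, transport_cost).
    """
    n, m = len(mu), len(nu)
    # Gibbs kernel K[i][j] = exp(-cost[i][j] / eta).
    K = [[math.exp(-cost[i][j] / eta) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale rows to match mu, then columns to match nu.
        u = [mu[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [nu[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # The coupling is the kernel rescaled by the two scaling vectors.
    P = [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]
    total = sum(P[i][j] * cost[i][j] for i in range(n) for j in range(m))
    return P, total


if __name__ == "__main__":
    # Two-point example: moving mass between the two points costs 1.
    coupling, value = sinkhorn([0.7, 0.3], [0.4, 0.6], [[0.0, 1.0], [1.0, 0.0]])
    print(coupling, value)
```

In the dynamic-programming formulations described in the summary, a static problem of this kind has to be solved for every pair of states each time the DP operator is applied, which is the cost the summary says earlier approaches could not avoid.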