The geometry of efficient codes: How rate-distortion trade-offs distort the latent representations of generative models

Bibliographic Details
Published in: PLoS Computational Biology, Vol. 21, No. 5, p. e1012952
Main Authors: D’Amato, Leo; Lancia, Gian Luca; Pezzulo, Giovanni
Format: Journal Article
Language: English
Published: United States: Public Library of Science (PLoS), 12.05.2025
Summary: Living organisms rely on internal models of the world to act adaptively. Because of resource limitations, these models cannot encode every detail and hence need to compress information. From a cognitive standpoint, information compression can manifest as a distortion of latent representations, resulting in the emergence of representations that may not accurately reflect the external world or its geometry. Rate-distortion theory formalizes the optimal way to compress information while minimizing such distortions, by considering factors such as capacity limitations, the frequency, and the utility of stimuli. However, while this theory explains why the above factors distort latent representations, it does not specify which specific distortions they produce. To address this question, here we investigate how rate-distortion trade-offs shape the latent representations of images in generative models, specifically Beta Variational Autoencoders (β-VAEs), under varying constraints of model capacity, data distributions, and task objectives. By systematically exploring these factors, we identify three primary distortions in latent representations: prototypization, specialization, and orthogonalization. These distortions emerge as signatures of information compression, reflecting the model’s adaptation to capacity limitations, data imbalances, and task demands. Additionally, our findings demonstrate that these distortions can coexist, giving rise to a rich landscape of latent spaces, whose geometry can differ significantly across generative models subject to different constraints. Our findings help explain how the normative constraints of rate-distortion theory shape the geometry of latent representations in generative models of artificial systems and living organisms.
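The summary describes β-VAEs as the testbed in which the rate-distortion trade-off is varied. A minimal sketch of the β-VAE objective is given below; it is not the authors' implementation, and the architecture sizes and Bernoulli pixel likelihood are illustrative assumptions. It shows how β weights the KL "rate" term against the reconstruction "distortion" term, which is the capacity knob the abstract refers to.

```python
# Minimal beta-VAE sketch (illustrative, not the paper's code): beta scales the
# KL "rate" term against the reconstruction "distortion" term, so larger beta
# enforces stronger compression of the latent representation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class BetaVAE(nn.Module):
    def __init__(self, input_dim=784, latent_dim=10, beta=4.0):
        super().__init__()
        self.beta = beta
        self.encoder = nn.Sequential(nn.Linear(input_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent_dim)       # posterior mean
        self.logvar = nn.Linear(256, latent_dim)   # posterior log-variance
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, input_dim),             # outputs pixel logits
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: sample z = mu + sigma * eps
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.decoder(z), mu, logvar

    def loss(self, x):
        recon_logits, mu, logvar = self(x)
        # Distortion: reconstruction error under a Bernoulli likelihood.
        distortion = F.binary_cross_entropy_with_logits(recon_logits, x, reduction="sum")
        # Rate: KL divergence between the Gaussian posterior and a unit Gaussian prior.
        rate = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return distortion + self.beta * rate
```

In this framing, sweeping β corresponds to varying model capacity, while changing the training data balance or adding a task-specific loss term corresponds to the data-distribution and task-objective manipulations the summary describes.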
The authors have declared that no competing interests exist.
ISSN: 1553-734X, 1553-7358
DOI: 10.1371/journal.pcbi.1012952