Information rate of meaningful communication
| Published in | Proceedings of the National Academy of Sciences - PNAS, Vol. 122; no. 25; p. e2502353122 |
| Main Authors | |
| Format | Journal Article |
| Language | English |
| Published | United States: National Academy of Sciences, 24.06.2025 |
| Summary | In Shannon’s seminal paper, the entropy of printed English, treated as a stationary stochastic process, was estimated to be roughly 1 bit per character. However, considered as a means of communication, language differs considerably from its printed form: i) the units of information are not characters or even words but clauses, i.e., shortest meaningful parts of speech; and ii) what is transmitted is principally the meaning of what is being said or written, while the precise phrasing that was used to communicate the meaning is typically ignored. In this study, we show that one can leverage recently developed large language models to quantify information communicated in meaningful narratives in terms of bits of meaning per clause. |
| Bibliography | Edited by Jeffrey Ullman, Stanford University, Stanford, CA; received January 31, 2025; accepted May 19, 2025 |
| ISSN | 0027-8424; 1091-6490 |
| DOI | 10.1073/pnas.2502353122 |
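The summary contrasts Shannon-style character-level entropy with the paper's bits-of-meaning-per-clause measure. As a hedged illustration only (this is not the paper's method, and the function names are hypothetical), a minimal Python sketch of how an information rate in bits per character can be estimated, either from empirical symbol frequencies or from a model's predicted probabilities:

```python
import math
from collections import Counter

def bits_per_char(text: str) -> float:
    """Empirical (unigram) entropy of a text, in bits per character.

    A crude stand-in for Shannon's estimate: real estimates condition on
    context (n-grams, or a language model's next-token distribution),
    which drives the rate down toward ~1 bit/character for English.
    """
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def cross_entropy_bits(text: str, q: dict) -> float:
    """Cross-entropy of a model distribution q against the text,
    in bits per character: -(1/n) * sum_i log2 q(x_i).

    When q equals the empirical character frequencies, this matches
    bits_per_char(text); a better predictive model scores lower, which
    is the sense in which a language model yields an entropy estimate.
    """
    return -sum(math.log2(q[ch]) for ch in text) / len(text)
```

A uniform text like `"abab"` gives 1 bit/character under both measures; the paper's quantity replaces characters with clauses and the model distribution with a large language model's predictions over paraphrase-invariant meanings.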