Examining embedded lies through computational text analysis

Bibliographic Details
Published in	Scientific reports Vol. 15; no. 1; pp. 26482 - 16
Main Authors	Loconte, Riccardo; Kleinberg, Bennett
Format	Journal Article
Language	English
Published	London: Nature Publishing Group UK, 21.07.2025; Nature Publishing Group; Nature Portfolio
Subjects	631/477; 631/477/2811; Automation; Credibility; Deception; Embedded lies; Humanities and Social Sciences; Individual differences; Lying; Lying profile; Machine learning; multidisciplinary; Narratives; Natural Language processing; Science; Science (multidisciplinary); Semantics; Support vector machines

Summary:	Verbal deception detection research relies on narratives and commonly treats statements as either truthful or deceptive. A more realistic perspective acknowledges that the veracity of statements exists on a continuum, with truthful and deceptive parts embedded within the same statement. However, research on embedded lies has lagged behind. We collected a novel dataset of 2,088 truthful and deceptive statements with annotated embedded lies. Using a counterbalanced within-subjects design, participants provided two versions of an autobiographical event: one described truthfully, and the other deceptively by including embedded lies. Participants later highlighted those embedded lies and judged them on lie centrality, deceptiveness, and source. We show that a fine-tuned language model (Llama-3-8B) can classify truthful statements and those containing embedded lies significantly above the chance level (64% accuracy). Individual differences, linguistic properties, and explainability analysis suggest that the challenge of moving the dial towards embedded lies stems from their resemblance to truthful statements. Typical deceptive statements consisted of two-thirds truthful information and one-third embedded lies, largely derived from past personal experiences and with minimal linguistic differences from their truthful counterparts. We present this dataset as a novel resource to address this challenge and foster research on embedded lies in verbal deception detection.
ISSN:	2045-2322
DOI:	10.1038/s41598-025-11327-w