A novel data-driven approach for Personas validation in healthcare using self-supervised machine learning

Bibliographic Details
Published in	Journal of biomedical informatics Vol. 165; p. 104815
Main Authors	Tauro, Emanuele, Gorini, Alessandra, Bilo, Grzegorz, Caiani, Enrico Gianluca
Format	Journal Article
Language	English
Published	United States Elsevier Inc 01.05.2025
Subjects	Algorithms; Cluster Analysis; Clustering; Delivery of Health Care; Female; Humans; Machine Learning; Male; Personalized care; Personas; Reproducibility of Results; Self-supervised machine learning; Supervised Machine Learning; Surveys and Questionnaires; Validation
Summary:	Persona validation is a challenging task that often relies on costly external validation methods. The aim of this study was to develop a novel method for Personas validation based on data already available during their creation. A novel approach based on self-supervised machine learning (SSML) was proposed. A training-test split was performed (80 %-20 %), with the training set used for Personas development. The obtained labels were used as input for a 5-fold cross-validation grid search, resulting in 5 different optimal models. The “weak” ground truth for the test set was determined using the trained clustering model and was compared with the prediction obtained by majority voting of the optimal models. Performance was evaluated by means of weighted accuracy, precision, recall and F1 score. The proposed method was evaluated on two very different healthcare datasets composed of questionnaires. The former included 1070 subjects, resulting in three unbalanced Personas (P0 n = 100; P1 n = 292; P2 n = 464). The latter included 176 subjects, resulting in three slightly unbalanced Personas (P0 n = 58; P1 n = 32; P2 n = 50). The SSML approach proved capable of correctly differentiating the clusters, with high values of weighted accuracy (88.27 % and 94.12 %), precision (87.11 % and 92.83 %), recall (86.92 % and 91.67 %), and F1 score (86.92 % and 91.76 %). The proposed method showed a high capability for generalization beyond the training data, validating the Personas’ capability of stratifying the characteristics of target populations. Additionally, this method significantly reduced the cost of validating Personas compared with other methods in the current literature.
ISSN:	1532-0464, 1532-0480
DOI:	10.1016/j.jbi.2025.104815
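
Illustrative sketch (not part of the catalog record or the article): the Summary describes comparing a "weak" ground truth from the clustering model against the majority vote of five optimal models, scored with weighted precision, recall and F1. The Python below shows what that comparison step might look like on toy labels. Everything in it is an assumption made for illustration: the function names, the toy predictions, the tie-breaking rule, and the reading of "weighted" as averaging per-class scores by class support. The authors' actual implementation is not given in this record.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Return one label per sample: the label most models predicted."""
    voted = []
    for sample_votes in zip(*predictions_per_model):
        # Assumed tie-break: Counter keeps first-seen order among equal
        # counts, so a tie goes to the earliest model's label.
        voted.append(Counter(sample_votes).most_common(1)[0][0])
    return voted


def weighted_scores(y_true, y_pred):
    """Support-weighted precision, recall and F1 (assumed meaning of 'weighted')."""
    n = len(y_true)
    totals = {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        weight = (tp + fn) / n  # share of samples whose weak label is `label`
        totals["precision"] += weight * precision
        totals["recall"] += weight * recall
        totals["f1"] += weight * f1
    return totals


# Hypothetical toy data: weak labels from a clustering model for six test
# subjects, and the predictions of five classifiers for the same subjects.
weak_ground_truth = [0, 0, 1, 1, 2, 2]
model_predictions = [
    [0, 0, 1, 1, 2, 2],
    [0, 1, 1, 1, 2, 2],
    [0, 0, 1, 2, 2, 1],
    [0, 0, 1, 1, 2, 1],
    [0, 0, 0, 1, 2, 1],
]

voted = majority_vote(model_predictions)
scores = weighted_scores(weak_ground_truth, voted)
print(voted)
print(scores)
```

In this toy case the vote disagrees with the weak label on the last subject only, so the scores fall below 1 without any single model being decisive.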