Estimating individual minimum calibration for deep-learning with predictive performance recovery: An example case of gait surface classification from wearable sensor gait data

Bibliographic Details
Published in	Journal of biomechanics Vol. 154; p. 111606
Main Authors	Lam, Guillaume; Rish, Irina; Dixon, Philippe C.
Format	Journal Article
Language	English
Published	United States, Elsevier Ltd, 01.06.2023
Subjects	Adult; Biomechanics; Calibration; Clinical datasets; Clinical trials; Data points; Datasets; Deep Learning; Gait; Humans; Inertial platforms; Inertial sensing devices; Inter/intra-subject; Learning algorithms; Machine learning; Performance prediction; Random/Record-wise split; Subject-wise split; Test sets; Training; Walking; Wearable computers; Wearable Electronic Devices
Summary:	Clinical datasets often comprise multiple data points or trials sampled from a single participant. When these datasets are used to train machine learning models, the method used to extract train and test sets must be carefully chosen. Using the standard machine learning approach (random-wise split), different trials from the same participant may appear in both training and test sets. This has led to schemes that place all data points from the same participant into a single set (subject-wise split). Past investigations have demonstrated that models trained in this manner underperform compared to those trained using random-wise split schemes. Additional training of models on a small subset of trials, known as calibration, bridges the gap in performance across split schemes; however, the number of calibration trials required to achieve strong model performance is unclear. Thus, this study investigates the relationship between calibration training set size and prediction accuracy on the calibration test set. A database of 30 young, healthy adults performing multiple walking trials across nine different surfaces while fitted with inertial measurement unit sensors on the lower limbs was used to develop a deep-learning classifier. For subject-wise trained models, calibration on a single gait cycle per surface yielded a 70% increase in F1-score, the harmonic mean of precision and recall, while 10 gait cycles per surface were sufficient to match the performance of a random-wise trained model. Code to generate calibration curves is available at https://github.com/GuillaumeLam/PaCalC.
• In machine learning, data are split into training and testing sets.
• Random-wise split distributes participant data (trials) across both sets.
• Subject-wise split ensures trials from a given participant are present in only 1 set.
• Calibration improves performance of models trained with a subject-wise split.
• Ten gait trials per surface are needed to match performance of random-wise split.
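The difference between the two split schemes, and the way a calibration set can be carved out for an unseen participant, can be illustrated with a minimal sketch. It is not taken from the PaCalC repository: the record layout `(participant_id, surface, cycle_index)`, the surface names, the dataset sizes, and all function names are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def make_records(n_participants=6, surfaces=("flat", "grass", "stairs"), cycles=12):
    """Build toy records (participant_id, surface, cycle_index); layout is an assumption."""
    return [(p, s, c) for p in range(n_participants) for s in surfaces for c in range(cycles)]


def record_wise_split(records, test_fraction, rng):
    """Shuffle individual records, so one participant can land in both sets."""
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def subject_wise_split(records, test_fraction, rng):
    """Shuffle participants, so each participant's records go to exactly one set."""
    participants = sorted({r[0] for r in records})
    rng.shuffle(participants)
    n_test = max(1, int(len(participants) * test_fraction))
    test_ids = set(participants[:n_test])
    train = [r for r in records if r[0] not in test_ids]
    test = [r for r in records if r[0] in test_ids]
    return train, test


def calibration_split(participant_records, cycles_per_surface, rng):
    """Draw k gait cycles per surface for calibration; the rest form the calibration test set."""
    by_surface = defaultdict(list)
    for r in participant_records:
        by_surface[r[1]].append(r)
    calib, held_out = [], []
    for surface in sorted(by_surface):
        group = by_surface[surface][:]
        rng.shuffle(group)
        calib.extend(group[:cycles_per_surface])
        held_out.extend(group[cycles_per_surface:])
    return calib, held_out


rng = random.Random(0)
records = make_records()

rw_train, rw_test = record_wise_split(records, 0.33, rng)
sw_train, sw_test = subject_wise_split(records, 0.33, rng)

# Participants that appear on both sides of each split
rw_overlap = {r[0] for r in rw_train} & {r[0] for r in rw_test}
sw_overlap = {r[0] for r in sw_train} & {r[0] for r in sw_test}

# Calibrate on one unseen participant with k = 1 gait cycle per surface
unseen = sw_test[0][0]
calib, calib_test = calibration_split([r for r in sw_test if r[0] == unseen], 1, rng)
```

In this toy setup `rw_overlap` is non-empty while `sw_overlap` is empty, which is the leakage the subject-wise scheme is meant to prevent. Repeating `calibration_split` for increasing values of `cycles_per_surface`, fine-tuning a model on `calib`, and scoring it on `calib_test` would trace out a calibration curve of the kind the summary describes; the model training itself is omitted here.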
ISSN:	0021-9290, 1873-2380
DOI:	10.1016/j.jbiomech.2023.111606