Validation of ground truth fire debris classification by supervised machine learning

Bibliographic Details
Published in	Forensic chemistry Vol. 26; p. 100358
Main Authors	Sigman, Michael E., Williams, Mary R., Thurn, Nicholas, Wood, Taylor
Format	Journal Article
Language	English
Published	Elsevier B.V. 01.12.2021
Subjects	Classification; Fire debris; Machine learning; Receiver operating characteristics; Validation

More Information
Summary:	• Ground truth fire debris samples prepared in laboratory.
• Machine learning classification methods trained on bootstrap samples from ground truth fire debris samples.
• Models built on multiple bootstrap samples used to predict an average probability of class membership for validation samples.
• Models built on multiple bootstrap samples tested on large-scale burn samples.
A set of 767 laboratory-generated fire debris samples of known ground truth as to whether an ignitable liquid residue was present (class IL) or absent (class SUB) was used to train five machine learning classifiers. Linear and quadratic discriminant analysis (LDA and QDA), k-nearest neighbors (kNN), and support vector machines with radial and linear kernels (SVMr and SVMl) were tested for their performance in correctly classifying the fire debris samples into class IL or class SUB. Each classifier was trained and tested/validated on 500 class-balanced data sets, each comprising 400 fire debris samples (200 IL and 200 SUB) that were bootstrapped from the 767 laboratory-generated samples. Each bootstrapped data set was split into subsets for training (75%, 300 samples) and testing/validation (25%, 100 samples). The LDA, SVMr and SVMl classifiers were found to give satisfactory performance based on area under the receiver operating characteristic curve (0.86–0.92), equal error rates (17%–22%) and well-calibrated probabilities. The three satisfactory classifiers were further applied to a set of 129 fire debris samples produced in large-scale test burns. The classifications generated by the machine learning models were compared with the sample classes assigned by an informed analyst having knowledge of the chromatographic patterns of the ignitable liquids used to start the large-scale fires. The LDA and SVMl models gave results most closely aligned with the informed analyst.
ISSN:	2468-1709
DOI:	10.1016/j.forc.2021.100358
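
The resampling scheme described in the summary (class-balanced bootstrap sets of 200 IL and 200 SUB samples, each split 75%/25%, with predicted probabilities averaged over models) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the unstratified random split, and the 400/367 class counts in the demonstration are assumptions made for illustration only; the summary does not state how the 767 samples divide between the classes or how the split was drawn.

```python
import random
from collections import defaultdict


def balanced_bootstrap(labels, n_per_class=200, rng=random):
    """Draw sample indices with replacement, n_per_class from each class."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    drawn = []
    for label in sorted(by_class):
        drawn.extend(rng.choices(by_class[label], k=n_per_class))
    return drawn


def split_train_test(indices, train_fraction=0.75, rng=random):
    """Shuffle a bootstrapped index list and cut it into train/test parts.

    Assumption: a plain random split. Because the bootstrap draws with
    replacement, the same original sample can land in both parts.
    """
    shuffled = list(indices)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def average_probabilities(per_model_predictions):
    """Average P(class IL) per sample over the models that scored it.

    per_model_predictions is a list of dicts mapping sample index to a
    predicted probability, one dict per bootstrap model.
    """
    collected = defaultdict(list)
    for predictions in per_model_predictions:
        for sample, probability in predictions.items():
            collected[sample].append(probability)
    return {s: sum(p) / len(p) for s, p in collected.items()}


if __name__ == "__main__":
    # Hypothetical class counts: only the total of 767 comes from the summary.
    labels = ["IL"] * 400 + ["SUB"] * 367
    rng = random.Random(0)
    for _ in range(500):  # 500 bootstrapped data sets, as in the summary
        indices = balanced_bootstrap(labels, 200, rng)
        train, test = split_train_test(indices, 0.75, rng)
        # A classifier (e.g. LDA) would be fit on train and scored on test.
```

Fitting the classifiers themselves and computing the ROC curve or equal error rate are left out, since the summary gives no details beyond the model names.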