Creating an automated trigger for sepsis clinical decision support at emergency department triage using machine learning

Bibliographic Details
Published in	PLoS ONE, Vol. 12, no. 4, p. e0174708
Main Authors	Horng, Steven; Sontag, David A; Halpern, Yoni; Jernite, Yacine; Shapiro, Nathan I; Nathanson, Larry A
Format	Journal Article
Language	English
Published	United States: Public Library of Science (PLoS), 06.04.2017
Subjects	Adult; Aged; Artificial intelligence; Automation; Clinical decision making; Computer and Information Sciences; Computer engineering; Computer science; Data processing; Datasets; Decision making; Decision support systems; Decision Support Systems, Clinical; Demographics; Emergency medical care; Emergency medical services; Emergency Service, Hospital - organization & administration; Female; Health aspects; Hospital emergency services; Humans; Infections; Informatics; Information processing; International Classification of Diseases; Learning algorithms; Machine Learning; Male; Management; Medical records; Medicine; Medicine and Health Sciences; Middle Aged; Nursing; Observational studies; Patients; Physical Sciences; Physicians; Practice; Prediction models; Research and Analysis Methods; Retrospective Studies; Risk factors; Sepsis; Structured data; Support vector machines; Teaching; Teaching machines; Teaching methods; Triage - methods; Urinary tract infections; Vital signs; New York; United States > US Massachusetts

More Information
Summary:	The objective was to demonstrate the incremental benefit of using free text data in addition to vital sign and demographic data to identify patients with suspected infection in the emergency department (ED). This was a retrospective, observational cohort study performed at a tertiary academic teaching hospital. All consecutive ED patient visits between 12/17/08 and 2/17/13 were included. No patients were excluded. The primary outcome measure was infection diagnosed in the ED, defined as a patient having an infection-related ED ICD-9-CM discharge diagnosis. Patients were randomly allocated to train (64%), validate (20%), and test (16%) data sets. After preprocessing the free text using bigram and negation detection, we built four models to predict infection, incrementally adding vital signs, chief complaint, and free text nursing assessment. We used two different methods to represent free text: a bag-of-words model and a topic model. We then used a support vector machine to build the prediction model. We calculated the area under the receiver operating characteristic (ROC) curve to compare the discriminatory power of each model. A total of 230,936 patient visits were included in the study. Approximately 14% of patients had the primary outcome of diagnosed infection. The area under the ROC curve (AUC) for the vitals model, which used only vital signs and demographic data, was 0.67 for the training data set, 0.67 for the validation data set, and 0.67 (95% CI 0.65-0.69) for the test data set. The AUC for the chief complaint model, which also included demographic and vital sign data, was 0.84 for the training data set, 0.83 for the validation data set, and 0.83 (95% CI 0.81-0.84) for the test data set. The best-performing methods made use of all of the free text. In particular, the AUC for the bag-of-words model was 0.89 for the training data set, 0.86 for the validation data set, and 0.86 (95% CI 0.85-0.87) for the test data set. The AUC for the topic model was 0.86 for the training data set, 0.86 for the validation data set, and 0.85 (95% CI 0.84-0.86) for the test data set. Compared to previous work that used only structured data such as vital signs and demographic information, utilizing free text drastically improves the discriminatory ability of identifying infection (increase in AUC from 0.67 to 0.86).
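The summary names three building blocks: negation detection, bigram bag-of-words features, and AUC as the comparison metric. The following is a minimal Python sketch of those ideas, not the authors' implementation. The negation cue list, the `neg_` prefix, the rule that punctuation ends a negation scope, and the choice to count every adjacent token pair as a bigram are all assumptions made for illustration; the summary does not specify any of them.

```python
import re
from collections import Counter

# Hypothetical negation cues; the study's actual negation rules are not
# described in the summary.
NEGATION_CUES = {"no", "not", "denies", "without"}
PUNCTUATION = {".", ",", ";", ":"}


def tokenize_with_negation(text):
    """Lowercase the text and prefix tokens inside a negated scope with 'neg_'."""
    tokens = []
    negated = False
    for piece in re.findall(r"[a-z0-9]+|[.,;:]", text.lower()):
        if piece in PUNCTUATION:
            negated = False  # assumption: punctuation closes the negation scope
        elif piece in NEGATION_CUES:
            negated = True   # tokens that follow are marked as negated
        else:
            tokens.append("neg_" + piece if negated else piece)
    return tokens


def bag_of_words(text):
    """Count unigrams plus all adjacent-token bigrams (a simplification)."""
    tokens = tokenize_with_negation(text)
    bigrams = [a + "_" + b for a, b in zip(tokens, tokens[1:])]
    return Counter(tokens + bigrams)


def auc(labels, scores):
    """AUC as the probability that a random positive case outscores a random
    negative case, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    print(bag_of_words("Denies fever, reports cough."))
    print(auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))
```

In this sketch, "Denies fever" yields the feature `neg_fever` rather than `fever`, so a classifier can tell an asserted symptom from a denied one. The pairwise `auc` function is quadratic in the number of visits and is meant only to show what the metric measures; a rank-based implementation would be needed at the scale of the study.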
Bibliography:	Competing Interests: The authors have declared that no competing interests exist. Conceptualization: SH DS NS LN. Data curation: SH DS YH YJ. Formal analysis: SH DS YH YJ. Funding acquisition: SH DS NS LN. Investigation: SH DS YH YJ. Methodology: SH DS YH YJ. Software: SH DS YH YJ. Validation: SH DS YH YJ. Writing – original draft: SH DS YH YJ NS LN. Writing – review & editing: SH DS.
ISSN:	1932-6203
DOI:	10.1371/journal.pone.0174708