Using applied machine learning to predict healthcare utilization based on socioeconomic determinants of care

Bibliographic Details
Published in The American journal of managed care Vol. 26; no. 1; pp. 26 - 31
Main Authors Chen, Soy; Bergman, Danielle; Miller, Kelly; Kavanagh, Allison; Frownfelter, John; Showalter, John
Format Journal Article
Language English
Published United States 01.01.2020
ISSN 1088-0224
EISSN 1936-2692
DOI 10.37765/ajmc.2020.42142

Abstract
Objectives: To determine whether it is possible to risk-stratify avoidable utilization without clinical data and with limited patient-level data.
Study Design: The aim of this study was to demonstrate the influence of socioeconomic determinants of health (SDH) on avoidable patient-level healthcare utilization. The study investigated the ability of machine learning models to predict risk using only publicly available and purchasable SDH data. A total of 138,115 patients were analyzed from a deidentified database representing 3 health systems in the United States.
Methods: A hold-out methodology was used to ensure that the model's performance could be tested on a completely independent set of subjects. A proprietary decision tree methodology was used to make the predictions. Only the socioeconomic features, age group, gender, and race were used in the prediction of a patient's risk of admission.
Results: The decision tree-based machine learning approach analyzed in this study was able to predict inpatient and emergency department utilization with a high degree of discrimination using only purchasable and publicly available data on SDH.
Conclusions: This study indicates that it is possible to risk-stratify patients by risk of utilization without interacting with the patient or collecting information beyond the patient's age, gender, race, and address. The implications of this application are wide and have the potential to positively affect health systems by facilitating targeted patient outreach with specific, individualized interventions to tackle detrimental SDH at not only the individual level but also the neighborhood level.
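The hold-out evaluation described in the abstract can be illustrated with a minimal sketch. The study's decision tree methodology is proprietary, and this record does not state its features in detail or the metric behind "discrimination", so everything below is an assumption for illustration only: the synthetic patients, the `deprivation` score standing in for an SDH feature, the one-split decision stump standing in for the model, and the area under the ROC curve (AUC) as the discrimination measure.

```python
import random


def make_patients(n, seed=0):
    """Synthetic (hypothetical) patients: one SDH-like score and a utilization label."""
    rng = random.Random(seed)
    patients = []
    for _ in range(n):
        deprivation = rng.random()          # assumed neighborhood-level score in [0, 1)
        p_admit = 0.05 + 0.5 * deprivation  # assumed: risk rises with deprivation
        admitted = 1 if rng.random() < p_admit else 0
        patients.append((deprivation, admitted))
    return patients


def holdout_split(rows, test_fraction=0.3, seed=0):
    """Shuffle once, then reserve a fraction of subjects that training never sees."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    n_test = round(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]


def gini(labels):
    """Gini impurity of a list of binary labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def fit_stump(train, candidates=(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)):
    """Depth-1 decision tree: pick the threshold with the lowest weighted Gini impurity."""
    best = None
    for t in candidates:
        left = [y for x, y in train if x <= t]
        right = [y for x, y in train if x > t]
        if not left or not right:
            continue
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(train)
        if best is None or score < best[0]:
            best = (score, t, sum(left) / len(left), sum(right) / len(right))
    _, threshold, left_rate, right_rate = best
    return threshold, left_rate, right_rate


def predict_risk(model, x):
    """Predicted risk is the observed event rate in the leaf the patient falls into."""
    threshold, left_rate, right_rate = model
    return left_rate if x <= threshold else right_rate


def auc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


patients = make_patients(2000)
train, test = holdout_split(patients)
model = fit_stump(train)  # fitted on training subjects only
test_scores = [predict_risk(model, x) for x, _ in test]
test_auc = auc(test_scores, [y for _, y in test])
print(f"hold-out AUC: {test_auc:.3f}")
```

The point of the design is that the threshold and leaf rates are learned from the training subjects alone, so the AUC reported on the held-out subjects is not inflated by the model having already seen them.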
Author Chen, Soy
Bergman, Danielle (Jvion, 11555 Medlock Bridge Rd, Ste 250, Johns Creek, GA 30114. Email: Danielle.bergman@jvion.com)
Miller, Kelly
Kavanagh, Allison
Frownfelter, John
Showalter, John
BackLink https://www.ncbi.nlm.nih.gov/pubmed/31951356 (View this record in MEDLINE/PubMed)
ContentType Journal Article
Discipline Public Health
EISSN 1936-2692
EndPage 31
Genre Research Support, Non-U.S. Gov't
Journal Article
GeographicLocations Alabama
Ohio
Georgia
IsPeerReviewed true
IsScholarly true
Issue 1
Language English
PMID 31951356
PageCount 6
PublicationDate 2020-01-01
PublicationPlace United States
PublicationTitle The American journal of managed care
PublicationTitleAlternate Am J Manag Care
PublicationYear 2020
StartPage 26
SubjectTerms Adolescent
Adult
Aged
Alabama - epidemiology
Child
Child, Preschool
Decision Trees
Emergency Service, Hospital - statistics & numerical data
Female
Georgia - epidemiology
Hospitalization - statistics & numerical data
Humans
Infant
Machine Learning
Male
Middle Aged
Ohio - epidemiology
Patient Acceptance of Health Care - statistics & numerical data
Risk
Social Determinants of Health
Socioeconomic Factors
Young Adult
Title Using applied machine learning to predict healthcare utilization based on socioeconomic determinants of care
URI https://www.ncbi.nlm.nih.gov/pubmed/31951356
https://www.proquest.com/docview/2341628892
Volume 26