CohortDiagnostics: Phenotype evaluation across a network of observational data sources using population-level characterization

Bibliographic Details
Published in PLoS ONE Vol. 20; no. 1; p. e0310634
Main Authors Rao, Gowtham A., Shoaibi, Azza, Makadia, Rupa, Hardin, Jill, Swerdel, Joel, Weaver, James, Voss, Erica A., Conover, Mitchell M., Fortin, Stephen, Sena, Anthony G., Knoll, Chris, Hughes, Nigel, Gilbert, James P., Blacketer, Clair, Andryc, Alan, DeFalco, Frank, Molinaro, Anthony, Reps, Jenna, Schuemie, Martijn J., Ryan, Patrick B.
Format Journal Article
Language English
Published United States Public Library of Science 16.01.2025
Public Library of Science (PLoS)
Summary: This paper introduces a novel framework for evaluating phenotype algorithms (PAs) using the open-source tool, Cohort Diagnostics. The method is based on several diagnostic criteria to evaluate a patient cohort returned by a PA. Diagnostics include estimates of incidence rate, index date entry code breakdown, and prevalence of all observed clinical events prior to, on, and after index date. We test our framework by evaluating one PA for systemic lupus erythematosus (SLE) and two PAs for Alzheimer's disease (AD) across 10 different observational data sources. By utilizing CohortDiagnostics, we found that the population-level characteristics of individuals in the SLE cohort closely matched the disease's anticipated clinical profile. Specifically, the incidence rate of SLE was consistently higher among females. Moreover, expected clinical events such as laboratory tests, treatments, and repeated diagnoses were also observed. For AD, although one PA identified considerably fewer patients, the absence of notable differences in clinical characteristics between the two cohorts suggested similar specificity. We provide a practical and data-driven approach to evaluate PAs, using two clinical diseases as examples, across a network of OMOP data sources. Cohort Diagnostics can help ensure that the subjects identified by a specific PA align with those intended for inclusion in a research study. Diagnostics based on large-scale population-level characterization can offer insights into the misclassification errors of PAs.
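Illustration: the summary lists incidence rate, stratified by characteristics such as sex, among the diagnostics. The sketch below shows one generic way such a stratified rate could be computed (cases divided by person-time at risk). It is not the CohortDiagnostics API; the function name `incidence_rates`, the record layout, and the toy numbers are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def incidence_rates(records, per=1000.0):
    """Return incidence per `per` person-years for each stratum.

    `records` is an iterable of (stratum, person_years_at_risk, became_case)
    tuples. This layout is an assumption made for illustration only.
    """
    cases = defaultdict(int)
    person_years = defaultdict(float)
    for stratum, years, became_case in records:
        person_years[stratum] += years
        if became_case:
            cases[stratum] += 1
    # Skip strata with no time at risk to avoid dividing by zero.
    return {
        stratum: per * cases[stratum] / total
        for stratum, total in person_years.items()
        if total > 0
    }


# Toy, invented data: (sex, years at risk, entered the cohort?)
toy_records = [
    ("female", 2.0, True),
    ("female", 3.0, False),
    ("female", 5.0, True),
    ("male", 4.0, False),
    ("male", 6.0, True),
]

rates = incidence_rates(toy_records)
for sex, rate in sorted(rates.items()):
    print(f"{sex}: {rate:.1f} per 1,000 person-years")
```

Comparing such stratified rates across data sources against the expected epidemiology (for example, a higher SLE rate among females) is the kind of check the summary describes.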
Competing Interests: All authors were employees of Janssen Research & Development, LLC, and shareholders of Johnson & Johnson (J&J) stock at the time of manuscript conceptualization, drafting, editing, and initial approval for submission. All authors except AM continue to be employees of J&J. Author AM changed affiliation after the initial submission and is currently affiliated with VNS Health. This competing interest does not alter our adherence to PLOS ONE policies on sharing data and materials.
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0310634