Sleep apnea phenotyping and relationship to disease in a large clinical biobank

Sleep apnea is associated with a broad range of pathophysiology. While electronic health record (EHR) information has the potential for revealing relationships between sleep apnea and associated risk factors and outcomes, practical challenges hinder its use. Our objectives were to develop a sleep ap...

Full description

Saved in:
Bibliographic Details
Published inJAMIA open Vol. 5; no. 1; p. ooab117
Main Authors Cade, Brian E, Hassan, Syed Moin, Dashti, Hassan S, Kiernan, Melissa, Pavlova, Milena K, Redline, Susan, Karlson, Elizabeth W
Format Journal Article
LanguageEnglish
Published United States Oxford University Press 01.04.2022
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Sleep apnea is associated with a broad range of pathophysiology. While electronic health record (EHR) information has the potential for revealing relationships between sleep apnea and associated risk factors and outcomes, practical challenges hinder its use. Our objectives were to develop a sleep apnea phenotyping algorithm that improves the precision of EHR case/control information using natural language processing (NLP); identify novel associations between sleep apnea and comorbidities in a large clinical biobank; and investigate the relationship between polysomnography statistics and comorbid disease using NLP phenotyping. We performed clinical chart reviews on 300 participants putatively diagnosed with sleep apnea and applied International Classification of Sleep Disorders criteria to classify true cases and noncases. We evaluated 2 NLP and diagnosis code-only methods for their abilities to maximize phenotyping precision. The lead algorithm was used to identify incident and cross-sectional associations between sleep apnea and common comorbidities using 4876 NLP-defined sleep apnea cases and 3× matched controls. The optimal NLP phenotyping strategy had improved model precision (≥0.943) compared to the use of one diagnosis code (≤0.733). Of the tested diseases, 170 disorders had significant incidence odds ratios (ORs) between cases and controls, 8 of which were confirmed using polysomnography ( = 4544), and 281 disorders had significant prevalence OR between sleep apnea cases versus controls, 41 of which were confirmed using polysomnography data. An NLP-informed algorithm can improve the accuracy of case-control sleep apnea ascertainment and thus improve the performance of phenome-wide, genetic, and other EHR analyses of a highly prevalent disorder.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
ISSN:2574-2531
2574-2531
DOI:10.1093/jamiaopen/ooab117