Can physicians identify inappropriate nuclear stress tests? An examination of inter-rater reliability for the 2009 appropriate use criteria for radionuclide imaging

Bibliographic Details
Published in Circulation: Cardiovascular Quality and Outcomes Vol. 8; no. 1; pp. 23-29
Main Authors Ye, Siqin, Rabbani, LeRoy E, Kelly, Christopher R, Kelly, Maureen R, Lewis, Matthew, Paz, Yehuda, Peck, Clara L, Rao, Shaline, Bokhari, Sabahat, Weiner, Shepard D, Einstein, Andrew J
Format Journal Article
Language English
Published United States 01.01.2015

Summary: We sought to determine inter-rater reliability of the 2009 Appropriate Use Criteria for radionuclide imaging and whether physicians at various levels of training can effectively identify nuclear stress tests with inappropriate indications. Four hundred patients were randomly selected from a consecutive cohort of patients undergoing nuclear stress testing at an academic medical center. Raters with different levels of training (including cardiology attending physicians, cardiology fellows, internal medicine hospitalists, and internal medicine interns) classified individual nuclear stress tests using the 2009 Appropriate Use Criteria. Consensus classification by 2 cardiologists was considered the operational gold standard, and sensitivity and specificity of individual raters for identifying inappropriate tests were calculated. Inter-rater reliability of the Appropriate Use Criteria was assessed using Cohen κ statistics for pairs of different raters. The mean age of patients was 61.5 years; 214 (54%) were female. The cardiologists rated 256 (64%) of 400 nuclear stress tests as appropriate, 68 (18%) as uncertain, and 55 (14%) as inappropriate; 21 (5%) tests could not be classified. Inter-rater reliability for noncardiologist raters was modest (unweighted Cohen κ, 0.51; 95% confidence interval, 0.45-0.55). Sensitivity of individual raters for identifying inappropriate tests ranged from 47% to 82%, while specificity ranged from 85% to 97%. Inter-rater reliability for the 2009 Appropriate Use Criteria for radionuclide imaging is modest, and there is considerable variation in the ability of raters at different levels of training to identify inappropriate tests.
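The two statistics named in the summary can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names `cohen_kappa` and `sensitivity_specificity`, the category labels, and the eight toy ratings are all hypothetical and chosen only to show how unweighted Cohen κ for one pair of raters, and sensitivity and specificity for flagging inappropriate tests against a gold standard, might be computed.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen kappa for two raters over the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same category.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each rater's marginal proportions.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


def sensitivity_specificity(rater, gold, positive="inappropriate"):
    """Sensitivity and specificity of a rater for the `positive` category."""
    tp = fn = tn = fp = 0
    for r, g in zip(rater, gold):
        if g == positive:
            if r == positive:
                tp += 1
            else:
                fn += 1
        elif r == positive:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity


# Hypothetical toy data (not from the study): a gold-standard consensus
# rating and one rater's classification of the same eight tests.
gold = ["appropriate", "appropriate", "uncertain", "inappropriate",
        "inappropriate", "appropriate", "uncertain", "inappropriate"]
rater = ["appropriate", "uncertain", "uncertain", "inappropriate",
         "appropriate", "appropriate", "appropriate", "inappropriate"]

kappa = cohen_kappa(gold, rater)
sens, spec = sensitivity_specificity(rater, gold)
print(f"kappa={kappa:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

In this sketch, tests rated uncertain or appropriate by the gold standard both count as negatives when computing specificity; whether the study treated unclassifiable tests the same way is not stated in the summary.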
ISSN: 1941-7713, 1941-7705
DOI: 10.1161/CIRCOUTCOMES.114.001067