Interrater Reliability of Various Thyroid Imaging Reporting and Data System (TIRADS) Classifications for Differentiating Benign from Malignant Thyroid Nodules


Bibliographic Details
Published in Asian Pacific journal of cancer prevention : APJCP Vol. 20; no. 4; pp. 1283-1288
Main Authors Phuttharak, Warinthorn, Boonrod, Arunnit, Klungboonkrong, Vivian, Witsawapaisan, Thanatchaporn
Format Journal Article
Language English
Published Thailand West Asia Organization for Cancer Prevention 2019
Summary:
Background: Thyroid ultrasound (US) is the first-line diagnostic tool for guiding disease management, but it is operator dependent. Few reports have evaluated interrater variability in US assessment. We therefore evaluated interrater reliability in the US assessment of thyroid nodules and estimated its diagnostic accuracy under various TIRADS systems.
Methods: This retrospective study included 24 malignant nodules and 84 benign nodules from January 2015 to October 2017. Two blinded observers independently reviewed stored US images using TIRADS. All analyses followed the guidelines proposed by the ACR-TIRADS, Siriraj-TIRADS and EU-TIRADS systems. Interrater reliability was calculated using Cohen's kappa statistic. Diagnostic accuracy was also calculated.
Results: Interobserver agreement was substantial for composition (K=0.616), fair for echogenicity and echogenic foci (K=0.327 and 0.288, respectively), and slight for margin (K=0.143). Interrater reliability for the final assessment was moderate for the ACR-TIRADS system (K=0.500), fair for the EU-TIRADS system (K=0.209), and slight for the Siriraj-TIRADS system (K=0.114). Diagnostic performance for the two observers was as follows. For the ACR-TIRADS system, sensitivities were 75% and 79.2%, specificities were 58.3% and 56%, positive predictive values (PPV) were 34% and 33.9%, and negative predictive values (NPV) were 89.1% and 90.4%. For the Siriraj-TIRADS system, sensitivities were 41.7% and 25%, specificities were 84.5% and 89.3%, PPVs were 43.5% and 40%, and NPVs were 83.5% and 80.6%. For the EU-TIRADS system, sensitivities were 45.8% and 66.7%, specificities were 79.8% and 72.6%, PPVs were 39.3% and 41%, and NPVs were 83.8% and 88.4%.
Conclusion: ACR-TIRADS had the highest interobserver agreement and showed a trend toward the highest sensitivity and negative predictive value for the diagnosis of malignant thyroid nodules. Siriraj-TIRADS had higher specificity and accuracy but lower interobserver agreement.
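
The two statistics named in the summary, Cohen's kappa and the sensitivity/specificity/PPV/NPV set, can be illustrated with a minimal Python sketch. This is not the authors' code: the kappa shown is the plain unweighted form, the function names and the feature ratings are hypothetical, and the 2x2 counts are back-calculated from the reported percentages and the stated totals of 24 malignant and 84 benign nodules, since the summary does not give them directly.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one label only")
    return (observed - expected) / (1 - expected)


def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, PPV and NPV from 2x2 counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical composition ratings for six nodules (illustration only).
observer_1 = ["solid", "solid", "cystic", "mixed", "solid", "cystic"]
observer_2 = ["solid", "mixed", "cystic", "mixed", "solid", "solid"]
kappa = cohen_kappa(observer_1, observer_2)

# Assumed counts for ACR-TIRADS, first observer, back-calculated from the
# reported 75% sensitivity (18 of 24) and 58.3% specificity (49 of 84).
acr_observer_1 = diagnostic_metrics(tp=18, fn=6, tn=49, fp=35)

if __name__ == "__main__":
    print(f"kappa = {kappa:.3f}")
    for name, value in acr_observer_1.items():
        print(f"{name} = {value * 100:.1f}%")
```

With these assumed counts the PPV and NPV also reproduce the reported 34% and 89.1%, which suggests the back-calculation is consistent with the summary. The low PPV alongside a high NPV reflects the sample's case mix (24 malignant versus 84 benign nodules) as much as the classification system itself.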
ISSN:1513-7368
2476-762X
DOI:10.31557/APJCP.2019.20.4.1283