Clinical benchmark dataset for AI accuracy analysis: quantifying radiographic annotation of pelvic tilt


Bibliographic Details
Published in Scientific Data Vol. 11; no. 1; pp. 1162 - 8
Main Authors Chai, Yuan; Boudali, A. Mounir; Maes, Vincent; Walter, William L.
Format Journal Article
Language English
Published London Nature Publishing Group UK 22.10.2024
Nature Publishing Group
Nature Portfolio
Summary: Radiographic landmark annotation determines patients’ anatomical parameters and influences diagnoses. However, challenges arise from ambiguous region-based definitions, human error, and image quality variations, potentially compromising patient care. Additionally, AI landmark localization often presents its predictions in a probability-based heatmap format, which lacks a corresponding clinical standard for accuracy validation. This Data Descriptor presents a clinical benchmark dataset for pelvic tilt landmarks, gathered through a probabilistic approach to measure annotation accuracy within clinical environments. A retrospective analysis of 115 pelvic sagittal radiographs was conducted for annotating pelvic tilt parameters by five annotators, revealing landmark cloud sizes of 6.04 mm–17.90 mm at a 95% dataset threshold, corresponding to 9.51°–16.55° maximum angular disagreement in clinical settings. The outcome provides a quantified point cloud dataset for each landmark corresponding to different probabilities, which enables assessment of directional annotation distribution and parameter-wise impact, providing clinical benchmarks. The data is readily reusable for AI studies analyzing the same landmarks, and the method can be easily replicated for establishing clinical accuracy benchmarks of other landmarks.
ISSN: 2052-4463
DOI: 10.1038/s41597-024-04003-7
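
Illustrative sketch (not taken from the article): the summary reports landmark "cloud sizes" at a 95% dataset threshold and a maximum angular disagreement, but it does not state how either quantity is computed. The Python below shows one plausible reading, under stated assumptions: cloud size is taken as the diameter of the smallest circle, centred on the cloud's centroid, that contains the requested share of annotation points; angular disagreement is taken as the spread in direction of the lines joining any point of one landmark cloud to any point of another. The function names, the definitions, and the sample coordinates are all hypothetical.

```python
import math
from itertools import product


def cloud_diameter(points, coverage=0.95):
    """Diameter of the smallest centroid-centred circle holding `coverage` of the points.

    ASSUMPTION: this definition of "cloud size" is illustrative only; the
    article's own definition is not given in the summary.
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    # Distance of every annotation from the centroid, smallest first.
    radii = sorted(math.hypot(x - cx, y - cy) for x, y in points)
    k = max(1, math.ceil(coverage * n))  # number of points that must fall inside
    return 2.0 * radii[k - 1]


def max_angular_disagreement(cloud_a, cloud_b):
    """Spread, in degrees, of the directions of lines from cloud_a points to cloud_b points.

    ASSUMPTION: the directions do not straddle the +/-180 degree branch cut
    of atan2, which holds when the two clouds are well separated.
    """
    angles = [
        math.degrees(math.atan2(by - ay, bx - ax))
        for (ax, ay), (bx, by) in product(cloud_a, cloud_b)
    ]
    return max(angles) - min(angles)


if __name__ == "__main__":
    # Made-up coordinates in millimetres, one point per annotator.
    landmark_a = [(0.0, 0.0), (1.5, -0.5), (-1.0, 1.0), (0.5, 2.0), (-0.5, -1.5)]
    landmark_b = [(80.0, 40.0), (82.0, 41.0), (79.0, 38.5), (81.0, 42.0), (80.5, 39.0)]
    print(cloud_diameter(landmark_a))
    print(max_angular_disagreement(landmark_a, landmark_b))
```

With only five annotators per image, a 95% threshold on a single image would include every point, so a threshold of this kind is more meaningful when annotations are pooled across many images after alignment; how the article pools them is not stated in the summary.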