Abstract P1-03-01: An international multicenter study to evaluate reproducibility of automated scoring methods for assessment of Ki67 in breast cancer
Published in | Cancer research (Chicago, Ill.) Vol. 77; no. 4_Supplement; pp. P1 - P1-03-01 |
---|---|
Format | Journal Article |
Language | English |
Published | 15.02.2017 |
Summary: | Abstract
Background: The nuclear proliferation biomarker Ki67 has multiple potential roles in breast cancer, including prognosis-based decisions, but unacceptable between-laboratory variability has limited its clinical value. The International Ki67 Working Group (IKWG) has undertaken a systematic program to determine whether Ki67 immunohistochemistry can be analytically validated and standardized across laboratories. Technological advances and broader availability of devices for automated assessment of stained slides raise the possibility that these machines may improve on reproducibility of traditional pathologist-based visual Ki67 assessment.
Aims: To characterize reproducibility of automated machine-measured Ki67 expression using slides previously analyzed in the IKWG phase 3 study that evaluated reproducibility of visual Ki67 assessment.
Methods: Two sets of 30 previously stained slides containing core-cut biopsy sections of breast tumors were circulated to 14 laboratories for scanning and automated assessment of Ki67 expression. Sites were instructed to return the average and maximum percentage of tumor cells positive for Ki67 for each slide, where the maximum is designed to reflect “hot spot” analysis. Two laboratories returned scores from 2 operators; not all laboratories reported values for maximum Ki67 scores. Different operators were treated as distinct laboratories in analyses. Sixteen and 10 score sets were available for average and maximum Ki67 analyses, respectively, encompassing 7 unique scanner platforms and 10 software platforms. Pre-specified analyses included evaluation of reproducibility across all laboratories as well as within a subgroup limited to those using Aperio scanners. The primary reproducibility metric was the between-laboratory intraclass correlation coefficient (ICC), regardless of device platform or software.
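To illustrate the kind of metric described in the Methods, the following is a minimal sketch of an intraclass correlation coefficient computed from a slides-by-laboratories score matrix. The abstract does not state which ICC form or data transformation was used, so the two-way random-effects, absolute-agreement, single-rater form (often written ICC(2,1)) is an assumption made here for illustration, and the function name `icc_2_1` and the example scores are hypothetical.

```python
def icc_2_1(scores):
    """ICC(2,1) for a complete matrix: rows = slides, columns = laboratories.

    Assumption: two-way random-effects, absolute-agreement, single-rater ICC.
    The abstract does not specify which ICC variant the study used.
    """
    n = len(scores)        # number of slides
    k = len(scores[0])     # number of laboratories (score sets)
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete matrix with >= 2 rows and >= 2 columns")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(col) / n for col in zip(*scores)]

    # Sums of squares from the two-way ANOVA decomposition.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denominator = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    return (ms_rows - ms_error) / denominator


# Hypothetical Ki67 percentages: 4 slides scored by 3 laboratories.
example_scores = [
    [12.0, 14.0, 11.0],
    [35.0, 38.0, 33.0],
    [8.0, 9.0, 7.5],
    [22.0, 25.0, 20.0],
]
print(round(icc_2_1(example_scores), 3))
```

A value near 1 indicates that most of the variation in scores comes from differences between slides rather than between laboratories. Confidence intervals such as those reported in the Results would need an additional method that this sketch does not cover.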
Results: Geometric means across 30 cases for 16 operators ranged from 11.06% to 38.11% with overall mean 16.75% (95% CI: 14.45-19.42) for average scores. Geometric means for 10 operators ranged from 16.44% to 68.73% with overall mean 25.16% (95% CI: 18.71-33.84) for maximum scores. ICC for automated average scores across 16 operators was 0.83 (95% CI: 0.73-0.91) and ICC for maximum scores across 10 operators was 0.63 (95% CI: 0.44-0.80), although one outlier lab dramatically affected results. For the laboratories using the Aperio platform (8 score sets), ICC for automated average scores was 0.89 (95% CI: 0.81-0.96). These results are similar to the ICC of 0.87 (95% CI: 0.81-0.93) reported using these same slides in the Phase 3 visual assessment reproducibility study in which observers counted 500 cells per slide (Leung et al, NPJBrCancer, in press).
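The per-operator geometric means reported in the Results can be sketched as follows. This is an illustrative example only: the helper names `geometric_mean` and `per_operator_geometric_means` and the sample scores are hypothetical, and the abstract does not say how zero scores or the confidence intervals were handled, so the sketch simply assumes all scores are strictly positive.

```python
import math


def geometric_mean(values):
    """Geometric mean via the mean of natural logs.

    Assumption: every value is strictly positive. How the study treated
    zero scores is not stated in the abstract.
    """
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean requires non-empty, strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def per_operator_geometric_means(scores):
    """Rows = slides, columns = operators; returns one geometric mean per operator."""
    return [geometric_mean(list(column)) for column in zip(*scores)]


# Hypothetical Ki67 percentages: 3 slides scored by 2 operators.
example_scores = [
    [10.0, 12.0],
    [40.0, 45.0],
    [20.0, 26.0],
]
means = per_operator_geometric_means(example_scores)
print([round(m, 2) for m in means])
print("range:", round(min(means), 2), "to", round(max(means), 2))
```

The geometric mean is less sensitive than the arithmetic mean to a few very high percentages, which is one common reason to summarize skewed Ki67 scores this way.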
Conclusions: Between-laboratory reproducibility for automated machine assessment of average Ki67 is similar to that for pathologist-based visual assessment of Ki67. However, the observed ICC was numerically markedly lower for the maximum score method than for the average method, suggesting that the maximum score may not be useful as a reproducible measure of proliferation. Automated average scoring methods show promise for standardization of Ki67 scoring, supporting future studies to clinically validate Ki67.
Citation Format: Rimm DL, McShane LM, Leung SCY, Bai Y, Bane AL, Bartlett JMS, Bayani J, Chang MC, Dean M, Denkert C, Enwere E, Galderisi C, Gholap A, Hugh JC, Jadhav A, Kornaga E, Laurinavicius A, Levenson R, Lima J, Miller K, Pantanowitz L, Piper T, Ruan J, Srinivasan M, Virk S, Wu Y, Yang H, Hayes DF, Nielsen TO, Dowsett M. An international multicenter study to evaluate reproducibility of automated scoring methods for assessment of Ki67 in breast cancer [abstract]. In: Proceedings of the 2016 San Antonio Breast Cancer Symposium; 2016 Dec 6-10; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2017;77(4 Suppl):Abstract nr P1-03-01. |
---|---|
ISSN: | 0008-5472 1538-7445 |
DOI: | 10.1158/1538-7445.SABCS16-P1-03-01 |