Declarations of Independence: How Embedded Multicollinearity Errors Affect Dosimetric and Other Complex Analyses in Radiation Oncology

Bibliographic Details
Published in: International journal of radiation oncology, biology, physics, Vol. 117, no. 5, pp. 1054-1062
Main Authors: Ellsworth, Susannah G.; van Rossum, Peter S.N.; Mohan, Radhe; Lin, Steven H.; Grassberger, Clemens; Hobbs, Brian
Format: Journal Article
Language: English
Published: United States, Elsevier Inc, 01.12.2023
ISSN: 0360-3016, 1879-355X
DOI: 10.1016/j.ijrobp.2023.06.015

Summary: The statistical technique of multiple regression, commonly referred to as “multivariable regression,” is often used in clinical research to quantify the relationships between multiple predictor variables and a single outcome variable of interest. The foundational theory underpinning multivariable regression assumes that all predictor variables are independent of one another. In other words, the effect of each independent variable is measured by its contribution to the regression equation while all other variables remain unchanged. In the presence of correlations between two or more variables, however, it is impossible to change one variable without a consequent change in the variable(s) it is linked to. This condition, known as “multicollinearity,” can introduce errors into multivariable regression models by affecting estimates of the regression coefficients that quantify the relationship between individual predictor variables and the outcome variable. Errors that arise due to violations of the multicollinearity assumption are of special interest to radiation oncology researchers. Because of high levels of correlation among variables derived from points along individual organ dose-volume histogram (DVH) curves, as well as strong intercorrelations among dose-volume parameters in neighboring organs, dosimetric analyses are particularly subject to multicollinearity errors. For example, dose-volume parameters for the heart are strongly correlated not only with other points along the heart DVH curve but are likely also correlated with dose-volume parameters in neighboring organs such as the lung. In this paper, we describe the problem of multicollinearity in accessible terms and discuss examples of violations of the multicollinearity assumption within the radiation oncology literature. Finally, we provide recommendations regarding best practices for identifying and managing multicollinearity in complex data sets.
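
Illustrative note: the summary does not say which diagnostics the paper recommends. One widely used check for multicollinearity is the variance inflation factor (VIF), computed for each predictor as 1 / (1 - R^2), where R^2 comes from regressing that predictor on all the others. The sketch below is a minimal example under stated assumptions: the data are synthetic, the variable names (heart_v20, heart_v30, age) are hypothetical stand-ins for two correlated points on a heart DVH curve and one unrelated covariate, and nothing here is taken from the paper itself.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return the VIF of each column of X (rows = patients, columns = predictors)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        y = X[:, j]                                # predictor being checked
        others = np.delete(X, j, axis=1)           # all remaining predictors
        A = np.column_stack([np.ones(n), others])  # add an intercept column
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = resid @ resid
        ss_tot = ((y - y.mean()) ** 2).sum()
        r2 = 1.0 - ss_res / ss_tot                 # R^2 of predictor j on the others
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs


# Synthetic, hypothetical data: two strongly correlated dose-volume
# parameters and one covariate generated independently of them.
rng = np.random.default_rng(0)
n = 200
heart_v20 = rng.normal(30, 10, n)
heart_v30 = 0.8 * heart_v20 + rng.normal(0, 2, n)  # built to track heart_v20
age = rng.normal(65, 8, n)                         # unrelated to the dose variables

X = np.column_stack([heart_v20, heart_v30, age])
vifs = variance_inflation_factors(X)

for name, vif in zip(["heart_v20", "heart_v30", "age"], vifs):
    print(f"{name}: VIF = {vif:.1f}")
```

A VIF near 1 suggests a predictor shares little variance with the others. Values above roughly 5 to 10 are often treated as a warning sign, although that cutoff is a rule of thumb and not a formal test.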