Revisiting linear regression to test agreement in continuous predicted-observed datasets

Bibliographic Details
Published in Agricultural Systems Vol. 192; p. 103194
Main Authors Correndo, Adrian A., Hefley, Trevor J., Holzworth, Dean P., Ciampitti, Ignacio A.
Format Journal Article
Language English
Published Elsevier Ltd 01.08.2021
Summary: In agricultural research and related disciplines, using a scatter plot and a regression line to visually and quantitatively assess agreement between model predictions and observed values is a widely adopted approach, particularly within the simulation modeling community. However, the fitting, use, and interpretation of the linear model are still controversial in the literature. The overall goal of this research is to evaluate the usefulness of a symmetric regression line for testing agreement in predicted-observed datasets. The specific aims of this study are to: i) discuss the selection of a regression model to fit a line to the predicted-observed scatter, and ii) provide a geometric interpretation of the regression line, decomposing the prediction error into lack-of-accuracy and lack-of-precision components, using illustrative field crop datasets. This study tested and contrasted three alternative linear regression models (Ordinary Least Squares, OLS; Major Axis, MA; and Standardized Major Axis, SMA) in terms of assumptions, loss functions, parameter estimates, and model interpretation for the predicted-observed case. When the uncertainties of predictions and observations are unknown, the SMA is the most appropriate approach for fitting a symmetric line that describes the bivariate predicted-observed scatter. The SMA line serves as a reference to estimate a weighted difference between predictions and observations. Moreover, this symmetric regression can assist in the decomposition of the squared error into additive components related to lack of accuracy and lack of precision. In summary, the SMA regression tackles the axis orientation problem of the traditional OLS (y vs. x or x vs. y) and allows users to identify error sources that are meaningful to them. This work offers a novel and simple perspective on the use of linear regression to assess simulation model performance.
To assist potential users, we also provide a tutorial for computing the proposed assessment of agreement in R.
•This work provides a novel perspective on the use of linear regression to test model performance.
•For testing agreement in predicted-observed data, a linear regression model should satisfy symmetry.
•When the uncertainties of predictions and observations are unknown, standardized major axis regression is recommended.
•With standardized major axis regression, we can obtain error components related to both lack of accuracy and lack of precision.
•We offer a detailed tutorial to help users perform the proposed analysis in R.
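
The symmetry requirement and the error decomposition described in the summary can be illustrated with a minimal Python sketch. It is not the authors' R tutorial, and everything in it is an assumption made for illustration: the function names `fit_line` and `decompose_mse`, the made-up sample values, the placement of predictions on the x-axis, and the particular split of the mean squared error. The split used here, squared bias plus squared difference of standard deviations as "lack of accuracy" and 2·s_pred·s_obs·(1 − r) as "lack of precision", is an exact algebraic identity, but whether it matches the paper's definitions term by term is not confirmed by this record.

```python
import math


def _moments(x, y):
    """Means, population variances and covariance of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x) / n
    syy = sum((yi - my) ** 2 for yi in y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / n
    return mx, my, sxx, syy, sxy


def fit_line(x, y, method="sma"):
    """Return (intercept, slope) of y on x for OLS, MA or SMA regression."""
    mx, my, sxx, syy, sxy = _moments(x, y)
    if method == "ols":
        slope = sxy / sxx  # minimizes vertical residuals only
    elif method == "ma":
        d = syy - sxx  # minimizes perpendicular distances (needs sxy != 0)
        slope = (d + math.sqrt(d * d + 4 * sxy * sxy)) / (2 * sxy)
    elif method == "sma":
        # ratio of standard deviations, signed like the correlation
        slope = math.copysign(math.sqrt(syy / sxx), sxy)
    else:
        raise ValueError(f"unknown method: {method}")
    return my - slope * mx, slope


def decompose_mse(observed, predicted):
    """Return (mse, lack_of_accuracy, lack_of_precision); the last two sum to mse."""
    mp, mo, spp, soo, spo = _moments(predicted, observed)
    sp, so = math.sqrt(spp), math.sqrt(soo)
    r = spo / (sp * so)
    n = len(observed)
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    lack_of_accuracy = (mo - mp) ** 2 + (so - sp) ** 2  # bias and scale mismatch
    lack_of_precision = 2 * sp * so * (1 - r)  # scatter around the symmetric line
    return mse, lack_of_accuracy, lack_of_precision


# Hypothetical values for illustration only; they are not taken from the paper.
predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
observed = [1.2, 1.9, 3.4, 3.8, 5.3]

for method in ("ols", "ma", "sma"):
    _, slope_yx = fit_line(predicted, observed, method)
    _, slope_xy = fit_line(observed, predicted, method)
    # For a symmetric line, swapping the axes inverts the slope (product equals 1).
    print(method, slope_yx * slope_xy)

print(decompose_mse(observed, predicted))
```

In this sketch, swapping the axes gives reciprocal slopes for MA and SMA, whereas the product of the two OLS slopes equals r² and is below 1 whenever the correlation is imperfect. That is the axis orientation problem the summary refers to.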
ISSN: 0308-521X, 1873-2267
DOI: 10.1016/j.agsy.2021.103194