Hyperspectral remote sensing of plant biochemistry using Bayesian model averaging with variable and band selection

Bibliographic Details
Published in Remote Sensing of Environment Vol. 132; pp. 102-119
Main Authors Zhao, Kaiguang; Valle, Denis; Popescu, Sorin; Zhang, Xuesong; Mallick, Bani
Format Journal Article
Language English
Published New York, NY Elsevier Inc 15.05.2013
Elsevier
Summary: Model specification remains challenging in spectroscopy of plant biochemistry, as exemplified by the availability of various spectral indices or band combinations for estimating the same biochemical. This lack of consensus in model choice across applications argues for a paradigm shift in hyperspectral methods to address model uncertainty and misspecification. We demonstrated one such method using Bayesian model averaging (BMA), which performs variable/band selection and quantifies the relative merits of many candidate models to synthesize a weighted average model with improved predictive performance. The utility of BMA was examined using a portfolio of 27 foliage spectral–chemical datasets representing over 80 species across the globe to estimate multiple biochemical properties, including nitrogen, hydrogen, carbon, cellulose, lignin, chlorophyll (a or b), carotenoid, polar and nonpolar extractives, leaf mass per area, and equivalent water thickness. We also compared BMA with partial least squares (PLS) and stepwise multiple regression (SMR). Results showed that all the biochemicals except carotenoid were accurately estimated from hyperspectral data with R² values > 0.80. Compared to PLS and SMR, BMA substantially reduced overfitting and enhanced model generalization; BMA also yielded error estimates that better indicated the true uncertainties in predictions, when evaluated using a statistic called "prediction interval coverage probability". The relative band importance, which was quantified by band selection probability, differed markedly between BMA and SMR, cautioning against the use of SMR for band selection. Computationally, model calibration with datasets of moderate sizes (>100) was faster for BMA via a hybrid reversible-jump Markov chain Monte Carlo sampler than for PLS via literal optimization of a cross-validation criterion. Our BMA scheme also provides a generic hierarchical Bayesian framework to assimilate prior knowledge of diverse forms, as illustrated by its use to account for nonlinearity in spectral–chemical relationships. We emphasize that BMA is a competitive, paradigm-shifting alternative to conventional statistical methods and it will find wide use as the virtue of Bayesian inference is increasingly appreciated by the remote sensing community.
► Identify useful bands via Bayesian variable selection to predict biochemicals.
► Address model uncertainty and misspecification by Bayesian model averaging (BMA).
► BMA overfits less and generalizes better than PLS/stepwise regression.
► Carotenoid was estimated less accurately than other common biochemicals.
► BMA is a major paradigm shift for general remote sensing statistical inversion.
ISSN: 0034-4257, 1879-0704
DOI: 10.1016/j.rse.2012.12.026
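
Illustrative sketch (not from the article): the summary describes BMA as weighting many candidate band subsets, reporting band selection probabilities, and checking uncertainty with the prediction interval coverage probability. The Python below is a minimal toy version of those three ideas under loud assumptions: it enumerates small band subsets and approximates posterior model weights with BIC, whereas the authors use a hybrid reversible-jump sampler; the data are synthetic, and the names `bma_bic`, `bma_predict`, and `picp` are hypothetical.

```python
import itertools
import numpy as np


def fit_ols(X, y):
    """Least-squares fit with an intercept; returns coefficients and RSS."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    return beta, rss


def bma_bic(X, y, max_size=3):
    """Enumerate band subsets up to max_size and weight them by BIC.

    Assumption: exp(-BIC/2) stands in for the posterior model probability.
    This replaces the MCMC sampler described in the summary.
    """
    n, p = X.shape
    models = []
    for k in range(1, max_size + 1):
        for subset in itertools.combinations(range(p), k):
            cols = list(subset)
            beta, rss = fit_ols(X[:, cols], y)
            bic = n * np.log(rss / n) + (k + 1) * np.log(n)
            models.append((cols, beta, bic))
    bics = np.array([m[2] for m in models])
    weights = np.exp(-0.5 * (bics - bics.min()))  # shift for numerical stability
    weights /= weights.sum()
    # Band selection probability: total weight of the models containing a band.
    inclusion = np.zeros(p)
    for (cols, _, _), w in zip(models, weights):
        inclusion[cols] += w
    return models, weights, inclusion


def bma_predict(models, weights, X_new):
    """Weighted average of the candidate models' predictions."""
    pred = np.zeros(X_new.shape[0])
    for (cols, beta, _), w in zip(models, weights):
        A = np.column_stack([np.ones(X_new.shape[0]), X_new[:, cols]])
        pred += w * (A @ beta)
    return pred


def picp(y_true, lower, upper):
    """Fraction of observations that fall inside their prediction intervals."""
    y_true, lower, upper = map(np.asarray, (y_true, lower, upper))
    return float(np.mean((y_true >= lower) & (y_true <= upper)))


# Synthetic example: 6 hypothetical "bands", only bands 1 and 4 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 6))
y = 2.0 * X[:, 1] - 1.5 * X[:, 4] + rng.normal(scale=0.3, size=120)

models, weights, inclusion = bma_bic(X, y)
y_hat = bma_predict(models, weights, X)
print("band selection probabilities:", np.round(inclusion, 3))
```

In this toy setting the two informative bands should receive selection probabilities near 1. A PICP close to the nominal level of the intervals (for example, 0.95 for 95% intervals) suggests that the stated uncertainty is realistic.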