Evaluation of statistical and machine learning models using satellite data to estimate aboveground biomass: A study in Vietnam Tropical Forests

Bibliographic Details
Published in Forest science and technology Vol. 20; no. 4; pp. 370-382
Main Authors Nguyen, Thuy Phuong; Nguyen, Phuc Khoa; Nguyen, Huu Ngu; Tran, Thanh Duc; Pham, Gia Tung; Le, Thai Hung; Le, Dinh Huy; Nguyen, Trung Hai; Nguyen, Van Binh
Format Journal Article
Language English
Published Seoul Taylor & Francis 01.10.2024
Taylor & Francis Ltd
Taylor & Francis Group
한국산림과학회
ISSN 2158-0103
2158-0715
DOI 10.1080/21580103.2024.2409211

Summary: The combination of machine learning models with satellite imagery is becoming a popular data-modeling tool for biomass prediction, supporting land cover management. This study aims to select the most suitable model to estimate tropical forest aboveground biomass in Vietnam, helping to manage and monitor changes in biomass at regional and local scales. The study identified the optimal model for estimating forest aboveground biomass and minimized the number of input variables while maintaining satisfactory model performance. A total of 59 input variables derived from satellite data, including topography, texture features, and vegetation indices, were used in four non-parametric algorithms, Artificial Neural Networks (ANN), Support Vector Machine (SVM), Random Forest (RF), and Extreme Gradient Boosting (XGBoost), and in one conventional parametric model, Multiple Linear Regression (MLR), to predict biomass and evaluate changes in aboveground biomass over 10 years in two tropical forests in Vietnam. The results indicated that all models had good estimation performance, with R² ranging from 0.615 to 0.754. For RF, MLR, and XGBoost, vegetation indices contributed the highest model weights, accounting for 77.71% to 92.48%. For ANN and SVM, textural and topographic features accounted for the majority of the model weights (73.74% to 96.36%). The RF model performed best both with 59 variables (R² = 0.754, MAE = 78.5 Mg·ha⁻¹, and %RMSE = 13.57%) and with ten variables (R² = 0.745, MAE = 85.8 Mg·ha⁻¹, and %RMSE = 16.17%). The biomass map produced using RF with ten variables achieved a good degree of fitting of 0.76, so it was suitable for managing and monitoring forest biomass in Vietnam. The results indicated a sharp decrease in the areas of dense and very dense forests from 2013 to 2021 and a gradual increase in 2023.
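
The summary reports three accuracy measures (R², MAE, and %RMSE) without stating their formulas. The following is a minimal sketch of how such measures are commonly computed, not the authors' implementation. It assumes that R² is the coefficient of determination and that %RMSE is the root mean square error divided by the mean observed biomass; the record does not confirm either definition. The function name `regression_metrics` and the sample plot values are hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Return R^2, MAE, RMSE and %RMSE for paired biomass values (Mg/ha).

    Assumptions (not confirmed by the source record):
    - R^2 is the coefficient of determination, 1 - SS_res / SS_tot.
    - %RMSE is RMSE expressed as a percentage of the mean observed value.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")

    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]

    ss_res = sum(r * r for r in residuals)                # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)   # total sum of squares
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")

    rmse = math.sqrt(ss_res / n)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(r) for r in residuals) / n,
        "rmse": rmse,
        "pct_rmse": 100.0 * rmse / mean_obs,
    }


# Hypothetical field-plot biomass values, for illustration only.
sample_observed = [100.0, 200.0, 300.0, 400.0]
sample_predicted = [110.0, 190.0, 310.0, 390.0]
print(regression_metrics(sample_observed, sample_predicted))
```

Under these definitions, MAE keeps the units of the biomass data (Mg·ha⁻¹), while %RMSE is unitless, which makes it easier to compare across forests with different mean biomass.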
https://www.tandfonline.com/doi/full/10.1080/21580103.2024.2409211