Ensemble Modelling of Skipjack Tuna (Katsuwonus pelamis) Habitats in the Western North Pacific Using Satellite Remotely Sensed Data; a Comparative Analysis Using Machine-Learning Models

To examine skipjack tuna’s habitat utilization in the western North Pacific (WNP) we used an ensemble modelling approach, which applied a fisher- derived presence-only dataset and three satellite remote-sensing predictor variables. The skipjack tuna data were compiled from daily point fishing data i...

Full description

Saved in:
Bibliographic Details
Published inRemote sensing (Basel, Switzerland) Vol. 12; no. 16; p. 2591
Main Authors Mugo, Robinson, Saitoh, Sei-Ichi
Format Journal Article
LanguageEnglish
Published Basel MDPI AG 01.08.2020
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:To examine skipjack tuna’s habitat utilization in the western North Pacific (WNP) we used an ensemble modelling approach, which applied a fisher- derived presence-only dataset and three satellite remote-sensing predictor variables. The skipjack tuna data were compiled from daily point fishing data into monthly composites and re-gridded into a quarter degree resolution to match the environmental predictor variables, the sea surface temperature (SST), sea surface chlorophyll-a (SSC) and sea surface height anomalies (SSHA), which were also processed at quarter degree spatial resolution. Using the sdm package operated in RStudio software, we constructed habitat models over a 9-month period, from March to November 2004, using 17 algorithms, with a 70:30 split of training and test data, with bootstrapping and 10 runs as parameter settings for our models. Model performance evaluation was conducted using the area under the curve (AUC) of the receiver operating characteristic (ROC), the point biserial correlation coefficient (COR), the true skill statistic (TSS) and Cohen’s kappa (k) metrics. We analyzed the response curves for each predictor variable per algorithm, the variable importance information and the ROC plots. Ensemble predictions of habitats were weighted with the TSS metric. Model performance varied across various algorithms, with the Support Vector Machines (SVM), Boosted Regression Trees (BRT), Random Forests (RF), Multivariate Adaptive Regression Splines (MARS), Generalized Additive Models (GAM), Classification and Regression Trees (CART), Multi-Layer Perceptron (MLP), Recursive Partitioning and Regression Trees (RPART), and Maximum Entropy (MAXENT), showing consistently high performance than other algorithms, while the Flexible Discriminant Analysis (FDA), Mixture Discriminant Analysis (MDA), Bioclim (BIOC), Domain (DOM), Maxlike (MAXL), Mahalanobis Distance (MAHA) and Radial Basis Function (RBF) had lower performance. We found inter-algorithm variations in predictor variable responses. We conclude that the multi-algorithm modelling approach enabled us to assess the variability in algorithm performance, hence a data driven basis for building the ensemble model. Given the inter-algorithm variations observed, the ensemble prediction maps indicated a better habitat utilization map of skipjack tuna than would have been achieved by a single algorithm.
ISSN:2072-4292
2072-4292
DOI:10.3390/rs12162591