Assessing methods for multiple imputation of systematic missing data in marine fisheries time series with a new validation algorithm

Bibliographic Details
Published in Aquaculture and Fisheries Vol. 8; no. 5; pp. 587-599
Main Authors Benavides, Iván F., Santacruz, Marlon, Romero-Leiton, Jhoana P., Barreto, Carlos, Selvaraj, John Josephraj
Format Journal Article
Language English
Published Elsevier B.V 01.09.2023
KeAi Communications Co., Ltd
Summary: Time series from fisheries often contain many missing values. This is a severe limitation that prevents using the data for research on population dynamics, stock assessment, forecasting, and, hence, decision-making around marine resources. Several methods have been proposed to impute missing data in univariate time series. However, their performance depends not only on the amount of missing data but also on the structure of the data. This study compares the performance of twelve imputation methods on time series of marine fishery landings for six species in the Colombian Pacific Ocean. Unlike other studies, we validate the precision of the imputations on the same target time series that contain the missing data, using the Known Sub-Sequence Algorithm (KSSA), a novel validation approach that simulates missing data in known sub-sequences of the target time series. The results show that the best imputation methods are Seasonal Decomposition with Kalman filters and Structural Models with Kalman filters fitted by maximum likelihood. The results also show that validating imputation methods on time series other than the target series leads to the wrong choice of imputation method. It is noteworthy that these methods and the validation framework are mainly suited to time series with a non-random distribution of missing data, that is, missing data produced systematically in chunks or clusters with predictable frequency, which are common in marine sciences.
ISSN:2468-550X
DOI:10.1016/j.aaf.2021.12.013
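
Illustrative sketch (not part of the catalog record): the Summary describes validating imputation methods on the target series itself by simulating gaps inside sub-sequences whose values are known. The Python code below is a minimal sketch of that general idea only, not the authors' KSSA implementation. The function names, the synthetic series, the gap length, and the three simple imputers (linear interpolation, mean, last observation carried forward) are all assumptions made for illustration; the Kalman-based methods named in the Summary are not reproduced here.

```python
import numpy as np


def impute_linear(y):
    """Fill NaNs by linear interpolation between observed neighbours."""
    y = np.asarray(y, dtype=float)
    idx = np.arange(len(y))
    obs = ~np.isnan(y)
    return np.interp(idx, idx[obs], y[obs])


def impute_mean(y):
    """Fill NaNs with the mean of the observed values."""
    y = np.asarray(y, dtype=float)
    out = y.copy()
    out[np.isnan(out)] = np.nanmean(y)
    return out


def impute_locf(y):
    """Last observation carried forward; leading gaps take the first observed value."""
    y = np.asarray(y, dtype=float)
    obs = ~np.isnan(y)
    last = np.maximum.accumulate(np.where(obs, np.arange(len(y)), -1))
    last[last < 0] = np.argmax(obs)  # index of the first observed value
    return y[last]


def validate_on_target(y, methods, gap_len=3, n_trials=200, seed=0):
    """Score imputers on simulated gaps placed inside known parts of y.

    Hypothetical helper: each trial hides one fully observed window of
    length gap_len, imputes the series, and records the RMSE on that
    window. Returns the mean RMSE per method.
    """
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    obs = ~np.isnan(y)
    # Candidate windows: fully observed, not touching the series endpoints.
    starts = [s for s in range(1, len(y) - gap_len) if obs[s:s + gap_len].all()]
    if not starts:
        raise ValueError("no fully observed window of the requested length")
    errors = {name: [] for name in methods}
    for _ in range(n_trials):
        s = int(rng.choice(starts))
        masked = y.copy()
        masked[s:s + gap_len] = np.nan  # simulated gap; real gaps stay missing
        for name, impute in methods.items():
            filled = impute(masked)
            diff = filled[s:s + gap_len] - y[s:s + gap_len]
            errors[name].append(np.sqrt(np.mean(diff ** 2)))
    return {name: float(np.mean(e)) for name, e in errors.items()}


# Synthetic monthly series (assumed data): seasonal signal with
# systematic missing chunks recurring every 24 steps.
t = np.arange(120)
y = 50 + 10 * np.sin(2 * np.pi * t / 12)
y[(t % 24) >= 20] = np.nan

methods = {"linear": impute_linear, "mean": impute_mean, "locf": impute_locf}
scores = validate_on_target(y, methods)
for name, rmse in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name}: mean RMSE = {rmse:.2f}")
```

Because the simulated gaps are scored against values that were actually observed in the target series, the ranking reflects how each method behaves on that series' own structure, which is the point the Summary makes about not validating on a different series.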