Utilizing the random forest algorithm and interpretable machine learning to inform post-stratification of commercial fisheries data

Bibliographic Details
Published in	Fisheries research Vol. 281; p. 107253
Main Authors	Gasper, Jason; Cahalan, Jennifer
Format	Journal Article
Language	English
Published	Elsevier B.V. 01.01.2025
Subjects	Alaska; Alaska Fisheries; algorithms; Catch Estimation; demersal fish; fisheries; Hippoglossus stenolepis; Machine Learning; management systems; Post-Stratification; Random Forests; species; variance
Summary:	Federal groundfish fisheries off Alaska are managed using near-real-time estimates of catch generated from a combination of industry-reported information and data from the North Pacific Groundfish and Pacific Halibut Observer Program, which deploys observers and Electronic Monitoring systems into the fisheries to sample catch. Catch is carefully monitored against limits that are based on biological constraints or quota allocations, or that are set to control discard amounts. However, estimates of fish discarded at sea (not retained for sale) can have large variance due to factors such as fishing behavior, species-specific vulnerability to fishing, and sample sizes. Post-stratification is a statistical approach widely used to improve the precision of catch estimates within a population because it controls for variance without relying on covariates known prior to sampling, which can be costly to collect or may be unknown. Strategic use of post-stratification may increase the precision of estimates when compared to designs without post-stratification. However, choosing fishery characteristics to define post-strata can be difficult because of the high dimensionality of fishery data and the complexity of creating post-strata that are optimized for multiple species. We propose a novel application of random forest classification and design-based estimation to explore multivariate post-stratification designs. These designs were evaluated by selecting the best-performing trees from an ensemble using design-based estimation metrics. Results showed a large improvement in the precision of estimates when the best-performing trees were used to label data and create post-strata. Moreover, through the use of subject matter expertise to evaluate the best-performing trees, this method identified combinations of covariates that were not considered in previous estimation designs, and it allows for exploration and testing of alternative post-strata designs that could be implemented in a management system.
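The precision gain that the summary attributes to post-stratification can be illustrated with a minimal sketch. It is not the authors' estimator: the haul labels, discard weights, and stratum sizes below are invented for illustration, the function names are hypothetical, and the variance formula is the common textbook approximation that conditions on the realized per-stratum sample sizes and assumes the population size of each post-stratum is known.

```python
from statistics import mean, variance

# Hypothetical sampled hauls: (post-stratum label, discard weight).
# Labels could come from any rule applied after sampling, e.g. a tree's leaf.
sample = [("A", 10.0), ("A", 12.0), ("A", 11.0),
          ("B", 1.0), ("B", 2.0), ("B", 3.0)]

# Assumed known number of hauls in each post-stratum (illustrative values).
stratum_sizes = {"A": 30, "B": 30}


def unstratified_total(sample, population_size):
    """Expansion estimate of the total and its variance, ignoring labels."""
    values = [value for _, value in sample]
    n = len(values)
    total = population_size * mean(values)
    var = population_size ** 2 * (1 - n / population_size) * variance(values) / n
    return total, var


def post_stratified_total(sample, stratum_sizes):
    """Post-stratified total; variance is conditional on realized n_h."""
    total = 0.0
    var = 0.0
    for label, size in stratum_sizes.items():
        values = [value for lab, value in sample if lab == label]
        n_h = len(values)
        if n_h < 2:
            raise ValueError(f"post-stratum {label!r} needs at least 2 samples")
        total += size * mean(values)
        var += size ** 2 * (1 - n_h / size) * variance(values) / n_h
    return total, var


srs_est, srs_var = unstratified_total(sample, sum(stratum_sizes.values()))
ps_est, ps_var = post_stratified_total(sample, stratum_sizes)
print(f"unstratified:    total={srs_est:.1f}, variance={srs_var:.1f}")
print(f"post-stratified: total={ps_est:.1f}, variance={ps_var:.1f}")
```

In this toy data the two post-strata have very different mean discards but little within-stratum spread, so both estimators give the same total while the post-stratified variance is much smaller. A labeling that did not separate high-discard from low-discard hauls would show little or no gain, which is why the choice of post-strata matters.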
ISSN:	0165-7836
DOI:	10.1016/j.fishres.2024.107253