Making sense of fossils and artefacts: a review of best practices for the design of a successful workflow for machine learning-assisted citizen science projects

Bibliographic Details
Published in	PeerJ (San Francisco, CA) Vol. 13; p. e18927
Main Authors	Eijkelboom, Isaak; Schulp, Anne S; Amkreutz, Luc; Verheul, Dylan; Verschoof-van der Vaart, Wouter; van der Vaart-Verschoof, Sasja; Hogeweg, Laurens; Brunink, Django; Mol, Dick; Peeters, Hans; Wesselingh, Frank
Format	Journal Article
Language	English
Published	United States: PeerJ. Ltd; PeerJ Inc, 13.02.2025
Subjects	AI; Archaeology; Archaeology - methods; Best practices; Biodiversity; Biological diversity; Citizen science; Citizen Science - methods; Citizen scientists; Data Mining and Machine Learning; Fossils; Geospatial data; Human-Computer Interaction; Humans; Information management; Machine Learning; Netherlands; North Sea; Palaeontology; Paleontology; Paleontology - methods; Project design; Scientists; Workflow

More Information
Summary:	Historically, the extensive involvement of citizen scientists in palaeontology and archaeology has resulted in many discoveries and insights. More recently, machine learning has emerged as a broadly applicable tool for analysing large datasets of fossils and artefacts. In the digital age, citizen science (CS) and machine learning (ML) prove to be mutually beneficial, and a combined CS-ML approach is increasingly successful in areas such as biodiversity research. Ever-dropping computational costs and the smartphone revolution have put ML tools in the hands of citizen scientists with the potential to generate high-quality data, create new insights from large datasets and elevate public engagement. However, without an integrated approach, new CS-ML projects may not realise the full scientific and public engagement potential. Furthermore, object-based data gathering of fossils and artefacts comes with different requirements for successful CS-ML approaches than observation-based data gathering in biodiversity monitoring. In this review we investigate best practices and common pitfalls in this new interdisciplinary field in order to formulate a workflow to guide future palaeontological and archaeological projects. Our CS-ML workflow is subdivided into four project phases: (I) preparation, (II) execution, (III) implementation and (IV) reiteration. To reach the objectives and manage the challenges for different subject domains (CS tasks, ML development, research, stakeholder engagement and app/infrastructure development), tasks are formulated and allocated to different roles in the project. We also provide an outline for an integrated online CS platform which will help reach a project's full scientific and public engagement potential.
Finally, to illustrate the implementation of our CS-ML approach in practice and showcase differences with more commonly available biodiversity CS-ML approaches, we discuss the LegaSea project in which fossils and artefacts from sand nourishments in the western Netherlands are studied.
ISSN:	2167-8359
DOI:	10.7717/peerj.18927