Making sense of fossils and artefacts: a review of best practices for the design of a successful workflow for machine learning-assisted citizen science projects

Bibliographic Details
Published in PeerJ (San Francisco, CA) Vol. 13; p. e18927
Main Authors Eijkelboom, Isaak; Schulp, Anne S; Amkreutz, Luc; Verheul, Dylan; Verschoof-van der Vaart, Wouter; van der Vaart-Verschoof, Sasja; Hogeweg, Laurens; Brunink, Django; Mol, Dick; Peeters, Hans; Wesselingh, Frank
Format Journal Article
Language English
Published United States PeerJ. Ltd 13.02.2025
PeerJ Inc
Abstract Historically, the extensive involvement of citizen scientists in palaeontology and archaeology has resulted in many discoveries and insights. More recently, machine learning has emerged as a broadly applicable tool for analysing large datasets of fossils and artefacts. In the digital age, citizen science (CS) and machine learning (ML) prove to be mutually beneficial, and a combined CS-ML approach is increasingly successful in areas such as biodiversity research. Ever-dropping computational costs and the smartphone revolution have put ML tools in the hands of citizen scientists, with the potential to generate high-quality data, create new insights from large datasets and elevate public engagement. However, without an integrated approach, new CS-ML projects may not realise their full scientific and public engagement potential. Furthermore, object-based data gathering of fossils and artefacts comes with different requirements for successful CS-ML approaches than observation-based data gathering in biodiversity monitoring. In this review we investigate best practices and common pitfalls in this new interdisciplinary field in order to formulate a workflow to guide future palaeontological and archaeological projects. Our CS-ML workflow is subdivided into four project phases: (I) preparation, (II) execution, (III) implementation and (IV) reiteration. To reach the objectives and manage the challenges for different subject domains (CS tasks, ML development, research, stakeholder engagement and app/infrastructure development), tasks are formulated and allocated to different roles in the project. We also provide an outline for an integrated online CS platform which will help reach a project's full scientific and public engagement potential. Finally, to illustrate the implementation of our CS-ML approach in practice and showcase differences from more commonly available biodiversity CS-ML approaches, we discuss the LegaSea project, in which fossils and artefacts from sand nourishments in the western Netherlands are studied.
Audience Academic
Author Hogeweg, Laurens
Peeters, Hans
Eijkelboom, Isaak
Verheul, Dylan
Schulp, Anne S
Brunink, Django
Amkreutz, Luc
Mol, Dick
Wesselingh, Frank
Verschoof-van der Vaart, Wouter
van der Vaart-Verschoof, Sasja
Author_xml – sequence: 1
  givenname: Isaak
  surname: Eijkelboom
  fullname: Eijkelboom, Isaak
  organization: Department of Earth Sciences, Utrecht University, Utrecht, Netherlands
– sequence: 2
  givenname: Anne S
  orcidid: 0000-0001-9389-1540
  surname: Schulp
  fullname: Schulp, Anne S
  organization: Department of Earth Sciences, Utrecht University, Utrecht, Netherlands
– sequence: 3
  givenname: Luc
  surname: Amkreutz
  fullname: Amkreutz, Luc
  organization: National Museum of Antiquities, Leiden, Netherlands
– sequence: 4
  givenname: Dylan
  orcidid: 0000-0001-7817-0200
  surname: Verheul
  fullname: Verheul, Dylan
  organization: Observation International, Aarlanderveen, Netherlands
– sequence: 5
  givenname: Wouter
  orcidid: 0000-0002-1053-3009
  surname: Verschoof-van der Vaart
  fullname: Verschoof-van der Vaart, Wouter
  organization: Netherlands Forensic Institute, Den Haag, Netherlands
– sequence: 6
  givenname: Sasja
  surname: van der Vaart-Verschoof
  fullname: van der Vaart-Verschoof, Sasja
  organization: National Museum of Antiquities, Leiden, Netherlands
– sequence: 7
  givenname: Laurens
  orcidid: 0000-0001-6874-5728
  surname: Hogeweg
  fullname: Hogeweg, Laurens
  organization: Naturalis Biodiversity Center, Leiden, Netherlands
– sequence: 8
  givenname: Django
  orcidid: 0000-0002-0731-6636
  surname: Brunink
  fullname: Brunink, Django
  organization: Naturalis Biodiversity Center, Leiden, Netherlands
– sequence: 9
  givenname: Dick
  surname: Mol
  fullname: Mol, Dick
  organization: Natural History Museum Rotterdam, Rotterdam, Netherlands
– sequence: 10
  givenname: Hans
  surname: Peeters
  fullname: Peeters, Hans
  organization: Groningen Institute of Archaeology, University of Groningen, Groningen, Netherlands
– sequence: 11
  givenname: Frank
  surname: Wesselingh
  fullname: Wesselingh, Frank
  organization: Faculty of Science and Engineering, University of Maastricht, Maastricht, Netherlands
ContentType Journal Article
Copyright 2025 Eijkelboom et al.
COPYRIGHT 2025 PeerJ. Ltd.
DOI 10.7717/peerj.18927
Discipline Medicine
EISSN 2167-8359
ExternalDocumentID oai_doaj_org_article_e2b344bc681d425e8bf2fbd9e25500c3
PMC11830368
A827474116
39959835
Genre Journal Article
Review
GeographicLocations Netherlands
North Sea
GeographicLocations_xml – name: Netherlands
– name: North Sea
GrantInformation_xml – fundername: Open Competitie ENW-M
  grantid: OCENW.M20.360
– fundername: Dutch Research Council (NWO)
ISSN 2167-8359
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Keywords AI
Archaeology
Citizen science
Palaeontology
Project design
Language English
License 2025 Eijkelboom et al.
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
ORCID 0000-0002-1053-3009
0000-0001-9389-1540
0000-0002-0731-6636
0000-0001-6874-5728
0000-0001-7817-0200
OpenAccessLink https://doaj.org/article/e2b344bc681d425e8bf2fbd9e25500c3
PMID 39959835
PublicationCentury 2000
PublicationDate 2025-02-13
PublicationDateYYYYMMDD 2025-02-13
PublicationDate_xml – month: 02
  year: 2025
  text: 2025-02-13
  day: 13
PublicationDecade 2020
PublicationPlace United States
PublicationPlace_xml – name: United States
– name: San Diego, USA
PublicationTitle PeerJ (San Francisco, CA)
PublicationTitleAlternate PeerJ
PublicationYear 2025
Publisher PeerJ. Ltd
PeerJ Inc
Publisher_xml – name: PeerJ. Ltd
– name: PeerJ Inc
StartPage e18927
SubjectTerms Archaeology
Archaeology - methods
Best practices
Biodiversity
Biological diversity
Citizen science
Citizen Science - methods
Citizen scientists
Data Mining and Machine Learning
Fossils
Geospatial data
Human-Computer Interaction
Humans
Information management
Machine Learning
Palaeontology
Paleontology
Paleontology - methods
Project design
Scientists
Workflow
Title Making sense of fossils and artefacts: a review of best practices for the design of a successful workflow for machine learning-assisted citizen science projects
URI https://www.ncbi.nlm.nih.gov/pubmed/39959835
https://www.proquest.com/docview/3167721533
https://pubmed.ncbi.nlm.nih.gov/PMC11830368
https://doaj.org/article/e2b344bc681d425e8bf2fbd9e25500c3
Volume 13