Making sense of fossils and artefacts: a review of best practices for the design of a successful workflow for machine learning-assisted citizen science projects
Published in | PeerJ (San Francisco, CA) Vol. 13; p. e18927 |
---|---|
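Main Authors | Eijkelboom, Isaak; Schulp, Anne S; Amkreutz, Luc; Verheul, Dylan; Verschoof-van der Vaart, Wouter; van der Vaart-Verschoof, Sasja; Hogeweg, Laurens; Brunink, Django; Mol, Dick; Peeters, Hans; Wesselingh, Frank |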
Format | Journal Article |
Language | English |
Published | United States; PeerJ. Ltd; 13.02.2025; PeerJ Inc |
Abstract | Historically, the extensive involvement of citizen scientists in palaeontology and archaeology has resulted in many discoveries and insights. More recently, machine learning has emerged as a broadly applicable tool for analysing large datasets of fossils and artefacts. In the digital age, citizen science (CS) and machine learning (ML) prove to be mutually beneficial, and a combined CS-ML approach is increasingly successful in areas such as biodiversity research. Ever-dropping computational costs and the smartphone revolution have put ML tools in the hands of citizen scientists with the potential to generate high-quality data, create new insights from large datasets and elevate public engagement. However, without an integrated approach, new CS-ML projects may not realise the full scientific and public engagement potential. Furthermore, object-based data gathering of fossils and artefacts comes with different requirements for successful CS-ML approaches than observation-based data gathering in biodiversity monitoring. In this review we investigate best practices and common pitfalls in this new interdisciplinary field in order to formulate a workflow to guide future palaeontological and archaeological projects. Our CS-ML workflow is subdivided in four project phases: (I) preparation, (II) execution, (III) implementation and (IV) reiteration. To reach the objectives and manage the challenges for different subject domains (CS tasks, ML development, research, stakeholder engagement and app/infrastructure development), tasks are formulated and allocated to different roles in the project. We also provide an outline for an integrated online CS platform which will help reach a project's full scientific and public engagement potential. Finally, to illustrate the implementation of our CS-ML approach in practice and showcase differences with more commonly available biodiversity CS-ML approaches, we discuss the LegaSea project in which fossils and artefacts from sand nourishments in the western Netherlands are studied. |
---|---|
Audience | Academic |
Author | Hogeweg, Laurens; Peeters, Hans; Eijkelboom, Isaak; Verheul, Dylan; Schulp, Anne S; Brunink, Django; Amkreutz, Luc; Mol, Dick; Wesselingh, Frank; Verschoof-van der Vaart, Wouter; van der Vaart-Verschoof, Sasja |
Author_xml | – sequence: 1 givenname: Isaak surname: Eijkelboom fullname: Eijkelboom, Isaak organization: Department of Earth Sciences, Utrecht University, Utrecht, Netherlands – sequence: 2 givenname: Anne S orcidid: 0000-0001-9389-1540 surname: Schulp fullname: Schulp, Anne S organization: Department of Earth Sciences, Utrecht University, Utrecht, Netherlands – sequence: 3 givenname: Luc surname: Amkreutz fullname: Amkreutz, Luc organization: National Museum of Antiquities, Leiden, Netherlands – sequence: 4 givenname: Dylan orcidid: 0000-0001-7817-0200 surname: Verheul fullname: Verheul, Dylan organization: Observation International, Aarlanderveen, Netherlands – sequence: 5 givenname: Wouter orcidid: 0000-0002-1053-3009 surname: Verschoof-van der Vaart fullname: Verschoof-van der Vaart, Wouter organization: Netherlands Forensic Institute, Den Haag, Netherlands – sequence: 6 givenname: Sasja surname: van der Vaart-Verschoof fullname: van der Vaart-Verschoof, Sasja organization: National Museum of Antiquities, Leiden, Netherlands – sequence: 7 givenname: Laurens orcidid: 0000-0001-6874-5728 surname: Hogeweg fullname: Hogeweg, Laurens organization: Naturalis Biodiversity Center, Leiden, Netherlands – sequence: 8 givenname: Django orcidid: 0000-0002-0731-6636 surname: Brunink fullname: Brunink, Django organization: Naturalis Biodiversity Center, Leiden, Netherlands – sequence: 9 givenname: Dick surname: Mol fullname: Mol, Dick organization: Natural History Museum Rotterdam, Rotterdam, Netherlands – sequence: 10 givenname: Hans surname: Peeters fullname: Peeters, Hans organization: Groningen Institute of Archaeology, University of Groningen, Groningen, Netherlands – sequence: 11 givenname: Frank surname: Wesselingh fullname: Wesselingh, Frank organization: Faculty of Science and Engineering, University of Maastricht, Maastricht, Netherlands |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/39959835$$D View this record in MEDLINE/PubMed |
ContentType | Journal Article |
Copyright | 2025 Eijkelboom et al.; COPYRIGHT 2025 PeerJ. Ltd. |
Copyright_xml | – notice: 2025 Eijkelboom et al. – notice: COPYRIGHT 2025 PeerJ. Ltd. – notice: 2025 Eijkelboom et al. 2025 Eijkelboom et al. |
DBID | CGR CUY CVF ECM EIF NPM 7X8 5PM DOA |
DOI | 10.7717/peerj.18927 |
DatabaseName | Medline; MEDLINE; MEDLINE (Ovid); MEDLINE; MEDLINE; PubMed; MEDLINE - Academic; PubMed Central (Full Participant titles); DOAJ Directory of Open Access Journals |
DatabaseTitle | MEDLINE; Medline Complete; MEDLINE with Full Text; PubMed; MEDLINE (Ovid); MEDLINE - Academic |
DatabaseTitleList | MEDLINE; MEDLINE - Academic |
Database_xml | – sequence: 1 dbid: DOA name: DOAJ Directory of Open Access Journals url: https://www.doaj.org/ sourceTypes: Open Website – sequence: 2 dbid: NPM name: PubMed url: https://proxy.k.utb.cz/login?url=http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed sourceTypes: Index Database – sequence: 3 dbid: EIF name: MEDLINE url: https://proxy.k.utb.cz/login?url=https://www.webofscience.com/wos/medline/basic-search sourceTypes: Index Database |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Medicine |
EISSN | 2167-8359 |
ExternalDocumentID | oai_doaj_org_article_e2b344bc681d425e8bf2fbd9e25500c3 PMC11830368 A827474116 39959835 |
Genre | Journal Article Review |
GeographicLocations | Netherlands; North Sea |
GeographicLocations_xml | – name: Netherlands – name: North Sea |
GrantInformation_xml | – fundername: Open Competitie ENW-M grantid: OCENW.M20.360 – fundername: Dutch Research Council (NWO) |
GroupedDBID | 53G 5VS 88I 8FE 8FH AAFWJ ABUWG ADBBV ADRAZ AENEX AFKRA AFPKN ALMA_UNASSIGNED_HOLDINGS AOIJS AZQEC BAWUL BBNVY BCNDV BENPR BHPHI BPHCQ CCPQU CGR CUY CVF DIK DWQXO ECGQY ECM EIF GNUQQ GROUPED_DOAJ GX1 H13 HCIFZ HYE IAO IEA IHR IHW ITC KQ8 LK8 M2P M48 M7P M~E NPM OK1 PHGZM PHGZT PIMPY PQGLB PQQKQ PROAC RPM W2D YAO PMFND 7X8 5PM PUEGO |
ID | FETCH-LOGICAL-d422t-82b4653b370dbafc8e1e65a85bc8d520797bb75cfd1dbc94824b588046d8308b3 |
IEDL.DBID | DOA |
ISSN | 2167-8359 |
IngestDate | Wed Aug 27 00:59:10 EDT 2025 Thu Aug 21 18:28:59 EDT 2025 Fri Jul 11 03:39:55 EDT 2025 Tue Jun 17 21:58:53 EDT 2025 Tue Jun 10 20:53:52 EDT 2025 Thu May 22 21:24:04 EDT 2025 Mon Jul 21 06:07:15 EDT 2025 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Keywords | AI; Archaeology; Citizen science; Palaeontology; Project design |
Language | English |
License | 2025 Eijkelboom et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited. |
LinkModel | DirectLink |
MergedId | FETCHMERGED-LOGICAL-d422t-82b4653b370dbafc8e1e65a85bc8d520797bb75cfd1dbc94824b588046d8308b3 |
Notes | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 ObjectType-Review-3 content type line 23 |
ORCID | 0000-0002-1053-3009 0000-0001-9389-1540 0000-0002-0731-6636 0000-0001-6874-5728 0000-0001-7817-0200 |
OpenAccessLink | https://doaj.org/article/e2b344bc681d425e8bf2fbd9e25500c3 |
PMID | 39959835 |
PQID | 3167721533 |
PQPubID | 23479 |
ParticipantIDs | doaj_primary_oai_doaj_org_article_e2b344bc681d425e8bf2fbd9e25500c3 pubmedcentral_primary_oai_pubmedcentral_nih_gov_11830368 proquest_miscellaneous_3167721533 gale_infotracmisc_A827474116 gale_infotracacademiconefile_A827474116 gale_healthsolutions_A827474116 pubmed_primary_39959835 |
PublicationCentury | 2000 |
PublicationDate | 2025-02-13 |
PublicationDateYYYYMMDD | 2025-02-13 |
PublicationDate_xml | – month: 02 year: 2025 text: 2025-02-13 day: 13 |
PublicationDecade | 2020 |
PublicationPlace | United States |
PublicationPlace_xml | – name: United States – name: San Diego, USA |
PublicationTitle | PeerJ (San Francisco, CA) |
PublicationTitleAlternate | PeerJ |
PublicationYear | 2025 |
Publisher | PeerJ. Ltd; PeerJ Inc |
Publisher_xml | – name: PeerJ. Ltd – name: PeerJ Inc |
SSID | ssj0000826083 |
Score | 2.3419268 |
SecondaryResourceType | review_article |
SourceID | doaj pubmedcentral proquest gale pubmed |
SourceType | Open Website; Open Access Repository; Aggregation Database; Index Database |
StartPage | e18927 |
SubjectTerms | Archaeology; Archaeology - methods; Best practices; Biodiversity; Biological diversity; Citizen science; Citizen Science - methods; Citizen scientists; Data Mining and Machine Learning; Fossils; Geospatial data; Human-Computer Interaction; Humans; Information management; Machine Learning; Palaeontology; Paleontology; Paleontology - methods; Project design; Scientists; Workflow |
Title | Making sense of fossils and artefacts: a review of best practices for the design of a successful workflow for machine learning-assisted citizen science projects |
URI | https://www.ncbi.nlm.nih.gov/pubmed/39959835 https://www.proquest.com/docview/3167721533 https://pubmed.ncbi.nlm.nih.gov/PMC11830368 https://doaj.org/article/e2b344bc681d425e8bf2fbd9e25500c3 |
Volume | 13 |
hasFullText | 1 |
inHoldings | 1 |
linkProvider | Directory of Open Access Journals |
openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Making+sense+of+fossils+and+artefacts%3A+a+review+of+best+practices+for+the+design+of+a+successful+workflow+for+machine+learning-assisted+citizen+science+projects&rft.jtitle=PeerJ+%28San+Francisco%2C+CA%29&rft.au=Eijkelboom%2C+Isaak&rft.au=Schulp%2C+Anne+S&rft.au=Amkreutz%2C+Luc&rft.au=Verheul%2C+Dylan&rft.date=2025-02-13&rft.pub=PeerJ.+Ltd&rft.issn=2167-8359&rft.eissn=2167-8359&rft.volume=13&rft.spage=e18927&rft_id=info:doi/10.7717%2Fpeerj.18927&rft.externalDocID=A827474116 |