The health care and life sciences community profile for dataset descriptions

Bibliographic Details
Published in PeerJ (San Francisco, CA) Vol. 4; p. e2331
Main Authors Dumontier, Michel; Gray, Alasdair J G; Marshall, M Scott; Alexiev, Vladimir; Ansell, Peter; Bader, Gary; Baran, Joachim; Bolleman, Jerven T; Callahan, Alison; Cruz-Toledo, José; Gaudet, Pascale; Gombocz, Erich A; Gonzalez-Beltran, Alejandra N; Groth, Paul; Haendel, Melissa; Ito, Maori; Jupp, Simon; Juty, Nick; Katayama, Toshiaki; Kobayashi, Norio; Krishnaswami, Kalpana; Laibe, Camille; Le Novère, Nicolas; Lin, Simon; Malone, James; Miller, Michael; Mungall, Christopher J; Rietveld, Laurens; Wimalaratne, Sarala M; Yamaguchi, Atsuko
Format Journal Article
Language English
Published United States PeerJ. Ltd 16.08.2016
Summary: Access to consistent, high-quality metadata is critical to finding, understanding, and reusing scientific data. However, while there are many relevant vocabularies for the annotation of a dataset, none sufficiently captures all the necessary metadata. This prevents uniform indexing and querying of dataset repositories. Towards providing a practical guide for producing a high-quality description of biomedical datasets, the W3C Semantic Web for Health Care and the Life Sciences Interest Group (HCLSIG) identified Resource Description Framework (RDF) vocabularies that could be used to specify common metadata elements and their value sets. The resulting guideline covers elements of description, identification, attribution, versioning, provenance, and content summarization. This guideline reuses existing vocabularies, and is intended to meet key functional requirements including indexing, discovery, exchange, query, and retrieval of datasets, thereby enabling the publication of FAIR data. The resulting metadata profile is generic and could be used by other domains with an interest in providing machine-readable descriptions of versioned datasets.
Bibliography:European Union’s Seventh Framework Programme
the BBSRC Institute Strategic Programme
USDOE Office of Science (SC)
European Commission (EC)
Swiss Federal Government State Secretariat for Education, Research and Innovation
Open PHACTS project
AC02-05CH11231; U54 HG008033-01; 115191; FP7/2007-2013; FP7-ICT-2012-6-270253; U41 HG006623; BB/J004456/1
Big Data to Knowledge (BD2K) initiative
Innovative Medicines Initiative Joint Undertaking
EFPIA companies
Database Center for Life Sciences (DBCLS - Japan)
National Institutes of Health (NIH) National Institute of Allergy and Infectious Diseases (NIAID)
National Bioscience Database Center, Japan
Ministry of Education, Culture, Sports, Science and Technology, Japan
ISSN: 2167-8359
DOI: 10.7717/peerj.2331