BioAssay Templates for the semantic web

Annotation of bioassay protocols using semantic web vocabulary is a way to make experiment descriptions machine-readable. Protocols are communicated using concise scientific English, which precludes most kinds of analysis by software algorithms. Given the availability of a sufficiently expressive on...

Full description

Saved in:
Bibliographic Details
Published inPeerJ. Computer science Vol. 2; p. e61
Main Authors Clark, Alex M, Litterman, Nadia K, Kranz, Janice E, Gund, Peter, Gregory, Kellan, Bunin, Barry A
Format Journal Article
LanguageEnglish
Published San Diego PeerJ. Ltd 09.05.2016
PeerJ, Inc
PeerJ Inc
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Annotation of bioassay protocols using semantic web vocabulary is a way to make experiment descriptions machine-readable. Protocols are communicated using concise scientific English, which precludes most kinds of analysis by software algorithms. Given the availability of a sufficiently expressive ontology, some or all of the pertinent information can be captured by asserting a series of facts, expressed as semantic web triples (subject, predicate, object). With appropriate annotation, assays can be searched, clustered, tagged and evaluated in a multitude of ways, analogous to other segments of drug discovery informatics. The BioAssay Ontology (BAO) has been previously designed for this express purpose, and provides a layered hierarchy of meaningful terms which can be linked to. Currently the biggest challenge is the issue of content creation: scientists cannot be expected to use the BAO effectively without having access to software tools that make it straightforward to use the vocabulary in a canonical way. We have sought to remove this barrier by: (1) defining a BioAssay Template (BAT) data model; (2) creating a software tool for experts to create or modify templates to suit their needs; and (3) designing a common assay template (CAT) to leverage the most value from the BAO terms. The CAT was carefully assembled by biologists in order to find a balance between the maximum amount of information captured vs. low degrees of freedom in order to keep the user experience as simple as possible. The data format that we use for describing templates and corresponding annotations is the native format of the semantic web (RDF triples), and we demonstrate some of the ways that generated content can be meaningfully queried using the SPARQL language. We have made all of these materials available as open source ( http://github.com/cdd/bioassay-template ), in order to encourage community input and use within diverse projects, including but not limited to our own commercial electronic lab notebook products.
ISSN:2376-5992
2376-5992
DOI:10.7717/peerj-cs.61