Tximeta: Reference sequence checksums for provenance identification in RNA-seq

Bibliographic Details
Published in: PLoS Computational Biology Vol. 16; no. 2; p. e1007664
Main Authors: Love, Michael I.; Soneson, Charlotte; Hickey, Peter F.; Johnson, Lisa K.; Pierce, N. Tessa; Shepherd, Lori; Morgan, Martin; Patro, Rob
Format: Journal Article
Language: English
Published: United States, Public Library of Science (PLoS), 01.02.2020
Summary: Correct annotation metadata is critical for reproducible and accurate RNA-seq analysis. When files are shared publicly or among collaborators with incorrect or missing annotation metadata, it becomes difficult or impossible to reproduce bioinformatic analyses from raw data. It also makes it more difficult to locate the transcriptomic features, such as transcripts or genes, in their proper genomic context, which is necessary for overlapping expression data with other datasets. We provide a solution in the form of an R/Bioconductor package tximeta that performs numerous annotation and metadata gathering tasks automatically on behalf of users during the import of transcript quantification files. The correct reference transcriptome is identified via a hashed checksum stored in the quantification output, and key transcript databases are downloaded and cached locally. The computational paradigm of automatically adding annotation metadata based on reference sequence checksums can greatly facilitate genomic workflows, by helping to reduce overhead during bioinformatic analyses, preventing costly bioinformatic mistakes, and promoting computational reproducibility. The tximeta package is available at https://bioconductor.org/packages/tximeta.
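The checksum-based identification described in the summary can be illustrated with a minimal Python sketch. This is not the tximeta implementation (tximeta is an R/Bioconductor package), and the function names, the `reference_checksum` metadata key, the use of SHA-256, the toy sequences, and the "ExampleDB" lookup entry are all assumptions made for illustration. It shows only the general idea: hash the reference sequences, store the digest with the quantification output, and later use that digest to look up the matching annotation.

```python
import hashlib


def reference_checksum(sequences):
    """Return a SHA-256 hex digest over the transcript sequences, in order.

    Sequences are upper-cased first, so the digest ignores letter case.
    Sequence names are not hashed, only the sequences themselves.
    """
    digest = hashlib.sha256()
    for seq in sequences:
        digest.update(seq.upper().encode("ascii"))
    return digest.hexdigest()


# Toy reference transcriptome (hypothetical sequences, for illustration only).
toy_transcripts = ["ACGTACGT", "GGGCCCAA"]

# Hypothetical lookup table mapping checksums to annotation provenance.
KNOWN_REFERENCES = {
    reference_checksum(toy_transcripts): {
        "source": "ExampleDB",
        "organism": "Homo sapiens",
        "release": "1",
    },
}


def identify_reference(checksum, table=KNOWN_REFERENCES):
    """Return provenance metadata for a checksum, or None if it is unknown."""
    return table.get(checksum)


# Metadata as it might be stored alongside quantification output
# (the key name here is an assumption, not a documented field).
quant_metadata = {"reference_checksum": reference_checksum(toy_transcripts)}

match = identify_reference(quant_metadata["reference_checksum"])
print(match)
```

Because the digest depends only on the sequences, a quantification file carries enough information to recover its annotation provenance even when no other metadata travels with it; an unrecognised digest simply yields no match.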
I have read the journal’s policy and the authors of this manuscript have the following competing interests: RP is a co-founder of Ocean Genomics.
ISSN: 1553-7358, 1553-734X
DOI: 10.1371/journal.pcbi.1007664