SUsPECT: a pipeline for variant effect prediction based on custom long-read transcriptomes for improved clinical variant annotation

Our incomplete knowledge of the human transcriptome impairs the detection of disease-causing variants, in particular if they affect transcripts only expressed under certain conditions. These transcripts are often lacking from reference transcript sets, such as Ensembl/GENCODE and RefSeq, and could b...

Full description

Saved in:
Bibliographic Details
Published inBMC genomics Vol. 24; no. 1; p. 305
Main Authors Salz, Renee, Saraiva-Agostinho, Nuno, Vorsteveld, Emil, van der Made, Caspar I, Kersten, Simone, Stemerdink, Merel, Allen, Jamie, Volders, Pieter-Jan, Hunt, Sarah E, Hoischen, Alexander, 't Hoen, Peter A C
Format Journal Article
LanguageEnglish
Published England BioMed Central Ltd 06.06.2023
BioMed Central
BMC
Subjects
Online AccessGet full text

Cover

Loading…
More Information
Summary:Our incomplete knowledge of the human transcriptome impairs the detection of disease-causing variants, in particular if they affect transcripts only expressed under certain conditions. These transcripts are often lacking from reference transcript sets, such as Ensembl/GENCODE and RefSeq, and could be relevant for establishing genetic diagnoses. We present SUsPECT (Solving Unsolved Patient Exomes/gEnomes using Custom Transcriptomes), a pipeline based on the Ensembl Variant Effect Predictor (VEP) to predict variant impact on custom transcript sets, such as those generated by long-read RNA-sequencing, for downstream prioritization. Our pipeline predicts the functional consequence and likely deleteriousness scores for missense variants in the context of novel open reading frames predicted from any transcriptome. We demonstrate the utility of SUsPECT by uncovering potential mutational mechanisms of pathogenic variants in ClinVar that are not predicted to be pathogenic using the reference transcript annotation. In further support of SUsPECT's utility, we identified an enrichment of immune-related variants predicted to have a more severe molecular consequence when annotating with a newly generated transcriptome from stimulated immune cells instead of the reference transcriptome. Our pipeline outputs crucial information for further prioritization of potentially disease-causing variants for any disease and will become increasingly useful as more long-read RNA sequencing datasets become available.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
ISSN:1471-2164
1471-2164
DOI:10.1186/s12864-023-09391-5