Ranking noncanonical 5′ splice site usage by genome-wide RNA-seq analysis and splicing reporter assays

Bibliographic Details
Published in Genome Research Vol. 28; no. 12; pp. 1826-1840
Main Authors Erkelenz, Steffen; Theiss, Stephan; Kaisers, Wolfgang; Ptok, Johannes; Walotka, Lara; Müller, Lisa; Hillebrand, Frank; Brillen, Anna-Lena; Sladek, Michael; Schaal, Heiner
Format Journal Article
Language English
Published United States Cold Spring Harbor Laboratory Press 01.12.2018
Summary: Most human pathogenic mutations in 5′ splice sites affect the canonical GT in positions +1 and +2, leading to noncanonical dinucleotides. On the other hand, noncanonical dinucleotides are observed under physiological conditions in ∼1% of all human 5′ss. It is therefore a challenging task to understand the pathogenic mutation mechanisms underlying the conditions under which noncanonical 5′ss are used. In this work, we systematically examined noncanonical 5′ splice site selection, both experimentally using splicing competition reporters and by analyzing a large RNA-seq data set of 54 fibroblast samples from 27 subjects containing a total of 2.4 billion gapped reads covering 269,375 exon junctions. From both approaches, we consistently derived a noncanonical 5′ss usage ranking GC > TT > AT > GA > GG > CT. In our competition splicing reporter assay, noncanonical splicing was strictly dependent on the presence of upstream or downstream splicing regulatory elements (SREs), and changes in SREs could be compensated by variation of U1 snRNA complementarity in the competing 5′ss. In particular, we could confirm splicing at different positions (i.e., −1, +1, +5) of a splice site for all noncanonical dinucleotides “weaker” than GC. In our comprehensive RNA-seq data set analysis, noncanonical 5′ss were preferentially detected in weakly used exon junctions of highly expressed genes. Among high-confidence splice sites, they were 10-fold overrepresented in clusters with a neighboring, more frequently used 5′ss. Conversely, these more frequently used neighbors contained only the dinucleotides GT, GC, and TT, in accordance with the above ranking.
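The RNA-seq part of the summary amounts to tallying gapped reads by the dinucleotide at positions +1 and +2 of each 5′ss. The following is a minimal sketch of that idea only, not the authors' pipeline: the function names, the input layout (intron sequence plus gapped-read count per junction), the choice to rank by summed reads, and the toy numbers are all assumptions made for illustration.

```python
from collections import Counter


def donor_dinucleotide(intron_seq: str) -> str:
    """Return the dinucleotide at 5'ss positions +1/+2 (first two intron bases).

    Input is upper-cased and RNA 'U' is mapped to DNA 'T'.
    """
    return intron_seq[:2].upper().replace("U", "T")


def rank_noncanonical(junctions):
    """Rank noncanonical 5'ss dinucleotides by summed gapped reads.

    junctions: iterable of (intron_sequence, gapped_read_count) pairs.
    Ranking by total reads is a simplifying assumption of this sketch.
    """
    reads = Counter()
    for intron_seq, n_reads in junctions:
        dinuc = donor_dinucleotide(intron_seq)
        if dinuc != "GT":  # skip the canonical dinucleotide
            reads[dinuc] += n_reads
    # Highest read total first; ties broken alphabetically for determinism.
    return sorted(reads.items(), key=lambda kv: (-kv[1], kv[0]))


# Invented toy junctions, not data from the study.
toy = [
    ("GTAAGT", 900),
    ("GCAAGT", 40),
    ("gcaagt", 10),
    ("TTAAGT", 12),
    ("ATATCC", 5),
    ("GUAAGU", 3),
]
ranking = rank_noncanonical(toy)
```

A real analysis would also need junction-level filtering (for example, separating alignment artifacts from high-confidence splice sites), which this sketch leaves out.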
These authors are joint first authors and contributed equally to this work.
Present address: Institute for Genetics and Cologne Excellence Cluster on Cellular Stress Responses in Aging-Associated Diseases (CECAD), University of Cologne, 50931 Cologne, Germany
ISSN:1088-9051
1549-5469
DOI:10.1101/gr.235861.118