Earl Grey: A Fully Automated User-Friendly Transposable Element Annotation and Analysis Pipeline

Bibliographic Details
Published in: Molecular Biology and Evolution, Vol. 41, no. 4
Main Authors: Baril, Tobias; Galbraith, James; Hayward, Alex
Format: Journal Article
Language: English
Published: UK, Oxford University Press, 02.04.2024
Summary: Transposable elements (TEs) are major components of eukaryotic genomes and are implicated in a range of evolutionary processes. Yet, TE annotation and characterization remain challenging, particularly for nonspecialists, since existing pipelines are typically complicated to install, run, and extract data from. Current methods of automated TE annotation are also subject to issues that reduce overall quality, particularly (i) fragmented and overlapping TE annotations, leading to erroneous estimates of TE count and coverage, and (ii) repeat models represented by short sections of total TE length, with poor capture of 5′ and 3′ ends. To address these issues, we present Earl Grey, a fully automated TE annotation pipeline designed for user-friendly curation and annotation of TEs in eukaryotic genome assemblies. Using nine simulated genomes and an annotation of Drosophila melanogaster, we show that Earl Grey outperforms current widely used TE annotation methodologies in ameliorating the issues mentioned above while scoring highly in benchmarking for TE annotation and classification and being robust across genomic contexts. Earl Grey provides a comprehensive and fully automated TE annotation toolkit that provides researchers with paper-ready summary figures and outputs in standard formats compatible with other bioinformatics tools. Earl Grey has a modular format, with great scope for the inclusion of additional modules focused on further quality control and tailored analyses in future releases.
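
Illustrative note: issue (i) in the summary, overlapping annotations inflating TE coverage estimates, can be shown with a small sketch. This is not Earl Grey's implementation and makes no claim about how the pipeline resolves overlaps. The function names, the example coordinates, and the half-open interval convention are all assumptions chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: one TE annotation as a half-open
# interval [start, end) on a single sequence.
Interval = Tuple[int, int]


def naive_coverage(intervals: List[Interval]) -> int:
    """Sum annotation lengths directly; overlapping bases are counted twice."""
    return sum(end - start for start, end in intervals)


def merge_overlaps(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or directly adjacent intervals into single spans."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous span: extend that span.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def merged_coverage(intervals: List[Interval]) -> int:
    """Count each annotated base once by merging before summing."""
    return sum(end - start for start, end in merge_overlaps(intervals))


# Made-up coordinates: the first two annotations overlap by 50 bases.
annotations = [(100, 400), (350, 600), (900, 1000)]

print(naive_coverage(annotations))   # 650
print(merged_coverage(annotations))  # 600
print(len(annotations), len(merge_overlaps(annotations)))  # 3 2
```

In this toy case the naive sum overstates coverage by the 50 overlapping bases, and the annotation count drops from 3 to 2 once the overlapping pair is merged. Merging is only one possible way to resolve overlaps and is used here purely to make the effect visible.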
ISSN: 0737-4038, 1537-1719
DOI: 10.1093/molbev/msae068