ngsJulia: population genetic analysis of next-generation DNA sequencing data with Julia language [version 3; peer review: 1 approved, 3 approved with reservations]


Bibliographic Details
Published in F1000Research Vol. 11; p. 126
Main Authors Mas-Sandoval, Alex; Jin, Chenyu; Fracassetti, Marco; Fumagalli, Matteo
Format Journal Article
Language English
Published F1000 Research Ltd 01.01.2022
Summary: A sound analysis of DNA sequencing data is important to extract meaningful information and infer quantities of interest. Sequencing and mapping errors, coupled with low and variable coverage, hamper the identification of genotypes and variants and the estimation of population genetic parameters. Currently available methods and implementations for estimating population genetic parameters from sequencing data are either suitable only for the analysis of genomes from model organisms, require moderate sequencing coverage, or are not easily adaptable to specific applications. To address these issues, we introduce ngsJulia, a collection of templates and functions in the Julia language to process short-read sequencing data for population genetic analysis. We further describe two implementations, ngsPool and ngsPloidy, for the analysis of pooled sequencing data and polyploid genomes, respectively. Through simulations, we illustrate the performance of these implementations in estimating various population genetic parameters with both established and novel statistical methods. These results inform optimal experimental design and demonstrate the applicability of the methods in ngsJulia to estimate parameters of interest even from low-coverage sequencing data. ngsJulia provides users with a flexible and efficient framework for ad hoc analysis of sequencing data. ngsJulia is available from: https://github.com/mfumagalli/ngsJulia
ISSN: 2046-1402
DOI: 10.12688/f1000research.104368.3