A Novel Gap-Filling Method Based on Hybrid Read Information Analysis
Published in | 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) pp. 3827 - 3829 |
---|---|
Main Authors | , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE 06.12.2022 |
Summary: | De novo assembly, which reconstructs the entire nucleotide sequence from the reads produced by next-generation sequencing, is essential to genetic information analysis. Reads are assembled in several steps, but unresolved gaps remain even after scaffolding. Gap-filling, the last assembly stage, fills these unidentified regions and significantly improves overall assembly performance. We propose a gap-filling method that uses hybrid reads and resolves gaps through sequence similarity estimation and graph search. The method consists of three key steps: extracting candidate sequences, estimating similarity, and filling the gaps based on a graph. Using hybrid reads yields candidate sequences with more accurate information, and candidates that correspond to noise are removed by the similarity estimation. Finally, a graph search using statistical information derives a final sequence that guarantees high coverage, resolves gaps, reduces misassemblies, and improves accuracy. |
---|---|
DOI: | 10.1109/BIBM55620.2022.9994889 |
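
This record gives no implementation details, so the following is only an illustrative sketch of the three steps named in the summary (candidate extraction, similarity estimation, graph-based filling), not the authors' method. Everything in it is an assumption made for illustration: the function names, the exact-match flank lookup, the median pairwise similarity computed with Python's `difflib`, the 0.8 threshold, and the greedy walk over a k-mer graph weighted by how many candidates support each edge. The `reads` list stands in for a pooled set of hybrid reads.

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher
from statistics import median


def extract_candidates(reads, left_flank, right_flank):
    """Step 1 (assumed): take the read segment between the two gap flanks."""
    candidates = []
    for read in reads:
        start = read.find(left_flank)
        if start == -1:
            continue
        start += len(left_flank)
        end = read.find(right_flank, start)
        if end != -1:
            candidates.append(read[start:end])
    return candidates


def filter_by_similarity(candidates, threshold=0.8):
    """Step 2 (assumed): drop candidates whose median similarity to the
    other candidates falls below the threshold, treating them as noise."""
    if len(candidates) < 2:
        return list(candidates)
    kept = []
    for i, cand in enumerate(candidates):
        scores = [
            SequenceMatcher(None, cand, other).ratio()
            for j, other in enumerate(candidates)
            if j != i
        ]
        if median(scores) >= threshold:
            kept.append(cand)
    return kept


def fill_gap(candidates, left_flank, right_flank, k=4, max_len=1000):
    """Step 3 (assumed): greedy walk over a k-mer graph whose edges are
    weighted by the number of candidates that support them.
    Both flanks must be at least k bases long."""
    start, target = left_flank[-k:], right_flank[:k]
    edges = defaultdict(Counter)
    for cand in candidates:
        seq = start + cand + target
        for i in range(len(seq) - k):
            edges[seq[i:i + k]][seq[i + k]] += 1

    path = start
    while len(path) <= max_len + 2 * k:
        node = path[-k:]
        if len(path) >= 2 * k and node == target:
            return path[k:-k]  # strip the flank k-mers, keep the fill
        if node not in edges:
            return None  # dead end: no candidate supports this k-mer
        path += edges[node].most_common(1)[0][0]
    return None  # gave up, for example because the walk entered a repeat


if __name__ == "__main__":
    left, right = "ACGTACGT", "TTGACCAA"
    reads = [
        left + "GGCATCGA" + right,
        "TT" + left + "GGCATCGA" + right + "GG",
        left + "GGCATCGA" + right,
        left + "TTTTTTTTTTTT" + right,  # a noisy read
    ]
    candidates = filter_by_similarity(extract_candidates(reads, left, right))
    print(fill_gap(candidates, left, right))
```

A greedy walk is the simplest possible graph search and can stall in repetitive regions; a practical tool would more plausibly use approximate flank alignment and a scored path search, which this sketch deliberately leaves out.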