MERLIN: Identifying Inaccuracies in Multiple Sequence Alignments Using Object Detection
Published in | Artificial Intelligence Applications and Innovations, Vol. AICT-646, Part I, pp. 192-203 |
---|---|
Format | Book Chapter Conference Proceeding |
Language | English |
Published | Cham: Springer International Publishing |
Series | IFIP Advances in Information and Communication Technology |
Summary: | Multiple sequence alignments (MSAs) form the basis of many biological sequence analysis methods. However, they are susceptible to irregularities that stem either from the predicted sequences or from natural biological events. In this paper, we propose MERLIN (Msa ERror Localization and IdentificatioN), an object detector that identifies such irregularities in visual representations of MSAs. Our model is built on a state-of-the-art deep learning object detector, YOLOv4, and trained on MSA images from an in-house dataset with automatically annotated errors. It achieves a mean Average Precision of 71.18% in predicting different types of errors within MSAs. A thorough examination of the results showed that our method correctly identifies certain inconsistencies that were missed by the automatic annotation algorithm. |
---|---|
ISBN: | 9783031083327 3031083326 |
ISSN: | 1868-4238 1868-422X |
DOI: | 10.1007/978-3-031-08333-4_16 |
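
The summary reports detection quality as a mean Average Precision (mAP). As a rough illustration of what that metric measures, the sketch below computes average precision for one class from scored bounding boxes and averages it over classes. It is not the chapter's evaluation code: the IoU threshold of 0.5, the greedy matching rule, the all-point interpolation, and every name in it (`iou`, `average_precision`, `mean_average_precision`, the class labels) are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# A box is (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(
    preds: List[Tuple[float, Box]], truths: List[Box], thr: float = 0.5
) -> float:
    """Average precision for one class (assumed threshold: IoU >= 0.5).

    preds holds (confidence, box) pairs; truths holds annotated boxes.
    """
    if not truths:
        return 0.0
    matched = [False] * len(truths)
    hits: List[bool] = []
    # Visit predictions from most to least confident.
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_j, best_iou = -1, 0.0
        for j, truth in enumerate(truths):
            if matched[j]:
                continue  # each annotation can be matched only once
            overlap = iou(box, truth)
            if overlap > best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0 and best_iou >= thr:
            matched[best_j] = True
            hits.append(True)   # true positive
        else:
            hits.append(False)  # false positive
    # Running precision and recall after each prediction.
    precisions: List[float] = []
    recalls: List[float] = []
    true_pos = 0
    for k, hit in enumerate(hits, start=1):
        true_pos += int(hit)
        precisions.append(true_pos / k)
        recalls.append(true_pos / len(truths))
    # Make precision non-increasing in recall (all-point interpolation).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(ap_per_class: Dict[str, float]) -> float:
    """Unweighted mean of the per-class average precisions."""
    return sum(ap_per_class.values()) / len(ap_per_class)


if __name__ == "__main__":
    truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [
        (0.9, (0, 0, 10, 10)),    # matches the first annotation
        (0.8, (50, 50, 60, 60)),  # matches nothing
        (0.7, (20, 20, 30, 30)),  # matches the second annotation
    ]
    print(average_precision(preds, truths))
```

Evaluation toolkits differ in how they match predictions to annotations and how they interpolate the curve, so values from this sketch are not directly comparable to the 71.18% reported in the summary.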