COVID-19 Genome Sequence Analysis for New Variant Prediction and Generation

Bibliographic Details
Published in: Mathematics (Basel), Vol. 10, no. 22, p. 4267
Main Authors: Ullah, Amin; Malik, Khalid Mahmood; Saudagar, Abdul Khader Jilani; Khan, Muhammad Badruddin; Hasanat, Mozaherul Hoque Abul; AlTameem, Abdullah; AlKhathami, Mohammed; Sajjad, Muhammad
Format: Journal Article
Language: English
Published: Basel, MDPI AG, 01.11.2022
Summary: New COVID-19 variants of concern cause more infections and spread much faster than their predecessors, and recent cases show that even vaccinated people are strongly affected by them. Proactively predicting the nucleotide sequences of possible new variants, and developing better healthcare plans to address their spread, requires a unified framework for variant classification and early prediction. This paper addresses three research questions. First, can a convolutional neural network with self-attention classify COVID-19 variants by extracting discriminative features from nucleotide sequences? Second, can uncertainty computed from the predicted probability distribution be used to predict new variants? Third, can generative approaches such as variational autoencoder-decoder networks generate a synthetic new variant from random noise? Experimental results show that the generated sequence is significantly similar to the original coronavirus and its variants, indicating that our neural network can learn mutation patterns from the earlier variants. Moreover, to our knowledge, we are the first to collect data for all COVID-19 variants for computational analysis. The proposed framework is extensively evaluated on classification, new variant prediction, and new variant generation, and achieves better performance on all three tasks. Our code, data, and trained models are available on GitHub (https://github.com/Aminullah6264/COVID19, accessed on 16 September 2022).
ISSN: 2227-7390
DOI: 10.3390/math10224267