Evaluating the effect of SARS-CoV-2 spike mutations with a linear doubly robust learner

Bibliographic Details
Published in	Frontiers in cellular and infection microbiology Vol. 13; p. 1161445
Main Authors	Wang, Xin; Hu, Mingda; Liu, Bo; Xu, Huifang; Jin, Yuan; Wang, Boqian; Zhao, Yunxiang; Wu, Jun; Yue, Junjie; Ren, Hongguang
Format	Journal Article
Language	English
Published	Switzerland: Frontiers Media S.A, 19.04.2023
Subjects	basic reproduction number (R0); causal inference; Cellular and Infection Microbiology; fitness; mutation; SARS-CoV-2; SARS-CoV-2 - genetics; Spike Glycoprotein, Coronavirus - genetics
Summary:	Driven by various mutations in the viral Spike protein, diverse variants of SARS-CoV-2 have repeatedly emerged and become prevalent, significantly prolonging the pandemic. This makes it necessary to identify the key Spike mutations that enhance fitness. To address this need, this manuscript formulates a well-defined framework of causal inference methods for evaluating and identifying Spike mutations that are key to the viral fitness of SARS-CoV-2. Applied to large-scale SARS-CoV-2 genome data, it estimates the statistical contribution of mutations to viral fitness across lineages and thereby identifies important mutations. Further, the identified key mutations are validated by computational methods as having functional effects, including effects on Spike stability, receptor-binding affinity, and potential for immune escape. Based on the effect score of each mutation, individual key fitness-enhancing mutations such as D614G and T478K are identified and studied. Moving from individual mutations to protein domains, this paper identifies key regions of the Spike protein, including the receptor-binding domain and the N-terminal domain. The mutational effect scores are further used to compute a fitness score for different SARS-CoV-2 strains and to predict their transmission capacity based solely on their viral sequence. This prediction of viral fitness has been validated using BA.2.12.1, which was not used for regression training but fits the prediction well. To the best of our knowledge, this is the first research to apply causal inference models to mutational analysis of large-scale SARS-CoV-2 genomes. Our findings produce innovative and systematic insights into SARS-CoV-2 and promote functional studies of its key mutations, serving as reliable guidance on mutations of interest.
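The summary names a doubly robust causal estimator but does not spell out its mechanics. As a rough illustration only, the sketch below applies a generic augmented inverse-probability-weighted (AIPW) estimate to synthetic data. Everything in it is an assumption made for the example and not the authors' pipeline or data: the function names, the single confounder `x` standing in for lineage background, the binary indicator `t` standing in for presence of a mutation, the outcome `y` standing in for a fitness measure, and the simulated effect of 0.5.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def fit_logistic(xs, ts, steps=25):
    """Fit P(t=1 | x) = sigmoid(a + b*x) by Newton's method; returns (a, b)."""
    a, b = 0.0, 0.0
    for _ in range(steps):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, t in zip(xs, ts):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g0 += t - p            # gradient w.r.t. intercept
            g1 += (t - p) * x      # gradient w.r.t. slope
            h00 += w               # entries of the 2x2 information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def doubly_robust_effect(xs, ts, ys):
    """AIPW estimate of the average effect of t on y, adjusting for x."""
    # Outcome models: one linear fit per arm (hypothetical, simplest choice).
    a0, b0 = fit_line([x for x, t in zip(xs, ts) if t == 0],
                      [y for y, t in zip(ys, ts) if t == 0])
    a1, b1 = fit_line([x for x, t in zip(xs, ts) if t == 1],
                      [y for y, t in zip(ys, ts) if t == 1])
    # Propensity model: probability of carrying the "mutation" given x.
    pa, pb = fit_logistic(xs, ts)
    total = 0.0
    for x, t, y in zip(xs, ts, ys):
        m0 = a0 + b0 * x
        m1 = a1 + b1 * x
        e = 1.0 / (1.0 + math.exp(-(pa + pb * x)))
        e = min(max(e, 0.01), 0.99)  # clip to avoid extreme weights
        # Model prediction plus inverse-probability-weighted residual correction.
        total += (m1 - m0) + t * (y - m1) / e - (1 - t) * (y - m0) / (1 - e)
    return total / len(xs)


def simulate(n=5000, true_effect=0.5, seed=0):
    """Synthetic data in which x drives both t and y (confounding)."""
    rng = random.Random(seed)
    xs, ts, ys = [], [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        t = 1 if rng.random() < 1.0 / (1.0 + math.exp(-x)) else 0
        y = 1.0 + true_effect * t + 0.8 * x + rng.gauss(0.0, 0.3)
        xs.append(x)
        ts.append(t)
        ys.append(y)
    return xs, ts, ys


def naive_difference(ts, ys):
    """Unadjusted difference in mean outcome between the two arms."""
    y1 = [y for y, t in zip(ys, ts) if t == 1]
    y0 = [y for y, t in zip(ys, ts) if t == 0]
    return sum(y1) / len(y1) - sum(y0) / len(y0)


if __name__ == "__main__":
    xs, ts, ys = simulate()
    print("naive difference:", naive_difference(ts, ys))
    print("doubly robust estimate:", doubly_robust_effect(xs, ts, ys))
```

In this toy setting the unadjusted difference in means is inflated by the confounder, while the doubly robust estimate should land near the simulated effect. The estimate stays consistent if either the outcome model or the propensity model is correctly specified, which is the property the term "doubly robust" refers to.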
Bibliography:	Edited by: Jochen Bodem, Julius Maximilian University of Würzburg, Germany. Reviewed by: Yicheng Guo, Columbia University, United States; William Buchser, Washington University in St. Louis, United States. These authors have contributed equally to this work. This article was submitted to Virus and Host, a section of the journal Frontiers in Cellular and Infection Microbiology.
ISSN:	2235-2988
DOI:	10.3389/fcimb.2023.1161445