Amino acid positions subject to multiple coevolutionary constraints can be robustly identified by their eigenvector network centrality scores

Bibliographic Details
Published in	Proteins, structure, function, and bioinformatics Vol. 83; no. 12; pp. 2293 - 2306
Main Authors	Parente, Daniel J., Ray, J. Christian J., Swint-Kruse, Liskin
Format	Journal Article
Language	English
Published	United States, Blackwell Publishing Ltd, 01.12.2015; Wiley Subscription Services, Inc
Subjects	aldolase; Algorithms; amino acid; Amino acids; Amino Acids - chemistry; coevolution; Entropy; Escherichia coli Proteins - chemistry; Escherichia coli Proteins - metabolism; Evolution, Molecular; Fructose-Bisphosphate Aldolase - chemistry; Fructose-Bisphosphate Aldolase - metabolism; graph theory; Lac Repressors - chemistry; Lac Repressors - metabolism; LacI/GalR; Protein Conformation; protein evolution; Proteins - chemistry; Proteins - metabolism; Repressor Proteins - chemistry; Repressor Proteins - metabolism; sequence alignment
Summary:	As proteins evolve, amino acid positions key to protein structure or function are subject to mutational constraints. These positions can be detected by analyzing sequence families for amino acid conservation or for coevolution between pairs of positions. Coevolutionary scores are usually rank-ordered and thresholded to reveal the top pairwise scores, but they can also be treated as weighted networks. Here, we used network analyses to bypass a major complication of coevolution studies: for a given sequence alignment, alternative algorithms usually identify different top pairwise scores. We reconciled results from five commonly used, mathematically divergent algorithms (ELSC, McBASC, OMES, SCA, and ZNMI), using the LacI/GalR and 1,6-bisphosphate aldolase protein families as models. Calculations used unthresholded coevolution scores from which column-specific properties such as sequence entropy and random noise were subtracted; "central" positions were identified by calculating various network centrality scores. When compared among algorithms, network centrality methods, particularly eigenvector centrality, showed markedly better agreement than comparisons of the top pairwise scores. Positions with large centrality scores occurred at key structural locations and/or were functionally sensitive to mutations. Further, the top central positions often differed from those with top pairwise coevolution scores: instead of a few strong scores, central positions often had multiple moderate scores. We conclude that eigenvector centrality calculations reveal a robust evolutionary pattern of constraints, detectable by divergent algorithms, that occur at key protein locations. Finally, we discuss the fact that multiple patterns coexist in evolutionary data that, together, give rise to emergent protein functions. Proteins 2015; 83:2293–2306. © 2015 Wiley Periodicals, Inc.
Bibliography:	University of Kansas Medical Center (Biomedical Research Training Program); K-INBRE (Developmental Research Project award) - No. P20GM103418; National Institutes of Health - No. GM079423; istex:09C4EAD06DD802948E201E1FD45D11DCD691F249; ArticleID:PROT24948; ark:/67375/WNG-8C1SH6SD-8
ISSN:	0887-3585, 1097-0134
DOI:	10.1002/prot.24948
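
Illustrative note: the Summary describes treating unthresholded pairwise coevolution scores as a weighted network and ranking positions by eigenvector centrality. The sketch below shows one way that idea could look in Python with NumPy. It is not the authors' implementation. The function name `eigenvector_centrality`, the toy matrix `scores`, and the assumption that scores are symmetric and non-negative after background subtraction are all hypothetical choices made for illustration.

```python
import numpy as np


def eigenvector_centrality(weights, max_iter=1000, tol=1e-10):
    """Eigenvector centrality of a weighted, undirected network.

    `weights` is assumed (for this sketch) to be a symmetric matrix of
    non-negative pairwise scores; the diagonal is ignored.
    """
    w = np.array(weights, dtype=float)
    if w.ndim != 2 or w.shape[0] != w.shape[1]:
        raise ValueError("weights must be a square matrix")
    if not np.allclose(w, w.T):
        raise ValueError("weights must be symmetric")
    np.fill_diagonal(w, 0.0)
    if (w < 0).any():
        raise ValueError("this sketch assumes non-negative scores")

    n = w.shape[0]
    x = np.full(n, 1.0 / np.sqrt(n))
    for _ in range(max_iter):
        # Power iteration on (W + I): same eigenvectors as W, but the
        # shift avoids oscillation on bipartite-like networks.
        x_new = w @ x + x
        x_new /= np.linalg.norm(x_new)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("power iteration did not converge")


# Hypothetical scores for six alignment positions (not data from the paper).
# Position 0 has several moderate scores; positions 4 and 5 share the single
# strongest pairwise score.
scores = np.zeros((6, 6))
for i, j, s in [(0, 1, 0.5), (0, 2, 0.5), (0, 3, 0.5),
                (1, 2, 0.4), (1, 3, 0.4), (2, 3, 0.4),
                (0, 4, 0.2), (4, 5, 0.9)]:
    scores[i, j] = scores[j, i] = s

centrality = eigenvector_centrality(scores)
ranking = [int(i) for i in np.argsort(-centrality)]

# Strongest single pair, taken from the upper triangle only.
flat_index = np.argmax(np.triu(scores, k=1))
top_pair = tuple(int(i) for i in np.unravel_index(flat_index, scores.shape))

print("positions ranked by centrality:", ranking)
print("top pairwise score is between positions:", top_pair)
```

In this toy matrix the strongest pairwise score belongs to positions 4 and 5, yet position 0, which has several moderate scores, receives the largest centrality. This mirrors the pattern the Summary reports, but only for made-up numbers.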