The missing link: covalent linkages in structural models

Covalent linkages between constituent blocks of macromolecules and ligands have been subject to inconsistent treatment during the model‐building, refinement and deposition process. This may stem from a number of sources, including difficulties with initially detecting the covalent linkage, identifyi...

Bibliographic Details
Published in Acta crystallographica. Section D, Biological crystallography. Vol. 77; no. 6; pp. 727-745
Main Authors Nicholls, Robert A., Wojdyr, Marcin, Joosten, Robbie P., Catapano, Lucrezia, Long, Fei, Fischer, Marcus, Emsley, Paul, Murshudov, Garib N.
Format Journal Article
Language English
Published 5 Abbey Square, Chester, Cheshire CH1 2HU, England International Union of Crystallography 01.06.2021
Wiley Subscription Services, Inc
Abstract Covalent linkages between constituent blocks of macromolecules and ligands have been subject to inconsistent treatment during the model‐building, refinement and deposition process. This may stem from a number of sources, including difficulties with initially detecting the covalent linkage, identifying the correct chemistry, obtaining an appropriate restraint dictionary and ensuring its correct application. The analysis presented herein assesses the extent of problems involving covalent linkages in the Protein Data Bank (PDB). Not only will this facilitate the remediation of existing models, but also, more importantly, it will inform and thus improve the quality of future linkages. By considering linkages of known type in the CCP4 Monomer Library (CCP4‐ML), failure to model a covalent linkage is identified to result in inaccurate (systematically longer) interatomic distances. Scanning the PDB for proximal atom pairs that do not have a corresponding type in the CCP4‐ML reveals a large number of commonly occurring types of unannotated potential linkages; in general, these may or may not be covalently linked. Manual consideration of the most commonly occurring cases identifies a number of genuine classes of covalent linkages. The recent expansion of the CCP4‐ML is discussed, which has involved the addition of over 16 000 and the replacement of over 11 000 component dictionaries using AceDRG. As part of this effort, the CCP4‐ML has also been extended using AceDRG link dictionaries for the aforementioned linkage types identified in this analysis. This will facilitate the identification of such linkage types in future modelling efforts, whilst concurrently easing the process involved in their application. The need for a universal standard for maintaining link records corresponding to covalent linkages, and references to the associated dictionaries used during modelling and refinement, following deposition to the PDB is emphasized. 
The importance of correctly modelling covalent linkages is demonstrated using a case study, which involves the covalent linkage of an inhibitor to the main protease in various viral species, including SARS‐CoV‐2. This example demonstrates the importance of properly modelling covalent linkages using a comprehensive restraint dictionary, as opposed to just using a single interatomic distance restraint or failing to model the covalent linkage at all. Analysis of the Protein Data Bank revealed that over a third of entries contain covalent linkages without descriptions in the CCP4 Monomer Library (CCP4‐ML). The CCP4‐ML was updated with AceDRG dictionaries corresponding to commonly occurring classes of missing linkages.
Author Fischer, Marcus
Emsley, Paul
Long, Fei
Wojdyr, Marcin
Murshudov, Garib N.
Joosten, Robbie P.
Nicholls, Robert A.
Catapano, Lucrezia
Author_xml – sequence: 1
  givenname: Robert A.
  surname: Nicholls
  fullname: Nicholls, Robert A.
  email: nicholls@mrc-lmb.cam.ac.uk
  organization: Structural Studies, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
– sequence: 2
  givenname: Marcin
  surname: Wojdyr
  fullname: Wojdyr, Marcin
  organization: Global Phasing Limited, Sheraton House, Castle Park, Cambridge CB3 0AX, United Kingdom
– sequence: 3
  givenname: Robbie P.
  surname: Joosten
  fullname: Joosten, Robbie P.
  organization: Oncode Institute, The Netherlands
– sequence: 4
  givenname: Lucrezia
  surname: Catapano
  fullname: Catapano, Lucrezia
  organization: Randall Centre for Cell and Molecular Biophysics, Faculty of Life Sciences and Medicine, King's College London, London SE1 9RT, United Kingdom
– sequence: 5
  givenname: Fei
  surname: Long
  fullname: Long, Fei
  organization: Structural Studies, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
– sequence: 6
  givenname: Marcus
  surname: Fischer
  fullname: Fischer, Marcus
  organization: Chemical Biology and Therapeutics and Structural Biology, St Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN 38105-3678, USA
– sequence: 7
  givenname: Paul
  surname: Emsley
  fullname: Emsley, Paul
  organization: Structural Studies, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
– sequence: 8
  givenname: Garib N.
  surname: Murshudov
  fullname: Murshudov, Garib N.
  email: garib@mrc-lmb.cam.ac.uk
  organization: Structural Studies, MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, United Kingdom
ContentType Journal Article
Copyright 2021 Nicholls et al. published by IUCr Journals.
2021. This article is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI 10.1107/S2059798321003934
DatabaseName Wiley-Blackwell Open Access Titles
Calcium & Calcified Tissue Abstracts
Electronics & Communications Abstracts
Engineered Materials Abstracts
Neurosciences Abstracts
Solid State and Superconductivity Abstracts
METADEX
Technology Research Database
Aerospace Database
Materials Research Database
Advanced Technologies Database with Aerospace
Database_xml – sequence: 1
  dbid: 24P
  name: Wiley-Blackwell Open Access Titles
  url: https://authorservices.wiley.com/open-science/open-access/browse-journals.html
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
Anatomy & Physiology
Chemistry
EISSN 2059-7983
1399-0047
EndPage 745
ExternalDocumentID AYD2IR5022
Genre article
IEDL.DBID 24P
ISSN 2059-7983
0907-4449
IngestDate Mon Jun 30 09:51:32 EDT 2025
Wed Jan 22 16:28:37 EST 2025
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 6
Language English
License Attribution
LinkModel DirectLink
OpenAccessLink https://onlinelibrary.wiley.com/doi/abs/10.1107%2FS2059798321003934
PQID 2535785575
PQPubID 1036389
PageCount 19
ParticipantIDs proquest_journals_2535785575
wiley_primary_10_1107_S2059798321003934_AYD2IR5022
PublicationCentury 2000
PublicationDate June 2021
PublicationDateYYYYMMDD 2021-06-01
PublicationDate_xml – month: 06
  year: 2021
  text: June 2021
PublicationDecade 2020
PublicationPlace 5 Abbey Square, Chester, Cheshire CH1 2HU, England
PublicationPlace_xml – name: 5 Abbey Square, Chester, Cheshire CH1 2HU, England
– name: Chester
PublicationTitle Acta crystallographica. Section D, Biological crystallography.
PublicationYear 2021
Publisher International Union of Crystallography
Wiley Subscription Services, Inc
Publisher_xml – name: International Union of Crystallography
– name: Wiley Subscription Services, Inc
SSID ssj0001637134
ssj0002237
Score 2.422959
SourceID proquest
wiley
SourceType Aggregation Database
Publisher
StartPage 727
SubjectTerms AceDRG
CCP4 Monomer Library
Constraints
Covalence
covalent linkage
Deposition
Dictionaries
Interatomic distance
Linkages
Macromolecules
Modelling
Proteinase inhibitors
restraint dictionary
SARS‐CoV‐2
Severe acute respiratory syndrome coronavirus 2
Structural models
Title The missing link: covalent linkages in structural models
URI https://onlinelibrary.wiley.com/doi/abs/10.1107%2FS2059798321003934
https://www.proquest.com/docview/2535785575
Volume 77