Deep Convolutional Neural Networks Help Scoring Tandem Mass Spectrometry Data in Database-Searching Approaches

Spectrum annotation is a challenging task due to the presence of unexpected peptide fragmentation ions as well as the inaccuracy of the detectors of the spectrometers. We present a deep convolutional neural network, called Slider, which learns an optimal feature extraction in its kernels for scoring...

Bibliographic Details
Published in Journal of proteome research, Vol. 20, no. 10, pp. 4708–4717
Main Authors Kudriavtseva, Polina; Kashkinov, Matvey; Kertész-Farkas, Attila
Format Journal Article
Language English
Published American Chemical Society, 01.10.2021
Abstract Spectrum annotation is a challenging task due to the presence of unexpected peptide fragmentation ions as well as the inaccuracy of spectrometer detectors. We present a deep convolutional neural network, called Slider, which learns an optimal feature extraction in its kernels for scoring tandem mass spectrometry (MS/MS) spectra to increase the number of spectrum annotations with high confidence. Experimental results using publicly available data sets show that Slider can annotate slightly more spectra than the state-of-the-art methods (BoltzMatch, Res-EV, Prosit) while running 2–10 times faster. More interestingly, with low-resolution fragmentation information Slider provides only 2–4% fewer spectrum annotations than other methods achieve with high-resolution information. This means that Slider can exploit nearly as much information from the context of low-resolution spectrum peaks as high-resolution fragmentation information provides to other scoring methods. Thus, Slider can be an optimal choice for practitioners using old spectrometers with low-resolution detectors.
Author Kashkinov, Matvey
Kudriavtseva, Polina
Kertész-Farkas, Attila
AuthorAffiliation Faculty of Computer Science, HSE University
Author_xml – sequence: 1
  givenname: Polina
  surname: Kudriavtseva
  fullname: Kudriavtseva, Polina
– sequence: 2
  givenname: Matvey
  surname: Kashkinov
  fullname: Kashkinov, Matvey
  organization: Faculty of Computer Science, HSE University
– sequence: 3
  givenname: Attila
  orcidid: 0000-0001-8110-7253
  surname: Kertész-Farkas
  fullname: Kertész-Farkas, Attila
  email: akerteszfarkas@hse.ru
ContentType Journal Article
Copyright 2021 American Chemical Society
DOI 10.1021/acs.jproteome.1c00315
Discipline Chemistry
EISSN 1535-3907
EndPage 4717
ISSN 1535-3893
IsPeerReviewed true
IsScholarly true
Issue 10
Keywords deep learning
PSM scoring
fast
tandem mass spectrometry
convolutional neural networks
spectrum annotation
Language English
ORCID 0000-0001-8110-7253
PageCount 10
PublicationDate 2021-10-01
PublicationTitle Journal of proteome research
PublicationTitleAlternate J. Proteome Res
PublicationYear 2021
Publisher American Chemical Society
StartPage 4708
Title Deep Convolutional Neural Networks Help Scoring Tandem Mass Spectrometry Data in Database-Searching Approaches
URI http://dx.doi.org/10.1021/acs.jproteome.1c00315
https://search.proquest.com/docview/2566028450
Volume 20