scDM: A deep generative method for cell surface protein prediction with diffusion model

Bibliographic Details
Published in Journal of molecular biology Vol. 436; no. 12; p. 168610
Main Authors Yu, Hanlei; Zheng, Yuanjie; Yang, Xinbo
Format Journal Article
Language English
Published Netherlands Elsevier Ltd 15.06.2024
ISSN 0022-2836
EISSN 1089-8638
DOI 10.1016/j.jmb.2024.168610

Abstract
• Considering both RNA and protein expression can provide more biological evidence.
• We propose a method for predicting protein expression based on the diffusion model.
• The diffusion model is applied to single-cell analysis for the first time.
• The proposed method was validated on three single-cell sequencing datasets.
• Our results provide new directions for the identification of novel drug targets.

Proteins are the executors of organismal functions, and the transition from RNA to protein is subject to post-transcriptional regulation; therefore, considering RNA and surface protein expression together can provide additional evidence about biological processes. Cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) can measure both RNA and protein expression in single cells, but these experiments are expensive and time-consuming. Because computational tools for predicting surface proteins are lacking, we used CITE-seq datasets to design a deep generative prediction method based on diffusion models and to derive biological findings from its predictions. Our method, scDM, predicts protein expression values from the RNA expression values of individual cells. It uses a novel way of encoding the data into the model, and it learns the data distribution by adding Gaussian noise to the data and then gradually removing that noise, which allows it to generate predicted samples. Comprehensive evaluation across different datasets demonstrated that our predictions yielded satisfactory results and further demonstrated the effectiveness of incorporating information from single-cell multiomics data into diffusion models for biological studies. We also found that jointly analysing predicted surface protein expression and cancer cell drug scores could provide new directions for discovering therapeutic drug targets.
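The noising and denoising idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the scDM implementation, which this record does not describe. It assumes the standard closed-form forward step of a denoising diffusion probabilistic model with a linear noise schedule; the function names, the schedule values, the number of steps, and the toy protein vector are all hypothetical choices made for illustration.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear schedule (assumed values)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def add_noise(x0, t, alpha_bar, rng):
    """Forward step: draw x_t from x_0 in closed form and return the noise used."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps

def remove_noise(x_t, t, alpha_bar, eps_hat):
    """Invert the forward step given a noise estimate eps_hat.

    In a trained model, eps_hat would come from a network; here the
    true noise is passed in, so the inversion is exact.
    """
    return (x_t - np.sqrt(1.0 - alpha_bar[t]) * eps_hat) / np.sqrt(alpha_bar[t])

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
# Hypothetical normalised surface protein values for a single cell.
protein = np.array([0.5, 1.2, 3.0, 0.0])
noisy, eps = add_noise(protein, t=500, alpha_bar=alpha_bar, rng=rng)
recovered = remove_noise(noisy, t=500, alpha_bar=alpha_bar, eps_hat=eps)
```

In a conditional setting such as predicting protein from RNA, the noise estimate would presumably depend on the cell's RNA profile as well as on the noisy input and the step index; that conditioning is omitted here.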
Author_xml – sequence: 1
  givenname: Hanlei
  surname: Yu
  fullname: Yu, Hanlei
– sequence: 2
  givenname: Yuanjie
  surname: Zheng
  fullname: Zheng, Yuanjie
  email: yjzheng@sdnu.edu.cn
– sequence: 3
  givenname: Xinbo
  surname: Yang
  fullname: Yang, Xinbo
BackLink https://www.ncbi.nlm.nih.gov/pubmed/38754773$$D View this record in MEDLINE/PubMed
CitedBy_id crossref_primary_10_1038_s41540_024_00484_9
ContentType Journal Article
Copyright © 2024 Elsevier Ltd. All rights reserved.
Discipline Chemistry
Biology
EISSN 1089-8638
ExternalDocumentID 38754773
10_1016_j_jmb_2024_168610
S0022283624002055
Genre Journal Article
IsPeerReviewed true
IsScholarly true
Issue 12
Keywords CITE-seq technology
deep learning
generating predictions
Gaussian noise
Technical noise
Language English
License Copyright © 2024 Elsevier Ltd. All rights reserved.
PMID 38754773
PQID 3056668997
PQPubID 23479
PublicationDate 2024-06-15
PublicationPlace Netherlands
PublicationTitle Journal of molecular biology
PublicationTitleAlternate J Mol Biol
PublicationYear 2024
Publisher Elsevier Ltd
  ident: 10.1016/j.jmb.2024.168610_b0045
  article-title: Simultaneous measurement of chromatin accessibility, dna methylation, and nucleosome phasing in single cells
  publication-title: elife
  doi: 10.7554/eLife.23203
– volume: 13
  start-page: e0202753
  issue: 9
  year: 2018
  ident: 10.1016/j.jmb.2024.168610_b0175
  article-title: Safety and tolerability of hiv-1 multiantigen pdna vaccine given with il-12 plasmid dna via electroporation, boosted with a recombinant vesicular stomatitis virus hiv gag vaccine in healthy volunteers in a randomized, controlled clinical trial
  publication-title: PLOS ONE.
  doi: 10.1371/journal.pone.0202753
– volume: 14
  start-page: 114
  issue: 1
  year: 2023
  ident: 10.1016/j.jmb.2024.168610_b0180
  article-title: Leveraging molecular structure and bioactivity with chemical language models for de novo drug design
  publication-title: Nat. Commun.
  doi: 10.1038/s41467-022-35692-6
– volume: 184
  start-page: 3573
  issue: 13
  year: 2021
  ident: 10.1016/j.jmb.2024.168610_b0145
  article-title: Integrated analysis of multimodal single-cell data
  publication-title: Cell.
  doi: 10.1016/j.cell.2021.04.048
– volume: 14
  start-page: 865
  issue: 9
  year: 2017
  ident: 10.1016/j.jmb.2024.168610_b0070
  article-title: Simultaneous epitope and transcriptome measurement in single cells
  publication-title: Nat. Methods.
  doi: 10.1038/nmeth.4380
– volume: 39
  start-page: btad596
  issue: 10
  year: 2023
  ident: 10.1016/j.jmb.2024.168610_b0160
  article-title: DeepCCI: a deep learning framework for identifying cell–cell interactions from single-cell RNA sequencing data
  publication-title: Bioinformatics.
  doi: 10.1093/bioinformatics/btad596
– volume: 33
  start-page: 6840
  year: 2020
  ident: 10.1016/j.jmb.2024.168610_b0200
  article-title: Denoising diffusion probabilistic models
  publication-title: Adv. Neural Inf. Process. Syst. (NeurIPS)
– volume: 17
  start-page: 1
  issue: 1
  year: 2016
  ident: 10.1016/j.jmb.2024.168610_b0040
  article-title: Simultaneous profiling of transcriptome and dna methylome from a single cell
  publication-title: Genome Biol.
  doi: 10.1186/s13059-016-0950-z
– volume: 13
  start-page: 6118
  issue: 1
  year: 2022
  ident: 10.1016/j.jmb.2024.168610_b0115
  article-title: Online single-cell data integration through projecting heterogeneous datasets into a common cell-embedding space
  publication-title: Nat. Commun.
  doi: 10.1038/s41467-022-33758-z
– volume: 12
  start-page: 883
  issue: 10
  year: 2016
  ident: 10.1016/j.jmb.2024.168610_b0080
  article-title: Gene-specific correlation of rna and protein levels in human cells and tissues
  publication-title: Mol. Syst. Biol.
  doi: 10.15252/msb.20167144
– volume: 3
  issue: 1
  year: 2023
  ident: 10.1016/j.jmb.2024.168610_b0095
  article-title: Graph embedding and gaussian mixture variational autoencoder network for end-to-end analysis of single-cell rna sequencing data
  publication-title: Cell Rep. Methods
– volume: 9
  start-page: 171
  issue: 1
  year: 2014
  ident: 10.1016/j.jmb.2024.168610_b0035
  article-title: Full-length rna-seq from single cells using smart-seq2
  publication-title: Nat. Protoc.
  doi: 10.1038/nprot.2014.006
– volume: 35
  start-page: 936
  issue: 10
  year: 2017
  ident: 10.1016/j.jmb.2024.168610_b0065
  article-title: Multiplexed quantification of proteins and transcripts in single cells
  publication-title: Nat. Biotechnol.
  doi: 10.1038/nbt.3973
– year: 2017
  ident: 10.1016/j.jmb.2024.168610_b0170
  article-title: Attention is all you need
  publication-title: Adv. Neural Inf. Process. Syst.(NeurIPS)
– year: 2021
  ident: 10.1016/j.jmb.2024.168610_b0205
  article-title: Diffusion models beat gans on image synthesis
– volume: 6
  start-page: 377
  issue: 5
  year: 2009
  ident: 10.1016/j.jmb.2024.168610_b0010
  article-title: mrna-seq whole-transcriptome analysis of a single cell
  publication-title: Nat. Methods
  doi: 10.1038/nmeth.1315
– volume: 13
  start-page: 7705
  issue: 1
  year: 2022
  ident: 10.1016/j.jmb.2024.168610_b0100
  article-title: Clustering of single-cell multi-omics data with a multimodal deep learning method
  publication-title: Nat. Commun.
  doi: 10.1038/s41467-022-35031-9
StartPage 168610
SubjectTerms Algorithms
CITE-seq technology
Computational Biology - methods
data collection
deep learning
drugs
epitopes
Gaussian noise
Gene Expression Profiling - methods
generating predictions
Humans
Membrane Proteins - genetics
Membrane Proteins - metabolism
molecular biology
multiomics
neoplasm cells
prediction
protein synthesis
RNA
Single-Cell Analysis - methods
surface proteins
Technical noise
therapeutics
Transcriptome
Title scDM: A deep generative method for cell surface protein prediction with diffusion model
URI https://dx.doi.org/10.1016/j.jmb.2024.168610
https://www.ncbi.nlm.nih.gov/pubmed/38754773
https://www.proquest.com/docview/3056668997
https://www.proquest.com/docview/3153784535
Volume 436