MolGPT: Molecular Generation Using a Transformer-Decoder Model

Bibliographic Details
Published in Journal of Chemical Information and Modeling, Vol. 62, no. 9, pp. 2064-2076
Main Authors Bagal, Viraj; Aggarwal, Rishal; Vinod, P. K.; Priyakumar, U. Deva
Format Journal Article
Language English
Published United States: American Chemical Society, 9 May 2022
ISSN 1549-9596
EISSN 1549-960X
DOI 10.1021/acs.jcim.1c00600

Abstract Application of deep learning techniques for generation of molecules, termed as inverse molecular design, has been gaining enormous traction in drug design. The representation of molecules in SMILES notation as a string of characters enables the usage of state of the art models in natural language processing, such as Transformers, for molecular design in general. Inspired by generative pre-training (GPT) models that have been shown to be successful in generating meaningful text, we train a transformer-decoder on the next token prediction task using masked self-attention for the generation of druglike molecules in this study. We show that our model, MolGPT, performs on par with other previously proposed modern machine learning frameworks for molecular generation in terms of generating valid, unique, and novel molecules. Furthermore, we demonstrate that the model can be trained conditionally to control multiple properties of the generated molecules. We also show that the model can be used to generate molecules with desired scaffolds as well as desired molecular properties by conditioning the generation on scaffold SMILES strings of desired scaffolds and property values. Using saliency maps, we highlight the interpretability of the generative process of the model.
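The abstract compares models by how many generated molecules are valid, unique, and novel. A minimal sketch of how such metrics are commonly computed is given below. The definitions used here (validity over all generated strings, uniqueness over valid strings, novelty over unique valid strings not seen in training) are an assumption, since this record does not state the paper's exact formulas. The function names and the `balanced_parentheses` validity check are hypothetical stand-ins; a real evaluation would parse each SMILES string with a cheminformatics toolkit and compare canonical forms rather than raw strings.

```python
from typing import Callable, Dict, Iterable


def generation_metrics(
    generated: Iterable[str],
    training_set: Iterable[str],
    is_valid: Callable[[str], bool],
) -> Dict[str, float]:
    """Return validity, uniqueness and novelty as fractions in [0, 1].

    Assumed definitions (not taken from the record):
      validity   = valid strings / all generated strings
      uniqueness = distinct valid strings / valid strings
      novelty    = distinct valid strings absent from training / distinct valid strings
    Strings are compared literally; no canonicalization is performed.
    """
    generated = list(generated)
    valid = [s for s in generated if is_valid(s)]
    unique = set(valid)
    novel = unique - set(training_set)
    return {
        "validity": len(valid) / len(generated) if generated else 0.0,
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        "novelty": len(novel) / len(unique) if unique else 0.0,
    }


def balanced_parentheses(smiles: str) -> bool:
    """Toy validity check (hypothetical): non-empty with balanced branch parentheses.

    This is NOT a chemical validity test; it only illustrates the interface.
    """
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0 and len(smiles) > 0


if __name__ == "__main__":
    samples = ["CCO", "CCO", "CC(C", "c1ccccc1", "CC(=O)O"]
    print(generation_metrics(samples, {"CCO"}, balanced_parentheses))
```

Passing the validity check in as a function keeps the metric arithmetic separate from the chemistry, so the toy predicate can be replaced by a proper SMILES parser without touching the rest.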
Author Affiliations
Bagal, Viraj: International Institute of Information Technology, Hyderabad 500 032, India; Indian Institute of Science Education and Research, Pune 411 008, India
Aggarwal, Rishal: International Institute of Information Technology, Hyderabad 500 032, India
Vinod, P. K.: International Institute of Information Technology, Hyderabad 500 032, India
Priyakumar, U. Deva (ORCID 0000-0001-7114-3955): International Institute of Information Technology, Hyderabad 500 032, India
PubMed Record https://www.ncbi.nlm.nih.gov/pubmed/34694798
Copyright American Chemical Society, May 9, 2022
Subjects Deep learning; Drug Design; Machine Learning; Natural language processing; Property values; Scaffolds; Strings; Transformers