Transformers in medical image segmentation: a narrative review

Bibliographic Details
Published in Quantitative imaging in medicine and surgery Vol. 13; no. 12; pp. 8747-8767
Main Authors Khan, Rabeea Fatma, Lee, Byoung-Dai, Lee, Mu Sook
Format Journal Article
Language English
Published China 01.12.2023
Subjects
Online Access Get full text

Abstract Transformers, which have been widely recognized as state-of-the-art tools in natural language processing (NLP), have also come to be recognized for their value in computer vision tasks. With this increasing popularity, they have also been extensively researched in the more complex medical imaging domain. The associated developments have resulted in transformers being on par with sought-after convolutional neural networks, particularly for medical image segmentation. Methods combining both types of networks have proven to be especially successful in capturing local and global contexts, thereby significantly boosting their performance in various segmentation problems. Motivated by this success, we have attempted to survey the consequential research focused on innovative transformer networks, specifically those designed to cater to medical image segmentation in an efficient manner. Databases like Google Scholar, arXiv, ResearchGate, Microsoft Academic, and Semantic Scholar have been utilized to find recent developments in this field. Specifically, research in the English language from 2021 to 2023 was considered. In this survey, we look into the different types of architectures and attention mechanisms that uniquely improve performance and the structures that are in place to handle complex medical data. Through this survey, we summarize the popular and unconventional transformer-based research as seen through different key angles and analyze quantitatively the strategies that have proven more advanced. We have also attempted to discern existing gaps and challenges within current research, notably highlighting the deficiency of annotated medical data for precise deep learning model training. Furthermore, potential future directions for enhancing transformers' utility in healthcare are outlined, encompassing strategies such as transfer learning and exploiting foundation models for specialized medical image segmentation.
AbstractList Background and Objective: Transformers, which have been widely recognized as state-of-the-art tools in natural language processing (NLP), have also come to be recognized for their value in computer vision tasks. With this increasing popularity, they have also been extensively researched in the more complex medical imaging domain. The associated developments have resulted in transformers being on par with sought-after convolutional neural networks, particularly for medical image segmentation. Methods combining both types of networks have proven to be especially successful in capturing local and global contexts, thereby significantly boosting their performance in various segmentation problems. Motivated by this success, we have attempted to survey the consequential research focused on innovative transformer networks, specifically those designed to cater to medical image segmentation in an efficient manner.
Methods: Databases like Google Scholar, arXiv, ResearchGate, Microsoft Academic, and Semantic Scholar have been utilized to find recent developments in this field. Specifically, research in the English language from 2021 to 2023 was considered.
Key Content and Findings: In this survey, we look into the different types of architectures and attention mechanisms that uniquely improve performance and the structures that are in place to handle complex medical data. Through this survey, we summarize the popular and unconventional transformer-based research as seen through different key angles and analyze quantitatively the strategies that have proven more advanced.
Conclusions: We have also attempted to discern existing gaps and challenges within current research, notably highlighting the deficiency of annotated medical data for precise deep learning model training. Furthermore, potential future directions for enhancing transformers' utility in healthcare are outlined, encompassing strategies such as transfer learning and exploiting foundation models for specialized medical image segmentation.
Author Khan, Rabeea Fatma
Lee, Byoung-Dai
Lee, Mu Sook
Author_xml – sequence: 1
  givenname: Rabeea Fatma
  surname: Khan
  fullname: Khan, Rabeea Fatma
– sequence: 2
  givenname: Byoung-Dai
  surname: Lee
  fullname: Lee, Byoung-Dai
– sequence: 3
  givenname: Mu Sook
  surname: Lee
  fullname: Lee, Mu Sook
BackLink https://www.ncbi.nlm.nih.gov/pubmed/38106306 (View this record in MEDLINE/PubMed)
CitedBy_id crossref_primary_10_1002_mp_17509
crossref_primary_10_7717_peerj_cs_2506
crossref_primary_10_1007_s12672_025_01896_7
crossref_primary_10_1007_s40290_024_00515_0
crossref_primary_10_2196_57723
crossref_primary_10_1007_s00521_024_10956_y
crossref_primary_10_1016_j_nima_2025_170306
crossref_primary_10_1016_j_bspc_2025_107510
ContentType Journal Article
Copyright 2023 Quantitative Imaging in Medicine and Surgery. All rights reserved.
Copyright_xml – notice: 2023 Quantitative Imaging in Medicine and Surgery. All rights reserved.
DBID AAYXX
CITATION
NPM
7X8
DOI 10.21037/qims-23-542
DatabaseName CrossRef
PubMed
MEDLINE - Academic
DatabaseTitle CrossRef
PubMed
MEDLINE - Academic
DatabaseTitleList PubMed
MEDLINE - Academic
Database_xml – sequence: 1
  dbid: NPM
  name: PubMed
  url: https://proxy.k.utb.cz/login?url=http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed
  sourceTypes: Index Database
DeliveryMethod fulltext_linktorsrc
Discipline Medicine
EISSN 2223-4306
EndPage 8767
ExternalDocumentID 38106306
10_21037_qims_23_542
Genre Journal Article
Review
GroupedDBID 53G
AAKDD
AAYXX
ALMA_UNASSIGNED_HOLDINGS
CITATION
DIK
HYE
OK1
RPM
NPM
7X8
ID FETCH-LOGICAL-c329t-f3eb56a62f1ddd17d30e0db55fd00b5df20778edbd8950a705131a90ca4edc183
ISSN 2223-4292
IngestDate Fri Jul 11 02:21:38 EDT 2025
Thu Apr 03 07:01:57 EDT 2025
Thu Apr 24 23:05:17 EDT 2025
Tue Jul 01 02:30:45 EDT 2025
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 12
Keywords deep learning
Transformers
image segmentation
artificial intelligence (AI)
medical imaging
Language English
License 2023 Quantitative Imaging in Medicine and Surgery. All rights reserved.
LinkModel OpenURL
MergedId FETCHMERGED-LOGICAL-c329t-f3eb56a62f1ddd17d30e0db55fd00b5df20778edbd8950a705131a90ca4edc183
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
ObjectType-Review-3
content type line 23
OpenAccessLink https://qims.amegroups.org/article/viewFile/117952/pdf
PMID 38106306
PQID 2903327959
PQPubID 23479
PageCount 21
ParticipantIDs proquest_miscellaneous_2903327959
pubmed_primary_38106306
crossref_citationtrail_10_21037_qims_23_542
crossref_primary_10_21037_qims_23_542
ProviderPackageCode CITATION
AAYXX
PublicationCentury 2000
PublicationDate 2023-12-01
PublicationDateYYYYMMDD 2023-12-01
PublicationDate_xml – month: 12
  year: 2023
  text: 2023-12-01
  day: 01
PublicationDecade 2020
PublicationPlace China
PublicationPlace_xml – name: China
PublicationTitle Quantitative imaging in medicine and surgery
PublicationTitleAlternate Quant Imaging Med Surg
PublicationYear 2023
SSID ssj0000781710
Score 2.4096806
SecondaryResourceType review_article
Snippet Transformers, which have been widely recognized as state-of-the-art tools in natural language processing (NLP), have also come to be recognized for their value...
SourceID proquest
pubmed
crossref
SourceType Aggregation Database
Index Database
Enrichment Source
StartPage 8747
Title Transformers in medical image segmentation: a narrative review
URI https://www.ncbi.nlm.nih.gov/pubmed/38106306
https://www.proquest.com/docview/2903327959
Volume 13
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider National Library of Medicine