Transformers in medical image segmentation: a narrative review

Bibliographic Details
Published in Quantitative imaging in medicine and surgery Vol. 13; no. 12; pp. 8747-8767
Main Authors Khan, Rabeea Fatma, Lee, Byoung-Dai, Lee, Mu Sook
Format Journal Article
Language English
Published China 01.12.2023
Subjects
Online Access Get full text

Abstract Transformers, which have been widely recognized as state-of-the-art tools in natural language processing (NLP), have also come to be recognized for their value in computer vision tasks. With this increasing popularity, they have also been extensively researched in the more complex medical imaging domain. The associated developments have resulted in transformers being on par with sought-after convolutional neural networks, particularly for medical image segmentation. Methods combining both types of networks have proven to be especially successful in capturing local and global contexts, thereby significantly boosting their performance in various segmentation problems. Motivated by this success, we have attempted to survey the consequential research focused on innovative transformer networks, specifically those designed to cater to medical image segmentation in an efficient manner. Databases like Google Scholar, arXiv, ResearchGate, Microsoft Academic, and Semantic Scholar have been utilized to find recent developments in this field. Specifically, research in the English language from 2021 to 2023 was considered. In this survey, we look into the different types of architectures and attention mechanisms that uniquely improve performance and the structures that are in place to handle complex medical data. Through this survey, we summarize the popular and unconventional transformer-based research as seen through different key angles and analyze quantitatively the strategies that have proven more advanced. We have also attempted to discern existing gaps and challenges within current research, notably highlighting the deficiency of annotated medical data for precise deep learning model training. Furthermore, potential future directions for enhancing transformers' utility in healthcare are outlined, encompassing strategies such as transfer learning and exploiting foundation models for specialized medical image segmentation.
AbstractList Background and Objective: Transformers, which have been widely recognized as state-of-the-art tools in natural language processing (NLP), have also come to be recognized for their value in computer vision tasks. With this increasing popularity, they have also been extensively researched in the more complex medical imaging domain. The associated developments have resulted in transformers being on par with sought-after convolutional neural networks, particularly for medical image segmentation. Methods combining both types of networks have proven to be especially successful in capturing local and global contexts, thereby significantly boosting their performance in various segmentation problems. Motivated by this success, we have attempted to survey the consequential research focused on innovative transformer networks, specifically those designed to cater to medical image segmentation in an efficient manner.
Methods: Databases like Google Scholar, arXiv, ResearchGate, Microsoft Academic, and Semantic Scholar have been utilized to find recent developments in this field. Specifically, research in the English language from 2021 to 2023 was considered.
Key Content and Findings: In this survey, we look into the different types of architectures and attention mechanisms that uniquely improve performance and the structures that are in place to handle complex medical data. Through this survey, we summarize the popular and unconventional transformer-based research as seen through different key angles and analyze quantitatively the strategies that have proven more advanced.
Conclusions: We have also attempted to discern existing gaps and challenges within current research, notably highlighting the deficiency of annotated medical data for precise deep learning model training. Furthermore, potential future directions for enhancing transformers' utility in healthcare are outlined, encompassing strategies such as transfer learning and exploiting foundation models for specialized medical image segmentation.
Author Khan, Rabeea Fatma
Lee, Byoung-Dai
Lee, Mu Sook
Author_xml – sequence: 1
  givenname: Rabeea Fatma
  surname: Khan
  fullname: Khan, Rabeea Fatma
– sequence: 2
  givenname: Byoung-Dai
  surname: Lee
  fullname: Lee, Byoung-Dai
– sequence: 3
  givenname: Mu Sook
  surname: Lee
  fullname: Lee, Mu Sook
BackLink https://www.ncbi.nlm.nih.gov/pubmed/38106306 (View this record in MEDLINE/PubMed)
CitedBy_id crossref_primary_10_1002_mp_17509
crossref_primary_10_7717_peerj_cs_2506
crossref_primary_10_1007_s12672_025_01896_7
crossref_primary_10_1007_s40290_024_00515_0
crossref_primary_10_2196_57723
crossref_primary_10_1007_s00521_024_10956_y
crossref_primary_10_1016_j_nima_2025_170306
crossref_primary_10_1016_j_bspc_2025_107510
ContentType Journal Article
Copyright 2023 Quantitative Imaging in Medicine and Surgery. All rights reserved.
Copyright_xml – notice: 2023 Quantitative Imaging in Medicine and Surgery. All rights reserved.
DBID AAYXX
CITATION
NPM
7X8
DOI 10.21037/qims-23-542
DatabaseName CrossRef
PubMed
MEDLINE - Academic
DatabaseTitle CrossRef
PubMed
MEDLINE - Academic
DatabaseTitleList PubMed
MEDLINE - Academic
Database_xml – sequence: 1
  dbid: NPM
  name: PubMed
  url: https://proxy.k.utb.cz/login?url=http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed
  sourceTypes: Index Database
DeliveryMethod fulltext_linktorsrc
Discipline Medicine
EISSN 2223-4306
EndPage 8767
ExternalDocumentID 38106306
10_21037_qims_23_542
Genre Journal Article
Review
GroupedDBID 53G
AAKDD
AAYXX
ALMA_UNASSIGNED_HOLDINGS
CITATION
DIK
HYE
OK1
RPM
NPM
7X8
ID FETCH-LOGICAL-c329t-f3eb56a62f1ddd17d30e0db55fd00b5df20778edbd8950a705131a90ca4edc183
ISSN 2223-4292
IngestDate Fri Jul 11 02:21:38 EDT 2025
Thu Apr 03 07:01:57 EDT 2025
Thu Apr 24 23:05:17 EDT 2025
Tue Jul 01 02:30:45 EDT 2025
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 12
Keywords deep learning
Transformers
image segmentation
artificial intelligence (AI)
medical imaging
Language English
License 2023 Quantitative Imaging in Medicine and Surgery. All rights reserved.
LinkModel OpenURL
MergedId FETCHMERGED-LOGICAL-c329t-f3eb56a62f1ddd17d30e0db55fd00b5df20778edbd8950a705131a90ca4edc183
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
ObjectType-Review-3
content type line 23
OpenAccessLink https://qims.amegroups.org/article/viewFile/117952/pdf
PMID 38106306
PQID 2903327959
PQPubID 23479
PageCount 21
ParticipantIDs proquest_miscellaneous_2903327959
pubmed_primary_38106306
crossref_citationtrail_10_21037_qims_23_542
crossref_primary_10_21037_qims_23_542
ProviderPackageCode CITATION
AAYXX
PublicationCentury 2000
PublicationDate 2023-12-01
PublicationDateYYYYMMDD 2023-12-01
PublicationDate_xml – month: 12
  year: 2023
  text: 2023-12-01
  day: 01
PublicationDecade 2020
PublicationPlace China
PublicationPlace_xml – name: China
PublicationTitle Quantitative imaging in medicine and surgery
PublicationTitleAlternate Quant Imaging Med Surg
PublicationYear 2023
SSID ssj0000781710
Score 2.4096806
SecondaryResourceType review_article
Snippet Transformers, which have been widely recognized as state-of-the-art tools in natural language processing (NLP), have also come to be recognized for their value...
SourceID proquest
pubmed
crossref
SourceType Aggregation Database
Index Database
Enrichment Source
StartPage 8747
Title Transformers in medical image segmentation: a narrative review
URI https://www.ncbi.nlm.nih.gov/pubmed/38106306
https://www.proquest.com/docview/2903327959
Volume 13
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider National Library of Medicine