HiFuse: Hierarchical multi-scale feature fusion network for medical image classification
Published in | Biomedical Signal Processing and Control, Vol. 87, p. 105534
Main Authors | Xiangzuo Huo; Gang Sun; Shengwei Tian; Yan Wang; Long Yu; Jun Long; Wendong Zhang; Aolun Li
Format | Journal Article
Language | English
Published | Elsevier Ltd, 01.01.2024
Subjects | Feature fusion; Hybrid network; Medical image classification; Multi-scale feature; Swin-Transformer
Abstract | Effective fusion of global and local multi-scale features is crucial for medical image classification. Medical images contain many noisy, scattered features, large intra-class variations, and strong inter-class similarities. Many studies have shown that combining global and local features helps reduce noise interference in medical images. Convolutions struggle to capture global image features because the receptive field of a convolution kernel is fixed in size; the self-attention-based Transformer can model long-range dependencies, but it has high computational complexity and lacks a local inductive bias. In this paper, we propose a three-branch hierarchical multi-scale feature fusion network, termed HiFuse, which fuses multi-scale global and local features without disrupting either branch's modeling, thereby improving classification accuracy across a variety of medical images. The design has two key characteristics: (i) a parallel hierarchical structure consisting of global and local feature blocks; and (ii) an adaptive hierarchical feature fusion block (HFF block) together with an inverted residual multi-layer perceptron (IRMLP). The advantage of this structure is that the resulting representation is semantically richer, and local features and global representations can be extracted effectively at different semantic scales. Our model reaches ACC and F1 values of 85.85% and 75.32% on the ISIC2018 dataset, 86.12% and 86.13% on the Kvasir dataset, 76.88% and 76.31% on the Covid-19 dataset, and 92.31% and 88.81% on the esophageal cancer pathology dataset, performing best among the compared advanced models. Our code is open source and available at https://github.com/huoxiangzuo/HiFuse.
• Proposed a novel three-branch hierarchical multi-scale feature fusion network structure.
• Designed global and local feature blocks with CNN and self-attention, fused by an HFF block.
• Effective fusion of global and local multi-scale features is crucial for medical image classification.
• Validated the performance on four medical image datasets.
• This work can contribute to various downstream tasks in medical imaging.
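The abstract describes a parallel local (convolutional) branch, a global (self-attention) branch, and an adaptive HFF fusion path with an IRMLP. Below is a minimal, hypothetical PyTorch sketch of that three-branch pattern, not the paper's actual architecture: the class names (`TinyHiFuseStage`, `HFFBlock`, the `IRMLP` internals), the channel-gating fusion, and all hyperparameters are illustrative assumptions; consult the authors' repository for the real design.

```python
# Minimal PyTorch sketch of the three-branch idea described above: a local CNN
# branch, a global self-attention branch, and an adaptive hierarchical feature
# fusion (HFF) path with an inverted residual MLP (IRMLP). This is NOT the
# authors' released implementation; module names, channel sizes, and the exact
# gating/fusion details are simplifying assumptions. The official code is at
# https://github.com/huoxiangzuo/HiFuse.
import torch
import torch.nn as nn


class IRMLP(nn.Module):
    """Inverted residual MLP: expand channels, depthwise mixing, project back, add residual."""
    def __init__(self, dim: int, expansion: int = 4):
        super().__init__()
        hidden = dim * expansion
        self.net = nn.Sequential(
            nn.Conv2d(dim, hidden, 1), nn.GELU(),
            nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden), nn.GELU(),
            nn.Conv2d(hidden, dim, 1),
        )

    def forward(self, x):
        return x + self.net(x)


class LocalBlock(nn.Module):
    """Local branch: depthwise + pointwise convolutions supply local inductive bias."""
    def __init__(self, dim: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(dim, dim, 3, padding=1, groups=dim), nn.BatchNorm2d(dim),
            nn.Conv2d(dim, dim, 1), nn.GELU(),
        )

    def forward(self, x):
        return x + self.conv(x)


class GlobalBlock(nn.Module):
    """Global branch: multi-head self-attention over all spatial tokens."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):
        b, c, h, w = x.shape
        t = x.flatten(2).transpose(1, 2)            # (B, H*W, C) token sequence
        q = self.norm(t)
        y, _ = self.attn(q, q, q)
        return (t + y).transpose(1, 2).reshape(b, c, h, w)


class HFFBlock(nn.Module):
    """Adaptive fusion of the two branches via a learned channel gate, then an IRMLP."""
    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(2 * dim, dim, 1), nn.Sigmoid(),
        )
        self.irmlp = IRMLP(dim)

    def forward(self, local_feat, global_feat):
        g = self.gate(torch.cat([local_feat, global_feat], dim=1))  # (B, C, 1, 1)
        return self.irmlp(g * local_feat + (1.0 - g) * global_feat)


class TinyHiFuseStage(nn.Module):
    """One hierarchical stage: parallel local/global blocks feeding the HFF fusion path."""
    def __init__(self, dim: int):
        super().__init__()
        self.local, self.globl, self.hff = LocalBlock(dim), GlobalBlock(dim), HFFBlock(dim)

    def forward(self, x, fused_prev=None):
        l, g = self.local(x), self.globl(x)
        fused = self.hff(l, g)
        if fused_prev is not None:                   # carry the fusion path across stages
            fused = fused + fused_prev
        return l + g, fused                          # (next-stage features, fusion output)


if __name__ == "__main__":
    stage = TinyHiFuseStage(dim=32)
    x = torch.randn(2, 32, 28, 28)                   # toy feature map from an earlier stem
    nxt, fused = stage(x)
    print(nxt.shape, fused.shape)                    # both torch.Size([2, 32, 28, 28])
```

Stacking several such stages with downsampling between them, and classifying from the final fusion-path output, would mirror the hierarchical, multi-scale layout the abstract outlines; the gating shown here is only one plausible way to realize "adaptive" fusion.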
ArticleNumber | 105534 |
Author | Huo, Xiangzuo; Sun, Gang; Tian, Shengwei; Wang, Yan; Yu, Long; Long, Jun; Zhang, Wendong; Li, Aolun
Author_xml | – sequence: 1; givenname: Xiangzuo; orcidid: 0000-0001-5691-4432; surname: Huo; fullname: Huo, Xiangzuo; email: huoxiangzuo@163.com; organization: School of Information Science and Engineering, Xinjiang University, Urumqi, 830000, Xinjiang, China
– sequence: 2; givenname: Gang; surname: Sun; fullname: Sun, Gang; email: sung853219@163.com; organization: Department of Breast and Thyroid Surgery, The Affiliated Tumour Hospital of Xinjiang Medical University, Urumqi, 830011, Xinjiang, China
– sequence: 3; givenname: Shengwei; orcidid: 0000-0003-3525-5102; surname: Tian; fullname: Tian, Shengwei; email: tianshengwei@163.com; organization: School of Information Science and Engineering, Xinjiang University, Urumqi, 830000, Xinjiang, China
– sequence: 4; givenname: Yan; surname: Wang; fullname: Wang, Yan; email: xjwangyan2012@163.com; organization: Department of Breast and Thyroid Surgery, The Affiliated Tumour Hospital of Xinjiang Medical University, Urumqi, 830011, Xinjiang, China
– sequence: 5; givenname: Long; surname: Yu; fullname: Yu, Long; organization: School of Information Science and Engineering, Xinjiang University, Urumqi, 830000, Xinjiang, China
– sequence: 6; givenname: Jun; surname: Long; fullname: Long, Jun; organization: Big Data Institute, Central South University, Changsha, 410083, Hunan, China
– sequence: 7; givenname: Wendong; surname: Zhang; fullname: Zhang, Wendong; organization: School of Information Science and Engineering, Xinjiang University, Urumqi, 830000, Xinjiang, China
– sequence: 8; givenname: Aolun; orcidid: 0000-0003-4439-4331; surname: Li; fullname: Li, Aolun; organization: School of Information Science and Engineering, Xinjiang University, Urumqi, 830000, Xinjiang, China
ContentType | Journal Article |
Copyright | 2023 Elsevier Ltd |
DOI | 10.1016/j.bspc.2023.105534 |
DatabaseName | CrossRef |
DatabaseTitle | CrossRef |
Discipline | Engineering |
EISSN | 1746-8108 |
ExternalDocumentID | 10_1016_j_bspc_2023_105534 S1746809423009679 |
ISSN | 1746-8094 |
IsPeerReviewed | true |
IsScholarly | true |
Keywords | Feature fusion; Hybrid network; Medical image classification; Swin-Transformer; Multi-scale feature
Language | English |
ORCID | 0000-0001-5691-4432 0000-0003-3525-5102 0000-0003-4439-4331 |
ParticipantIDs | crossref_primary_10_1016_j_bspc_2023_105534 crossref_citationtrail_10_1016_j_bspc_2023_105534 elsevier_sciencedirect_doi_10_1016_j_bspc_2023_105534 |
PublicationCentury | 2000 |
PublicationDate | January 2024
PublicationDateYYYYMMDD | 2024-01-01 |
PublicationDate_xml | – month: 01 year: 2024 text: January 2024 |
PublicationDecade | 2020 |
PublicationTitle | Biomedical signal processing and control |
PublicationYear | 2024 |
Publisher | Elsevier Ltd |
SourceID | crossref elsevier |
SourceType | Enrichment Source Index Database Publisher |
StartPage | 105534 |
SubjectTerms | Feature fusion; Hybrid network; Medical image classification; Multi-scale feature; Swin-Transformer
Title | HiFuse: Hierarchical multi-scale feature fusion network for medical image classification |
URI | https://dx.doi.org/10.1016/j.bspc.2023.105534 |
Volume | 87 |