Are vision transformers replacing convolutional neural networks in scene interpretation?: A review
Visual scene interpretation is the demanding process of observing, exploring, and elaborating dynamic scenes. It underpins reliable and safe interaction with the natural world and the surrounding environment. Cutting-edge computer vision technology plays a key role in enabling communication...
Published in | Discover Applied Sciences, Vol. 7, No. 9, Article 932 (21 pages) |
Main Authors | Rosy, N. Arockia; Balasubadra, K.; Deepa, K. |
Format | Journal Article |
Language | English |
Published | Cham: Springer International Publishing, 01.09.2025 |
Abstract | Visual scene interpretation is the demanding process of observing, exploring, and elaborating dynamic scenes. It underpins reliable and safe interaction with the natural world and the surrounding environment. Cutting-edge computer vision technology plays a key role here, enabling machines to understand visual scenes much as humans do. Technical advances in computer vision have been overwhelmingly successful, driven primarily by deep learning algorithms. Recently, Vision Transformers (ViTs) have emerged as a viable alternative to Convolutional Neural Networks (CNNs). Powered by an attention mechanism, ViT-based approaches have demonstrated performance competitive with or superior to CNNs in several benchmark scene interpretation tasks. This review presents a comprehensive and methodical analysis of recent developments in CNN- and ViT-based models for scene recognition. A total of 142 peer-reviewed studies published between 2017 and 2024 were selected according to defined inclusion criteria, focusing on works that evaluate these models on public datasets. The review begins with an overview of the architectural foundations and functional variations of CNNs used for scene interpretation. Next, it explores the structure of ViTs, including their multi-head self-attention mechanisms, and assesses state-of-the-art ViT variants with respect to design innovations, training strategies, and performance metrics. Finally, we discuss possible future research directions for designing ViT models. This study can therefore serve as a reference for scholars and practitioners developing new ViT architectures in this domain. |
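The two ViT ingredients the abstract highlights — linear patch embedding and multi-head self-attention — can be sketched in a few lines. The sketch below is a minimal illustration, assuming random (untrained) projection weights and illustrative shapes (a 224×224×3 input, 16×16 patches, 64-dimensional tokens, 4 heads); it is not code from the reviewed article or from any surveyed model.

```python
# Minimal sketch of the ViT ingredients named in the abstract: an image is
# split into patches, each patch is linearly embedded into a token, and the
# tokens interact through multi-head self-attention. All shapes, names, and
# the random projection weights here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def patch_embed(image, patch=16, dim=64):
    """Split an HxWxC image into non-overlapping patches and linearly
    project each flattened patch to a `dim`-dimensional token."""
    h, w, c = image.shape
    patches = image.reshape(h // patch, patch, w // patch, patch, c)
    patches = patches.transpose(0, 2, 1, 3, 4).reshape(-1, patch * patch * c)
    proj = rng.normal(size=(patch * patch * c, dim)) / np.sqrt(patch * patch * c)
    return patches @ proj  # (num_tokens, dim)

def multi_head_self_attention(x, heads=4):
    """Scaled dot-product self-attention, split across `heads` subspaces."""
    n, d = x.shape
    dh = d // heads
    wq, wk, wv = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
    q, k, v = x @ wq, x @ wk, x @ wv
    outs = []
    for h in range(heads):
        qh, kh, vh = (m[:, h * dh:(h + 1) * dh] for m in (q, k, v))
        scores = qh @ kh.T / np.sqrt(dh)            # (n, n) attention logits
        attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
        attn /= attn.sum(axis=-1, keepdims=True)    # softmax over tokens
        outs.append(attn @ vh)                      # per-head output (n, dh)
    return np.concatenate(outs, axis=-1)            # (n, d)

tokens = patch_embed(rng.normal(size=(224, 224, 3)))  # 14x14 = 196 tokens
print(multi_head_self_attention(tokens).shape)         # -> (196, 64)
```

Splitting attention across heads lets each subspace attend to different token relations; concatenating the per-head outputs restores the full embedding width.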
ArticleNumber | 932 |
Author | Balasubadra, K.; Rosy, N. Arockia; Deepa, K. |
Author_xml | – sequence: 1 givenname: N. Arockia surname: Rosy fullname: Rosy, N. Arockia organization: Department of Information Technology, R.M.D. Engineering College
– sequence: 2 givenname: K. surname: Balasubadra fullname: Balasubadra, K. organization: Department of Information Technology, R.M.D. Engineering College
– sequence: 3 givenname: K. surname: Deepa fullname: Deepa, K. email: kdeepa@kiu.ac.ug organization: Department of Civil Engineering, School of Applied Science, Kampala International University |
ContentType | Journal Article |
Copyright | The Author(s) 2025 The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
DOI | 10.1007/s42452-025-07574-1 |
Discipline | Engineering; Architecture |
EISSN | 3004-9261; 2523-3971 |
EndPage | 21 |
ISSN | 3004-9261; 2523-3963 |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 9 |
Keywords | Linear embedding; Multi-head attention; Scene interpretation; Convolutional neural networks; Vision transformers |
Language | English |
PageCount | 21 |
PublicationCentury | 2000 |
PublicationDate | 2025-09-01 |
PublicationDateYYYYMMDD | 2025-09-01 |
PublicationDate_xml | – month: 09 year: 2025 text: 2025-09-01 day: 01 |
PublicationDecade | 2020 |
PublicationPlace | Cham |
PublicationPlace_xml | – name: Cham – name: London |
PublicationTitle | Discover Applied Sciences |
PublicationTitleAbbrev | Discov Appl Sci |
PublicationYear | 2025 |
Publisher | Springer International Publishing; Springer Nature B.V; Springer |
SecondaryResourceType | review_article |
SourceID | doaj; proquest; crossref; springer |
SourceType | Open Website; Aggregation Database; Index Database; Publisher |
StartPage | 932 |
SubjectTerms | Applied and Technical Physics; Architecture; Artificial neural networks; Attention; Chemistry/Food Science; Computer vision; Convolutional neural networks; Datasets; Deep learning; Earth Sciences; Engineering; Environment; Linear embedding; Machine learning; Materials Science; Multi-head attention; Multimedia; Neural networks; Performance measurement; Recognition; Review; Scene interpretation; Semantics; Vision transformers; Visual observation |
Title | Are vision transformers replacing convolutional neural networks in scene interpretation?: A review |
URI | https://link.springer.com/article/10.1007/s42452-025-07574-1 https://www.proquest.com/docview/3239932882 https://doaj.org/article/85a4aca86ed74efc91bfa874347044d3 |
Volume | 7 |