Understanding and Mitigating Overfitting in Prompt Tuning for Vision-Language Models
Published in: IEEE Transactions on Circuits and Systems for Video Technology, Vol. 33, No. 9, p. 1
Main Authors: Chengcheng Ma, Yang Liu, Jiankang Deng, Lingxi Xie, Weiming Dong, Changsheng Xu
Format: Journal Article
Language: English
Published: New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.09.2023
Subjects: Back propagation; Data models; Eigenvectors; Gradient flow; gradient projection; Image segmentation; Object recognition; Optimization; overfitting; prompt tuning; Semantic segmentation; subspace learning; Task analysis; Training; Training data; Tuning; Vision-language model; Visualization
Online Access: https://ieeexplore.ieee.org/document/10045664 ; https://www.proquest.com/docview/2861467989
Abstract: Pretrained vision-language models (VLMs) such as CLIP have shown impressive generalization capability on downstream vision tasks when given appropriate text prompts. Instead of designing prompts manually, Context Optimization (CoOp) has recently been proposed to learn continuous prompts from task-specific training data. Despite the performance improvements on downstream tasks, several studies have reported that CoOp suffers from overfitting in two respects: (i) the test accuracy on base classes first improves and then worsens during training; (ii) the test accuracy on novel classes keeps decreasing. However, none of the existing studies explains or mitigates this overfitting problem. In this study, we first explore its cause by analyzing the gradient flow. Comparative experiments reveal that CoOp favors generalizable features in the early training stage and spurious features in the later stage, leading to the non-overfitting and overfitting phenomena, respectively. Given these observations, we propose Subspace Prompt Tuning (SubPT), which projects the back-propagated gradients onto the low-rank subspace spanned by the eigenvectors of the early-stage gradient flow throughout training, and successfully eliminates the overfitting problem. In addition, we equip CoOp with a Novel Feature Learner (NFL) to enhance the generalization of the learned prompts to novel categories beyond the training set, without requiring any image training data. Extensive experiments on 11 classification datasets demonstrate that SubPT+NFL consistently boosts the performance of CoOp and outperforms the state-of-the-art CoCoOp approach. Experiments on more challenging downstream vision tasks, including open-vocabulary object detection and zero-shot semantic segmentation, further verify the effectiveness of the proposed method. Code is available at https://tinyurl.com/mpe64f89.
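For readers who want to see the mechanism described above in concrete form, the following is a minimal sketch of SubPT-style gradient projection: estimate a low-rank basis from gradients recorded during the early, non-overfitting stage, then project every later gradient onto that basis before the optimizer step. The class name, the rank `k`, and the SVD-based subspace estimation are illustrative assumptions, not the paper's API; the authors' actual implementation is at the code link in the abstract.

```python
import numpy as np

# Sketch of gradient projection onto a low-rank subspace spanned by
# early-stage gradient directions. Names and defaults are hypothetical.
class SubspaceProjector:
    def __init__(self, k=4):
        self.k = k            # subspace rank (illustrative default)
        self.history = []     # flattened gradients from the early stage
        self.basis = None     # (d, k) orthonormal basis, set by fit()

    def record(self, grad):
        """Collect a gradient during the early, non-overfitting stage."""
        self.history.append(np.asarray(grad).ravel().copy())

    def fit(self):
        """The top-k right singular vectors of the stacked gradient matrix
        span the dominant directions of the early-stage gradient flow."""
        G = np.stack(self.history)                    # (n_steps, d)
        _, _, vt = np.linalg.svd(G, full_matrices=False)
        self.basis = vt[: self.k].T                   # (d, k)

    def project(self, grad):
        """Replace a later-stage gradient g with V V^T g before the
        optimizer step, suppressing directions outside the subspace."""
        g = np.asarray(grad)
        flat = self.basis @ (self.basis.T @ g.ravel())
        return flat.reshape(g.shape)
```

A typical usage pattern would be: call `record()` on the flattened prompt gradient for the first few epochs, call `fit()` once, and from then on pass every gradient through `project()` before updating the prompt vectors.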
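The abstract does not spell out NFL's objective, only that it improves novel-class generalization without any image training data. Purely as a hedged guess at what such a text-only regularizer could look like, the sketch below anchors the learned-prompt text features to the frozen hand-crafted-prompt features over a class vocabulary; `encode_text_with_prompt` is a hypothetical helper around CLIP's text encoder, not an API from the paper or from CLIP.

```python
import torch
import torch.nn.functional as F

# Hypothetical text-only regularizer in the spirit of NFL (not the paper's
# verified objective): keep learned-prompt text features close to the
# features of a frozen hand-crafted prompt such as "a photo of a {}",
# so the learned context still transfers to unseen class names.
def nfl_loss(encode_text_with_prompt, learned_ctx, class_names):
    with torch.no_grad():  # targets come from the frozen zero-shot prompt
        target = encode_text_with_prompt(None, class_names)
    pred = encode_text_with_prompt(learned_ctx, class_names)
    # Cosine distance between learned and hand-crafted text features.
    return 1.0 - F.cosine_similarity(pred, target, dim=-1).mean()
```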
Authors and Affiliations:
1. Chengcheng Ma (ORCID 0000-0002-0502-3960), National Lab of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences (CASIA), Beijing, China
2. Yang Liu, Alibaba DAMO Academy, Hangzhou, China
3. Jiankang Deng (ORCID 0000-0002-3709-6216), Huawei Inc., Shenzhen, China
4. Lingxi Xie (ORCID 0000-0003-4831-9451), Huawei Inc., Shenzhen, China
5. Weiming Dong (ORCID 0000-0001-6502-145X), National Lab of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences (CASIA), Beijing, China
6. Changsheng Xu (ORCID 0000-0001-8343-9665), National Lab of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences (CASIA), Beijing, China
CODEN: ITCTEM
Copyright: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023
DOI: 10.1109/TCSVT.2023.3245584
EISSN: 1558-2205
Funding: National Natural Science Foundation of China (Grants 61832016 and U20B2070)
ISSN: 1051-8215