Understanding and Mitigating Overfitting in Prompt Tuning for Vision-Language Models

Pretrained vision-language models (VLMs) such as CLIP have shown impressive generalization capability in downstream vision tasks with appropriate text prompts. Instead of designing prompts manually, Context Optimization (CoOp) has recently been proposed to learn continuous prompts using task-specific...
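
The abstract describes CoOp-style prompt tuning: a small set of continuous context vectors is learned on task-specific data while CLIP's encoders stay frozen. The following is a minimal, self-contained PyTorch sketch of that idea; the PromptLearner class, the stand-in text encoder, and all sizes and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of CoOp-style continuous prompt learning (illustrative only).
# The stand-in text encoder and all dimensions are assumptions; the paper
# builds on CLIP's frozen encoders, which are not reproduced here.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PromptLearner(nn.Module):
    """Holds the learnable context vectors (the "continuous prompt")."""
    def __init__(self, n_ctx: int = 16, dim: int = 512, n_classes: int = 10):
        super().__init__()
        # Context vectors shared across classes; the only trainable parameters.
        self.ctx = nn.Parameter(0.02 * torch.randn(n_ctx, dim))
        # Frozen stand-in for the embedded class-name tokens.
        self.register_buffer("cls_emb", torch.randn(n_classes, 1, dim))

    def forward(self) -> torch.Tensor:
        n_classes = self.cls_emb.shape[0]
        ctx = self.ctx.unsqueeze(0).expand(n_classes, -1, -1)
        # Prepend the learned context to every class-name embedding.
        return torch.cat([ctx, self.cls_emb], dim=1)  # [n_classes, n_ctx + 1, dim]

# Stand-in for a frozen text encoder (CLIP's text transformer in the real setting).
text_encoder = nn.Sequential(nn.LayerNorm(512), nn.Linear(512, 512))
for p in text_encoder.parameters():
    p.requires_grad_(False)

prompt_learner = PromptLearner()
optimizer = torch.optim.SGD(prompt_learner.parameters(), lr=2e-3)

# One toy optimization step; real image features would come from CLIP's image encoder.
image_feats = F.normalize(torch.randn(32, 512), dim=-1)
labels = torch.randint(0, 10, (32,))

prompts = prompt_learner()                                 # [10, 17, 512]
text_feats = F.normalize(text_encoder(prompts).mean(dim=1), dim=-1)
logits = 100.0 * image_feats @ text_feats.t()              # scaled cosine similarities
loss = F.cross_entropy(logits, labels)
loss.backward()
optimizer.step()                                           # only the context vectors are updated
```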

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, Vol. 33, No. 9, p. 1
Main Authors: Ma, Chengcheng; Liu, Yang; Deng, Jiankang; Xie, Lingxi; Dong, Weiming; Xu, Changsheng
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.09.2023