Prompt Distribution Learning

Bibliographic Details
Published in	Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) pp. 5196-5205
Main Authors	Lu, Yuning; Liu, Jianzhuang; Zhang, Yonggang; Liu, Yajing; Tian, Xinmei
Format	Conference Proceeding
Language	English
Published	IEEE 01.06.2022
Subjects	Adaptation models; Computer vision; Gaussian distribution; Pattern recognition; Task analysis; Training; Vision + language; Transfer/low-shot/long-tail learning; Visualization
ISSN	1063-6919
DOI	10.1109/CVPR52688.2022.00514

Summary:	We present prompt distribution learning, a method for adapting a pre-trained vision-language model to downstream recognition tasks. Our method not only learns low-bias prompts from a few samples but also captures the distribution of diverse prompts to handle varying visual representations, which provides high-quality task-related content for recognition. Prompt distribution learning is realized efficiently by learning the output embeddings of prompts rather than their input embeddings. This lets us model the output embeddings with a Gaussian distribution and derive a surrogate loss for efficient training. Extensive experiments on 12 datasets demonstrate that our method consistently and significantly outperforms existing methods. For example, with 1 sample per category, it achieves a 9.1% relative improvement in the average result over human-crafted prompts.
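
The summary's idea of modelling prompt output embeddings with a Gaussian and training on a surrogate loss can be illustrated with a small sketch. This is not the authors' implementation. It assumes a hypothetical array W holding the output embeddings of K prompts for C classes, and it uses one standard closed form: by Jensen's inequality and the Gaussian moment-generating function, the expected cross-entropy is bounded above by log sum_c exp(m_c + v_c / 2), where m_c and v_c are the mean and variance of the score difference between class c and the true class. The function name, array layout, and temperature parameter tau are all assumptions made for illustration.

```python
import numpy as np

def gaussian_surrogate_loss(f, W, y, tau=1.0):
    """Closed-form upper bound on the expected cross-entropy under a
    Gaussian model of the class weights (illustrative sketch only).

    f   : (d,) image feature
    W   : (K, C, d) output embeddings of K prompts for C classes
          (hypothetical layout, not taken from the paper)
    y   : int, index of the true class
    tau : temperature (assumed parameter)
    """
    # Per-prompt class scores: s[k, c] = W[k, c] . f / tau
    s = W @ f / tau                      # shape (K, C)
    # Score difference between each class and the true class
    diff = s - s[:, [y]]                 # shape (K, C), column y is all zeros
    # Gaussian parameters of each difference, estimated over the K prompts
    mean_diff = diff.mean(axis=0)        # shape (C,)
    var_diff = diff.var(axis=0)          # shape (C,)
    # E[exp(x)] = exp(mean + var / 2) for Gaussian x; combine with Jensen
    z = mean_diff + 0.5 * var_diff
    # Numerically stable log-sum-exp
    m = z.max()
    return m + np.log(np.exp(z - m).sum())
```

When all K prompts give the same embeddings, the variance term vanishes and the value reduces to the ordinary cross-entropy. With diverse prompts, the variance term makes the loss at least as large as the cross-entropy computed from the mean embeddings alone.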