Self-supervised Meta-Prompt Learning with Meta-Gradient Regularization for Few-shot Generalization
Published in | arXiv.org |
---|---|
Main Authors | Pan, Kaihang; Li, Juncheng; Song, Hongye; Lin, Jun; Liu, Xiaozhong; Tang, Siliang |
Format | Paper |
Language | English |
Published | Ithaca: Cornell University Library, arXiv.org, 23.10.2023 |
Subjects | Domains; Learning; Regularization; Regularization methods; Training |
Online Access | https://www.proquest.com/docview/2789867817/abstract/ |
Abstract | Prompt tuning is a parameter-efficient method that learns soft prompts and conditions frozen language models to perform specific downstream tasks. Though effective, prompt tuning under few-shot settings on the one hand relies heavily on a good initialization of soft prompts; on the other hand, it can easily overfit to few-shot training samples, thereby undermining generalizability. Existing works leverage pre-training or supervised meta-learning to initialize soft prompts, but they fail to generalize data-efficiently to unseen downstream tasks. To address these problems, this paper proposes a novel Self-sUpervised meta-Prompt learning framework with MEta-gradient Regularization for few-shot generalization (SUPMER). SUPMER leverages self-supervised meta-learning with a diverse set of well-designed meta-training tasks to learn a universal prompt initialization for efficient adaptation using only unlabeled data. Additionally, it jointly meta-learns a gradient regularization function to transform raw gradients into a domain-generalizable direction, thus alleviating the problem of overfitting. Extensive experiments show that SUPMER achieves better performance on different few-shot downstream tasks and also exhibits stronger domain generalization ability. The code for SUPMER will be available at https://github.com/beepkh/SUPMER. |
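The abstract describes a bilevel scheme: an inner loop adapts the soft prompt to a sampled task, with the raw gradient first passed through a learned regularization function, while an outer loop updates both the prompt initialization and that function from the query-set loss. Below is a minimal, hypothetical PyTorch sketch of that structure, not the authors' implementation: the toy frozen model, the random task sampler, and every name and hyperparameter are illustrative assumptions.

```python
# Minimal sketch of the two ideas the abstract names, NOT SUPMER's code:
# (1) MAML-style meta-learning of a soft-prompt initialization, and
# (2) a jointly meta-learned transform applied to raw inner-loop gradients.
import torch

PROMPT_LEN, DIM, N_CLASSES = 4, 32, 2
INNER_LR, OUTER_LR = 0.1, 0.01

# Toy stand-in for the frozen language model: a linear scorer over
# prompt-conditioned inputs. Its parameters are never updated.
frozen_lm = torch.nn.Linear(PROMPT_LEN * DIM, N_CLASSES)
for p in frozen_lm.parameters():
    p.requires_grad_(False)

prompt_init = torch.zeros(PROMPT_LEN, DIM, requires_grad=True)  # meta-learned init
grad_reg = torch.nn.Linear(DIM, DIM)                            # meta-learned gradient transform

def loss_on(prompt, x, y):
    # Toy "conditioning": add the soft prompt to the input embeddings.
    logits = frozen_lm((x + prompt).reshape(x.shape[0], -1))
    return torch.nn.functional.cross_entropy(logits, y)

def sample_task():
    # Stand-in for SUPMER's self-supervised meta-training tasks built
    # from unlabeled data; here just random support/query batches.
    xs = torch.randn(8, PROMPT_LEN, DIM)
    ys = torch.randint(0, N_CLASSES, (8,))
    return (xs[:4], ys[:4]), (xs[4:], ys[4:])

meta_opt = torch.optim.Adam([prompt_init, *grad_reg.parameters()], lr=OUTER_LR)

for _ in range(100):
    (xs_s, ys_s), (xs_q, ys_q) = sample_task()
    # Inner loop: one adaptation step on the support set, with the raw
    # gradient passed through the learned regularization function.
    inner_loss = loss_on(prompt_init, xs_s, ys_s)
    raw_grad, = torch.autograd.grad(inner_loss, prompt_init, create_graph=True)
    adapted = prompt_init - INNER_LR * grad_reg(raw_grad)
    # Outer loop: the query loss through the adapted prompt updates both
    # the initialization and the gradient transform.
    meta_opt.zero_grad()
    loss_on(adapted, xs_q, ys_q).backward()
    meta_opt.step()
```

The design point the abstract emphasizes is that `grad_reg` is trained in the outer loop, so it is pushed to reshape inner-loop gradients toward directions that generalize across tasks rather than overfit the few support samples.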