Unveiling Key Aspects of Fine-Tuning in Sentence Embeddings: A Representation Rank Analysis
Main Authors | |
---|---|
Format | Journal Article |
Language | English |
Published | 18.05.2024 |
Subjects | |
Summary: | The latest advancements in unsupervised learning of sentence embeddings predominantly involve contrastive learning-based (CL-based) fine-tuning of pre-trained language models. In this study, we analyze the latest sentence embedding methods, adopting representation rank as the primary tool of analysis. We first define Phase 1 and Phase 2 of fine-tuning based on when representation rank peaks. Using these phases, we conduct a thorough analysis and obtain essential findings across key aspects, including alignment and uniformity, linguistic abilities, and the correlation between performance and rank. For instance, we find that the dynamics of these key aspects can undergo significant changes as fine-tuning transitions from Phase 1 to Phase 2. Based on these findings, we experiment with a rank reduction (RR) strategy that facilitates rapid and stable fine-tuning of the latest CL-based methods. Through empirical investigations, we showcase the efficacy of RR in enhancing the performance and stability of five state-of-the-art sentence embedding methods. |
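The record does not specify how the paper measures representation rank. A common proxy for the rank of a batch of embeddings is the effective rank (the exponential of the entropy of the normalized singular-value distribution), sketched below on synthetic data; the function name and thresholds are illustrative, not taken from the paper.

```python
import numpy as np

def effective_rank(embeddings: np.ndarray) -> float:
    """Effective rank of an embedding matrix (rows = sentences).

    Computed as exp of the Shannon entropy of the normalized
    singular-value distribution. A rank-k matrix scores at most k;
    well-spread embeddings score close to min(n, d).
    """
    # Center the embeddings so rank reflects spread, not the mean offset.
    X = embeddings - embeddings.mean(axis=0, keepdims=True)
    s = np.linalg.svd(X, compute_uv=False)
    p = s / s.sum()              # normalized singular values
    p = p[p > 0]                 # guard against log(0)
    return float(np.exp(-(p * np.log(p)).sum()))

# Toy batch: 128 "sentence embeddings" of dimension 64.
rng = np.random.default_rng(0)
full = rng.normal(size=(128, 64))                            # well-spread embeddings
low = rng.normal(size=(128, 4)) @ rng.normal(size=(4, 64))   # confined to a rank-4 subspace

print(effective_rank(full) > effective_rank(low))  # the collapsed batch scores lower
```

Tracking such a scalar over fine-tuning steps is what would let one locate the rank peak separating the two phases described in the summary.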
DOI: | 10.48550/arxiv.2405.11297 |