[arXiv] score: 0.53
Few-Shot CLIP Fine-Tuning Exacerbates Attention Sink, Hurting Cross-Domain Transfer
May 26, 2026
Standard few-shot fine-tuning of CLIP for cross-domain transfer significantly worsens the attention sink phenomenon by exploiting it as a shortcut for domain adaptation, reducing inter-class discriminability in the target domain.
cs.CV
HOW THIS AFFECTS YOU
● builder: Worth watching if you're fine-tuning CLIP-based models on small target-domain datasets, as standard procedures may degrade representation quality in ways not visible from accuracy alone.
● researcher: Reframing attention sink as a domain adaptation shortcut provides a mechanistic explanation for CLIP fine-tuning failures in low-data cross-domain settings.
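
For the builder note above, one way to look past accuracy is to track how much attention mass a single token absorbs before and after fine-tuning. The sketch below is a minimal illustration, not the paper's metric: the function name `sink_score` and the choice of "largest mean attention received by any key token" as the measure are assumptions made here, and the attention array is assumed to have shape (heads, queries, keys) with each row already softmax-normalized.

```python
import numpy as np

def sink_score(attn):
    """Return (sink_index, sink_share) for one layer's attention weights.

    attn: array of shape (heads, queries, keys) whose rows sum to 1.
    sink_share is the mean attention received by the most-attended key
    token. This is an illustrative proxy, not the paper's definition.
    """
    attn = np.asarray(attn, dtype=float)
    # Average attention each key token receives across heads and queries.
    received = attn.mean(axis=(0, 1))
    sink = int(received.argmax())
    return sink, float(received[sink])

# Synthetic example: 2 heads, 3 query tokens, 4 key tokens.
uniform = np.full((2, 3, 4), 0.25)      # attention spread evenly
peaked = np.full((2, 3, 4), 0.1 / 3)    # most mass collapses onto key 2
peaked[:, :, 2] = 0.9

print(sink_score(uniform))
print(sink_score(peaked))
```

Under these assumptions, a sink share that rises sharply after few-shot fine-tuning while validation accuracy stays flat would be a signal worth investigating.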