[arXiv] score: 0.67
Towards Fine-Grained Robustness: Attention-Guided Test-Time Prompt Tuning for Vision-Language Models
May 19, 2026
Attention-guided test-time prompt tuning improves the adversarial robustness of vision-language models such as CLIP in fine-grained scenarios: it uses attention to identify semantic information and preserve discriminative regions, without relying on multi-view augmentation.
cs.CV