●builderYou can potentially improve image retrieval quality without retraining the MLLM — the frozen model acts as a supervision source, reducing compute overhead.
●researcherWorth watching because attribute-aware training signals via GRPO offer a concrete alternative to contrastive scalar loss, with implications for retrieval and fine-grained visual understanding benchmarks.