
The Unlearnability Phenomenon in RLVR for Language Models

May 15, 2026

Reinforcement learning with verifiable rewards (RLVR) exhibits an unlearnability phenomenon: some hard examples remain unlearnable even when correct rollouts are sampled for them. A cross-example gradient analysis indicates that optimization and sampling techniques fail to resolve this learning dynamic.

paper

SOURCE

https://huggingface.co/papers/2605.16787
