The Unlearnability Phenomenon in RLVR for Language Models
May 15, 2026
Reinforcement learning with verifiable rewards (RLVR) exhibits an unlearnability phenomenon: hard examples remain unlearnable despite correct rollouts. A cross-example gradient analysis shows that optimization and sampling techniques fail to resolve this learning dynamic.