[ALPHAXIV]score: 0.30

Reinforcement Learning for Efficient Recursive Models

May 13, 2026
Researchers used reinforcement learning to fine-tune 4B-parameter recursive language models that match Claude Sonnet 4.6 performance at a significantly smaller model size and lower inference cost.