Inverting the Bellman Equation Recovers World Models from Value Functions
June 23, 2026
New theoretical work proves that RL agents trained on a sufficiently rich goal distribution encode a unique, accurate world model within their value functions — recoverable by inverting the Bellman equation. A notable finding is that agents implicitly model latent variables never included in their reward signal.
HOW THIS AFFECTS YOU
●
researcherThis challenges the model-free/model-based dichotomy and has direct implications for interpretability of RL agents and for designing richer goal distributions to elicit better implicit world models.