[arXiv] score: 0.42

GORMPO Uses Generative Density Models to Constrain Offline RL to High-Density State-Action Regions

May 26, 2026

GORMPO integrates generative density estimation into model-based offline RL to restrict policy updates to high-density dataset regions, and empirically tests whether better OOD detection correlates with better offline policy performance.
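The idea of keeping policy updates inside high-density regions can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: a single Gaussian stands in for the generative density model, the cutoff is the 5th percentile of dataset log-density, and out-of-distribution model rollouts receive a fixed reward penalty. The names `GaussianDensity` and `penalized_rewards` and the synthetic data are hypothetical.

```python
import numpy as np

class GaussianDensity:
    """Stand-in density model (assumption): one full-covariance Gaussian over (s, a)."""

    def fit(self, x):
        self.mean = x.mean(axis=0)
        # Small ridge term keeps the covariance invertible.
        cov = np.cov(x, rowvar=False) + 1e-6 * np.eye(x.shape[1])
        self.prec = np.linalg.inv(cov)
        _, self.logdet = np.linalg.slogdet(cov)
        self.dim = x.shape[1]
        return self

    def log_prob(self, x):
        d = x - self.mean
        maha = np.einsum("ij,jk,ik->i", d, self.prec, d)  # squared Mahalanobis distance
        return -0.5 * (maha + self.logdet + self.dim * np.log(2 * np.pi))

def penalized_rewards(rewards, log_density, threshold, penalty=1.0):
    """Subtract a fixed penalty from samples whose log-density is below the threshold."""
    ood = log_density < threshold
    return rewards - penalty * ood, ood

rng = np.random.default_rng(0)
dataset_sa = rng.normal(size=(1000, 4))  # hypothetical offline (state, action) pairs
model = GaussianDensity().fit(dataset_sa)
# Hypothetical cutoff: treat the lowest 5% of dataset log-densities as the boundary.
threshold = np.percentile(model.log_prob(dataset_sa), 5)

# Five rollout samples near the data and five far from it.
rollout_sa = np.vstack([rng.normal(size=(5, 4)), np.full((5, 4), 8.0)])
rollout_r = np.ones(10)
adj_r, ood = penalized_rewards(rollout_r, model.log_prob(rollout_sa), threshold)
```

In this sketch the five far-away samples are flagged as OOD and lose their reward, so a policy trained on the adjusted rewards has less incentive to visit them. A hard threshold with a fixed penalty is only one possible choice; a penalty that scales with the density gap would be another.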

cs.LG, cs.AI

HOW THIS AFFECTS YOU

● Researcher: The empirical comparison of OOD detection quality versus downstream policy performance provides a useful diagnostic for understanding when density-based regularization actually helps in offline RL.

SOURCE

https://arxiv.org/abs/2605.24405
