[arXiv] score: 0.42
GORMPO Uses Generative Density Models to Constrain Offline RL to High-Density State-Action Regions
May 26, 2026
GORMPO integrates generative density estimation into model-based offline RL to restrict policy updates to high-density regions of the dataset, and empirically tests whether better out-of-distribution (OOD) detection correlates with better offline policy performance.
cs.LG, cs.AI
HOW THIS AFFECTS YOU
● researcher: The empirical comparison of OOD detection quality versus downstream policy performance provides a useful diagnostic for understanding when density-based regularization actually helps in offline RL.
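To make the idea of restricting updates to high-density regions concrete, here is a minimal sketch of a density-based reward penalty for model rollouts. It is not GORMPO's implementation: the summary does not specify the density model or the penalty form, so a single Gaussian stands in for the generative density model, and the names fit_gaussian_density, penalized_reward, threshold, and penalty_coef are hypothetical. The data is synthetic.

```python
import numpy as np


def fit_gaussian_density(data):
    """Fit a full-covariance Gaussian to the rows of `data`.

    Stand-in for a learned generative density model (assumption, not the
    paper's choice). Returns a function mapping points to log-densities.
    """
    dim = data.shape[1]
    mean = data.mean(axis=0)
    # Small ridge term keeps the covariance invertible.
    cov = np.cov(data, rowvar=False) + 1e-6 * np.eye(dim)
    cov_inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)

    def log_density(x):
        diff = np.atleast_2d(x) - mean
        maha = np.einsum("ni,ij,nj->n", diff, cov_inv, diff)
        return -0.5 * (maha + logdet + dim * np.log(2 * np.pi))

    return log_density


def penalized_reward(reward, log_p, threshold, penalty_coef=1.0):
    """Penalize rewards for points whose log-density falls below `threshold`.

    Hypothetical penalty form: linear in the shortfall below the threshold.
    """
    shortfall = np.maximum(0.0, threshold - log_p)
    return reward - penalty_coef * shortfall


rng = np.random.default_rng(0)
dataset_sa = rng.normal(size=(1000, 4))  # synthetic (state, action) vectors

log_density = fit_gaussian_density(dataset_sa)
# Treat the lowest 5% of dataset log-densities as the OOD boundary (assumption).
threshold = np.quantile(log_density(dataset_sa), 0.05)

in_dist = np.zeros((1, 4))       # near the centre of the data
out_dist = np.full((1, 4), 6.0)  # far from anything in the dataset

r_in = penalized_reward(np.array([1.0]), log_density(in_dist), threshold)
r_out = penalized_reward(np.array([1.0]), log_density(out_dist), threshold)
```

In this sketch the in-distribution point keeps its reward while the far-away point is penalized. Scoring the same log-densities against held-out OOD labels would give one way to relate detection quality to downstream policy performance.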