FlowR2A Uses Flow Matching to Unify Dense and Sparse Driving Planning
June 24, 2026
FlowR2A reframes simulation-based rewards as generative conditions rather than discriminative targets, using a flow-matching decoder to learn reward-conditioned action distributions from dense trajectory-reward pairs. This unifies dense supervision from scoring-based methods with dynamic proposal generation from anchor-based methods in a single model.
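As a rough illustration of what "learning a reward-conditioned action distribution with flow matching" can look like, here is a minimal NumPy sketch of a generic conditional flow-matching objective with a linear noise-to-data path. It is not FlowR2A's actual implementation: the function names (`flow_matching_batch`, `flow_matching_loss`, `sample`), the flattened trajectory shape, the scalar reward condition, and the Euler sampler are all assumptions made for illustration.

```python
import numpy as np

def flow_matching_batch(trajectories, rewards, rng):
    """Build one reward-conditioned flow-matching training batch.

    trajectories: (B, D) flattened trajectories (the data samples x1)
    rewards:      (B,)   simulation reward per trajectory (the condition)
    All names and shapes here are hypothetical, not taken from FlowR2A.
    """
    x1 = np.asarray(trajectories, dtype=float)
    r = np.asarray(rewards, dtype=float)
    x0 = rng.standard_normal(x1.shape)        # noise sample paired with each x1
    t = rng.uniform(size=(x1.shape[0], 1))    # per-sample time in [0, 1)
    x_t = (1.0 - t) * x0 + t * x1             # point on the straight path x0 -> x1
    target_velocity = x1 - x0                 # velocity of that straight path
    return x_t, t, r, target_velocity

def flow_matching_loss(velocity_fn, batch):
    """Mean squared error between predicted and target velocities."""
    x_t, t, r, target = batch
    pred = velocity_fn(x_t, t, r)             # the model sees the reward as an input
    return float(np.mean((pred - target) ** 2))

def sample(velocity_fn, reward, dim, rng, steps=10):
    """Draw one trajectory for a requested reward by Euler integration."""
    x = rng.standard_normal((1, dim))         # start from noise at t = 0
    r = np.array([reward], dtype=float)
    dt = 1.0 / steps
    for k in range(steps):
        t = np.full((1, 1), k * dt)
        x = x + dt * velocity_fn(x, t, r)     # follow the learned velocity field
    return x[0]
```

In this sketch the reward is passed to the velocity model as a conditioning input rather than used as a regression target, so at inference time one could request a high reward value and integrate from noise to obtain a trajectory proposal. `velocity_fn` stands in for whatever trained network plays that role.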
HOW THIS AFFECTS YOU
● Researcher: The reward-to-distribution reframing via flow matching is a technically novel approach to resolving the scoring-vs-anchor tension in multimodal planning, and it is worth examining for applicability beyond driving.