Next Forcing Adds Multi-Chunk Prediction to Speed Up World Model Training
June 8, 2026
Next Forcing augments autoregressive video generation with lightweight auxiliary modules that simultaneously denoise multiple future chunks during training, borrowing from multi-token prediction in LLMs. It targets slow convergence and limited accuracy at high frame rates in World Action Models, and claims faster training, higher accuracy, and accelerated inference.
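To make the idea concrete, here is a minimal sketch of a multi-chunk denoising loss. It is an illustration under stated assumptions, not the method's actual implementation: the summary does not specify the loss form, the head architecture, or the weighting. The function name `multi_chunk_loss`, the linear heads, the MSE noise-prediction target, and the `aux_weight` parameter are all hypothetical.

```python
import numpy as np


def multi_chunk_loss(hidden, noise, heads, aux_weight=0.5):
    """Toy multi-chunk denoising loss (all names and choices are assumptions).

    hidden: (T, D) backbone features, one row per video chunk.
    noise:  (T, D) noise target for each chunk.
    heads:  list of (D, D) matrices; heads[k] predicts the noise of
            chunk t + k + 1 from the features of chunk t.
    """
    hidden = np.asarray(hidden, dtype=float)
    noise = np.asarray(noise, dtype=float)
    num_chunks = hidden.shape[0]
    if len(heads) >= num_chunks:
        raise ValueError("need more chunks than prediction heads")

    per_head = []
    total = 0.0
    for k, weight_matrix in enumerate(heads):
        offset = k + 1
        # Features of chunk t are paired with the noise target of chunk t + offset.
        pred = hidden[: num_chunks - offset] @ weight_matrix
        target = noise[offset:]
        mse = float(np.mean((pred - target) ** 2))
        per_head.append(mse)
        # heads[0] is the usual next-chunk objective; later heads are auxiliary
        # and are down-weighted by aux_weight (an assumed hyperparameter).
        total += mse if k == 0 else aux_weight * mse
    return total, per_head


# Example: 6 chunks, 4-dim features, one main head plus two auxiliary heads.
rng = np.random.default_rng(0)
demo_hidden = rng.normal(size=(6, 4))
demo_noise = rng.normal(size=(6, 4))
demo_heads = [rng.normal(size=(4, 4)) * 0.1 for _ in range(3)]
demo_total, demo_per_head = multi_chunk_loss(demo_hidden, demo_noise, demo_heads)
print(demo_total, demo_per_head)
```

In this sketch the auxiliary heads only add terms to the training loss, so they could be dropped at inference time. Whether Next Forcing discards them or uses them to generate several chunks at once is not stated in the summary.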
HOW THIS AFFECTS YOU
● Researcher: Multi-chunk prediction as a training objective for video diffusion world models is a concrete architectural modification with reported convergence and accuracy improvements worth replicating.