[arXiv] score: 0.41

Optimistic Dual Averaging Unifies Modern Optimizers

May 13, 2026
SODA unifies modern optimizers, including Muon, Lion, AdEMAMix, and NAdam, under a generalized Optimistic Dual Averaging framework and introduces a theoretically grounded 1/k weight decay schedule that eliminates manual tuning. Empirical results show consistent gains across scales with no additional hyperparameters. Optimization researchers and large-scale training practitioners should take note.
cs.LG