[arXiv] score: 0.57
Adam Achieves Local Linear Convergence on Degenerate Polynomials Without External Schedulers
May 26, 2026
On a class of highly degenerate polynomials, Adam converges at a local linear rate without any learning rate scheduler, outperforming gradient descent's sublinear rate. The advantage comes from a decoupling mechanism between Adam's second-moment estimate and the squared gradient.
cs.LG
HOW THIS AFFECTS YOU
● Researcher: Provides theoretical conditions under which Adam naturally converges faster than GD on degenerate loss landscapes, with theoretical bounds that align closely with experiments. Relevant for understanding Adam's behavior near flat minima.
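The comparison in the summary can be explored with a small experiment. The sketch below is illustrative only: it assumes the toy loss f(x) = x**4 as a stand-in for a degenerate polynomial (the paper's actual polynomial class and conditions are not reproduced here), and the function names, step sizes, and default hyperparameters are hypothetical choices, not taken from the paper. It runs plain gradient descent and a textbook Adam update with a fixed learning rate and no scheduler.

```python
import math


def grad(x: float) -> float:
    """Gradient of the illustrative loss f(x) = x**4 (an assumed toy example)."""
    return 4.0 * x ** 3


def gd_final(x0: float, lr: float, steps: int) -> float:
    """Run plain gradient descent with a fixed learning rate; return the last iterate."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x


def adam_final(x0: float, lr: float, steps: int,
               beta1: float = 0.9, beta2: float = 0.999,
               eps: float = 1e-8) -> float:
    """Run the standard bias-corrected Adam update with a fixed learning rate."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g        # first-moment estimate
        v = beta2 * v + (1.0 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1.0 - beta1 ** t)           # bias correction
        v_hat = v / (1.0 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


if __name__ == "__main__":
    # Hypothetical settings; vary the horizon and beta2 to probe the behavior.
    for steps in (1_000, 10_000, 100_000):
        print(steps, gd_final(1.0, 0.01, steps), adam_final(1.0, 0.01, steps))
```

For gradient descent on this toy loss, the iterate shrinks roughly like 1/sqrt(1 + 8 * lr * t), which is the sublinear behavior the summary refers to. Whether and when Adam shows a faster regime in this sketch depends on the horizon, `beta2`, and `eps`, so the output is best read as an exploration aid and not as a reproduction of the paper's results.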