Free Energy Model Tracks Gradient Descent Oscillations at Edge of Stability
June 5, 2026
A continuous-time effective model tracks gradient descent dynamics in the Edge of Stability regime by coupling the evolution of the average trajectory with the covariance of the fast oscillations around it. This coupling yields an effective free energy that combines the risk with a curvature-related entropic term. For wide two-layer networks, a mean-field limit produces a new kinetic description of training spikes.
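For readers unfamiliar with the regime, the toy sketch below illustrates the classical stability threshold behind the name "Edge of Stability": on a one-dimensional quadratic, plain gradient descent with learning rate lr oscillates and decays when the curvature is below 2 / lr, oscillates with constant amplitude exactly at 2 / lr, and diverges above it. This is not the effective model summarized above; the function name `gd_on_quadratic` and all parameter values are assumptions chosen for illustration only.

```python
def gd_on_quadratic(curvature, lr, x0=1.0, steps=50):
    """Run gradient descent on f(x) = 0.5 * curvature * x**2.

    Returns the list of iterates, starting with x0.
    """
    xs = [x0]
    for _ in range(steps):
        grad = curvature * xs[-1]      # f'(x) = curvature * x
        xs.append(xs[-1] - lr * grad)  # x <- (1 - lr * curvature) * x
    return xs


lr = 0.5                # illustrative learning rate (assumption)
threshold = 2.0 / lr    # curvature at which the update factor equals -1

below = gd_on_quadratic(0.9 * threshold, lr)  # factor -0.8: damped oscillation
at = gd_on_quadratic(threshold, lr)           # factor -1.0: sustained oscillation
above = gd_on_quadratic(1.1 * threshold, lr)  # factor -1.2: growing oscillation

if __name__ == "__main__":
    for name, xs in [("below", below), ("at", at), ("above", above)]:
        print(f"{name:>5}: |x_final| = {abs(xs[-1]):.3e}")
```

In all three runs the iterate flips sign at every step; only the amplitude trend differs. The fast sign-flipping component is the kind of oscillation whose covariance the summarized model is described as tracking alongside the average trajectory.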
HOW THIS AFFECTS YOU
● Researcher: Provides a tractable analytical framework for understanding loss spikes during training of wide networks, which could inform learning rate scheduling and stability diagnostics.