Training dynamics study on 4.26M-param Llama model under 20M token budget
June 12, 2026
A repeated-measures study on a 4.26M-parameter Llama-style model trained on TinyStories with a 20M-token CPU budget tracks validation loss, perplexity, volatility, and spike behavior across 126 seed-by-interval observations. ANOVA confirms statistically significant interval effects, but the small scale limits generalizability to production-relevant model sizes.
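To make the analysis concrete, here is a minimal sketch of a one-way repeated-measures ANOVA, treating each seed as a subject and each training interval as the within-subject factor. The summary does not say how the 126 observations split into seeds and intervals or which ANOVA variant the study used, so the layout is an assumption. The function names `perplexity` and `rm_anova_f` are hypothetical, and the toy loss values are invented for illustration; they are not the study's data.

```python
import math


def perplexity(loss):
    """Perplexity for a mean cross-entropy loss given in nats per token."""
    return math.exp(loss)


def rm_anova_f(table):
    """One-way repeated-measures ANOVA on a seeds-by-intervals table.

    table[i][j] is the metric (e.g. validation loss) for seed i at interval j.
    Returns (F statistic, df for intervals, df for error).
    Assumes a complete, balanced table with a nonzero error term.
    """
    n = len(table)        # number of seeds (subjects)
    k = len(table[0])     # number of intervals (conditions)
    grand = sum(sum(row) for row in table) / (n * k)

    row_means = [sum(row) / k for row in table]
    col_means = [sum(table[i][j] for i in range(n)) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_interval = n * sum((m - grand) ** 2 for m in col_means)
    ss_seed = k * sum((m - grand) ** 2 for m in row_means)
    # Error is what remains after removing interval and seed effects.
    ss_error = ss_total - ss_interval - ss_seed

    df_interval = k - 1
    df_error = (k - 1) * (n - 1)
    f_stat = (ss_interval / df_interval) / (ss_error / df_error)
    return f_stat, df_interval, df_error


# Illustrative toy data only: 3 seeds x 3 intervals of validation loss.
toy_losses = [
    [3.0, 2.5, 2.1],
    [3.2, 2.6, 2.0],
    [3.1, 2.4, 2.2],
]
f_stat, df1, df2 = rm_anova_f(toy_losses)
print(f"F({df1}, {df2}) = {f_stat:.2f}")
print(f"perplexity at loss 2.1: {perplexity(2.1):.2f}")
```

Removing the per-seed sum of squares from the error term is what distinguishes this from an ordinary one-way ANOVA: consistent offsets between seeds no longer inflate the noise estimate, so interval effects are easier to detect with few seeds.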
HOW THIS AFFECTS YOU
● Researcher: Provides a rigorous repeated-measures methodology for studying training instability at tiny scale, though findings may not transfer to larger compute regimes.