FID Variance from Retraining Is 3.2x Larger Than from Resampling on ImageNet 256x256
June 17, 2026
Across hundreds of SiT models trained on class-conditional ImageNet 256x256, retraining with a different seed shifts FID 3.2x more than resampling from a fixed model, as measured in Inception feature space. The retraining variation is driven by random initialization, data ordering, and per-step flow noise. Papers that report FID from a single seed therefore significantly understate reproducibility uncertainty.
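One way to make the retraining-versus-resampling comparison concrete is a one-way variance decomposition over a grid of FID scores. The sketch below is only an illustration and is not the study's procedure: the function name `variance_components`, the `fid_runs[i][j]` layout (training seed i, sampling draw j), and the example numbers are all assumptions, and the study's actual measurement may differ.

```python
from statistics import mean, variance


def variance_components(fid_runs):
    """Split FID variance into retraining and resampling parts.

    fid_runs[i][j] is the FID of the model trained with seed i,
    evaluated on sampling draw j (hypothetical layout, at least
    two seeds and two draws per seed).
    """
    n_draws = len(fid_runs[0])
    if any(len(run) != n_draws for run in fid_runs):
        raise ValueError("every training seed needs the same number of draws")

    # Resampling variance: average spread across draws from one fixed model.
    resample_var = mean(variance(run) for run in fid_runs)

    # The variance of per-model mean FIDs still contains
    # resample_var / n_draws of sampling noise, so subtract it.
    var_of_means = variance([mean(run) for run in fid_runs])
    retrain_var = max(var_of_means - resample_var / n_draws, 0.0)
    return retrain_var, resample_var


# Made-up numbers for illustration only, not results from the study.
example = [[2.0, 2.2], [3.0, 3.2], [4.0, 4.2]]
retrain, resample = variance_components(example)
if resample > 0:
    print(f"retraining / resampling variance ratio: {retrain / resample:.1f}")
```

A ratio well above 1 under this decomposition would mean that a difference between two single-seed FID numbers can be smaller than what a different training seed alone would produce.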
HOW THIS AFFECTS YOU
● Researcher: This quantifies how unreliable single-run FID comparisons are. Results you are reporting or reviewing may reflect training-seed variance more than architectural differences.