● builder: Freezing the base diffusion model and adding task-specific heads is a low-cost path to multi-modal outputs if you already have a diffusion transformer in your pipeline.
● researcher: The finding that perceptual information is temporally distributed across the denoising trajectory is a concrete architectural insight with implications for how diffusion-model internals are used.