●builderYou can swap in this model where throughput is the bottleneck and accept a 1.3% quality trade-off, potentially cutting inference costs significantly.
●researcherThe two-tower hybrid architecture offers a concrete data point on diffusion LM viability at 30B scale with quantified quality-throughput trade-offs.