[r/LocalLLaMA]
Gemma 4 MTP released
May 5, 2026
Google quietly dropped Gemma 4 MTP drafter checkpoints on Hugging Face in four sizes: 31B, 26B-A4B MoE, E4B, and E2B. These are speculative-decoding draft models paired with the corresponding Gemma 4 target models, claiming up to 2x decoding speedup with zero quality degradation; the speedup is lossless because the target model verifies every drafted token, so the final output matches what the target would have generated on its own. If you're doing low-latency or on-device inference, these are worth evaluating now.
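For anyone new to the draft/verify idea: here's a toy sketch of greedy speculative decoding. The "models" below are made-up deterministic stand-ins (not Gemma, no real checkpoints); the point is only the loop structure, where a cheap draft proposes a chunk of tokens and the target keeps the longest agreeing prefix plus one corrected token, so the result is identical to decoding with the target alone.

```python
def target_next(ctx):
    # Stand-in "target model": a deterministic toy rule on the context.
    return (sum(ctx) * 31 + len(ctx)) % 50

def draft_next(ctx):
    # Stand-in "draft model": agrees with the target most of the time,
    # but deliberately diverges every 4th step.
    t = target_next(ctx)
    return t if len(ctx) % 4 else (t + 1) % 50

def speculative_decode(ctx, n_tokens, k=4):
    out = list(ctx)
    while len(out) - len(ctx) < n_tokens:
        # 1) Draft proposes k tokens autoregressively (cheap).
        proposal, d_ctx = [], list(out)
        for _ in range(k):
            t = draft_next(d_ctx)
            proposal.append(t)
            d_ctx.append(t)
        # 2) Target verifies the chunk; keep the agreeing prefix and
        #    replace the first mismatch with the target's own token.
        accepted, v_ctx = [], list(out)
        for t in proposal:
            expect = target_next(v_ctx)
            if t != expect:
                accepted.append(expect)
                break
            accepted.append(t)
            v_ctx.append(t)
        out.extend(accepted)
    return out[len(ctx):len(ctx) + n_tokens]

def greedy_decode(ctx, n_tokens):
    # Plain autoregressive decoding with the target alone, for comparison.
    out = list(ctx)
    for _ in range(n_tokens):
        out.append(target_next(out))
    return out[len(ctx):]

# Output is token-for-token identical to the target alone -- that's why
# speculative decoding loses no quality, only latency.
print(speculative_decode([1, 2, 3], 12) == greedy_decode([1, 2, 3], 12))
```

In a real engine the verification pass scores all k drafted positions in a single batched forward through the target, which is where the wall-clock win comes from; the acceptance logic is the same shape as above.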
new model