[r/LocalLLaMA]
Gemma 4 MTP released
May 5, 2026
Google quietly dropped Gemma 4 MTP drafter checkpoints on Hugging Face in four sizes: 31B, 26B-A4B MoE, E4B, and E2B. These are speculative-decoding draft models paired with the corresponding Gemma 4 target models, claiming up to 2x decoding speedup with zero quality degradation; the speedup is lossless because the target model verifies every drafted token, so the final output matches what the target would have generated on its own. If you're doing low-latency or on-device inference, these are worth evaluating now.
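For anyone new to the draft/verify idea: here's a toy sketch of greedy speculative decoding. The "models" below are made-up deterministic stand-ins (not Gemma, no real checkpoints); the point is only the loop structure, where a cheap draft proposes a chunk of tokens and the target keeps the longest agreeing prefix plus one corrected token, so the result is identical to decoding with the target alone.

```python
def target_next(ctx):
    # Stand-in "target model": a deterministic toy rule on the context.
    return (sum(ctx) * 31 + len(ctx)) % 50

def draft_next(ctx):
    # Stand-in "draft model": agrees with the target most of the time,
    # but deliberately diverges every 4th step.
    t = target_next(ctx)
    return t if len(ctx) % 4 else (t + 1) % 50

def speculative_decode(ctx, n_tokens, k=4):
    out = list(ctx)
    while len(out) - len(ctx) < n_tokens:
        # 1) Draft proposes k tokens autoregressively (cheap).
        proposal, d_ctx = [], list(out)
        for _ in range(k):
            t = draft_next(d_ctx)
            proposal.append(t)
            d_ctx.append(t)
        # 2) Target verifies the chunk; keep the agreeing prefix and
        #    replace the first mismatch with the target's own token.
        accepted, v_ctx = [], list(out)
        for t in proposal:
            expect = target_next(v_ctx)
            if t != expect:
                accepted.append(expect)
                break
            accepted.append(t)
            v_ctx.append(t)
        out.extend(accepted)
    return out[len(ctx):len(ctx) + n_tokens]

def greedy_decode(ctx, n_tokens):
    # Plain autoregressive decoding with the target alone, for comparison.
    out = list(ctx)
    for _ in range(n_tokens):
        out.append(target_next(out))
    return out[len(ctx):]

# Output is token-for-token identical to the target alone -- that's why
# speculative decoding loses no quality, only latency.
print(speculative_decode([1, 2, 3], 12) == greedy_decode([1, 2, 3], 12))
```

In a real engine the verification pass scores all k drafted positions in a single batched forward through the target, which is where the wall-clock win comes from; the acceptance logic is the same shape as above.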
new model