NVIDIA Releases NVFP4 MiniMax-M3: 428B MoE Model at 4-bit on Blackwell

June 24, 2026

NVIDIA has published an NVFP4-quantized version of MiniMax-M3 on Hugging Face. MiniMax-M3 is a 428B-parameter multimodal mixture-of-experts (MoE) model with a 1M-token context window. The 4-bit quantization targets Blackwell GPUs and reduces memory use by roughly 2x compared to full precision, making the model more practical to deploy at scale.
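To put the memory claim in concrete terms, here is a rough back-of-the-envelope sketch of weight storage for a 428B-parameter model at different precisions. It assumes every parameter is stored in the named format (real checkpoints usually keep some layers at higher precision) and that NVFP4 costs about 4.5 bits per weight: a 4-bit value plus one 8-bit scale shared by each block of 16 weights. It ignores KV cache and activations, which matter a great deal at 1M-token context. The function and variable names are illustrative only.

```python
# Rough weight-memory estimate; not a measurement of the actual checkpoint.
PARAMS = 428e9  # parameter count stated in the article


def weight_gb(params: float, bits_per_weight: float) -> float:
    """Return weight storage in decimal gigabytes."""
    return params * bits_per_weight / 8 / 1e9


# Assumption: 4-bit values plus one 8-bit scale per 16-weight block.
NVFP4_BITS = 4 + 8 / 16

estimates = {
    "bf16": weight_gb(PARAMS, 16),
    "fp8": weight_gb(PARAMS, 8),
    "nvfp4": weight_gb(PARAMS, NVFP4_BITS),
}

for name, gb in estimates.items():
    ratio = gb / estimates["nvfp4"]
    print(f"{name:>6}: {gb:7.1f} GB ({ratio:.2f}x the NVFP4 size)")
```

Under these assumptions the NVFP4 weights come to about 241 GB, versus 428 GB at 8 bits and 856 GB at 16 bits. That makes the ratio closest to 2x against an 8-bit baseline and about 3.6x against a 16-bit one. The source does not say which baseline it means by full precision.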

HOW THIS AFFECTS YOU

● builder: You can now run a 428B multimodal MoE with a 1M-token context window on Blackwell hardware at roughly half the memory footprint, using the NVFP4 checkpoint on Hugging Face.

● researcher: Worth watching. NVFP4 quantization of a model this size, with a 1M-token context window, provides a concrete data point on Blackwell-native precision formats and their fidelity tradeoffs.

read original ↗x.com
