[HUGGINGFACE] score: 0.89
MobileMoE: Sub-Billion Active Parameter MoE Models Hit New On-Device Pareto Frontier
May 25, 2026
MobileMoE establishes a scaling law for on-device MoE models with 0.3–0.9B active / 1.3–5.3B total parameters, using moderate sparsity with fine-grained and shared experts to optimize simultaneously for mobile memory and compute constraints.
HOW THIS AFFECTS YOU
● builder: You can deploy MoE-class model quality on mobile hardware at sub-billion active parameter counts; previously only dense small models were practical on-device.
● researcher: The derived on-device MoE scaling law and four-stage training recipe offer a concrete methodology for sub-billion MoE architecture search under hardware constraints.
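
The memory-versus-compute trade-off behind the active/total parameter split can be sketched with back-of-the-envelope arithmetic. The snippet below is an illustration, not the paper's method: it assumes the low ends of the two quoted ranges belong to one model and the high ends to another, assumes a hypothetical 4-bit weight quantization, and uses the common rule of thumb of roughly 2 FLOPs per active parameter per generated token (forward pass only, ignoring attention, KV cache, and activations). The function names are made up for this example.

```python
def weight_memory_gb(total_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    All experts must be resident, so memory scales with TOTAL parameters.
    """
    return total_params * bits_per_weight / 8 / 1e9


def forward_gflops_per_token(active_params: float) -> float:
    """Rule-of-thumb forward cost: about 2 FLOPs per ACTIVE parameter per token."""
    return 2 * active_params / 1e9


def total_to_active_ratio(total_params: float, active_params: float) -> float:
    """How many parameters are stored for each one used per token."""
    return total_params / active_params


# (active, total) pairs built from the endpoints of the quoted ranges.
# Pairing low with low and high with high is an assumption.
configs = {
    "low end": (0.3e9, 1.3e9),
    "high end": (0.9e9, 5.3e9),
}

for name, (active, total) in configs.items():
    print(
        f"{name}: total/active = {total_to_active_ratio(total, active):.2f}, "
        f"4-bit weights = {weight_memory_gb(total, 4):.2f} GB, "
        f"forward cost = {forward_gflops_per_token(active):.1f} GFLOPs/token"
    )
```

Under these assumptions, per-token compute tracks the active count while weight memory tracks the total count, which is why the two have to be budgeted separately on a phone.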