[r/LocalLLaMA] score: 0.28

LiquidAI LFM2.5-8B-A1B: Fast Hybrid Model with 1B Active Params for On-Device Use

May 28, 2026

LiquidAI's LFM2.5-8B-A1B is an 8B-parameter hybrid model with only 1B active parameters, built for on-device deployment with day-one support for llama.cpp, MLX, vLLM, and SGLang. LiquidAI claims the fastest throughput in its size class on both CPU and GPU, and results competitive with larger dense and MoE models on instruction following and agentic tasks.

new model

HOW THIS AFFECTS YOU

● builder: You can run this today via llama.cpp or MLX on consumer hardware. With only 1B parameters active per token, it is viable for edge and mobile agentic pipelines where larger models are impractical.

● researcher: The hybrid architecture, with extended pretraining and RL on top of LFM2, is worth examining for efficiency-performance tradeoffs in the sub-2B active-parameter regime.
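
For a rough sense of what "8B total, 1B active" means on a device, here is a back-of-envelope sketch. It assumes the nominal counts in the model name (exactly 8e9 total and 1e9 active parameters) and uses illustrative bit widths that are not taken from the model card. It covers weight storage only and ignores the KV cache, activations, and runtime overhead.

```python
# Back-of-envelope sizing: every weight must be stored, but only the
# active subset is used for each token's forward pass.
# Assumption: nominal counts read off the model name, not measured values.
TOTAL_PARAMS = 8e9   # the "8B" in LFM2.5-8B-A1B
ACTIVE_PARAMS = 1e9  # the "A1B" (1B active) in LFM2.5-8B-A1B


def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (10^9 bytes), excluding runtime overhead."""
    return num_params * bits_per_weight / 8 / 1e9


def active_fraction(active: float, total: float) -> float:
    """Share of the weights that take part in processing a single token."""
    return active / total


if __name__ == "__main__":
    # Illustrative bit widths only; real quantized files vary per format.
    for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
        print(f"{label}: ~{weight_memory_gb(TOTAL_PARAMS, bits):.1f} GB of weights")
    print(f"active per token: {active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS):.1%}")
```

Under these assumptions, the full 8B weights still have to fit in memory, while per-token compute scales with the 1B active subset. That is why the active count matters more for speed than for storage.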

SOURCE

https://huggingface.co/LiquidAI/LFM2.5-8B-A1B
