[arXiv] score: 0.24

EdgeRazor: A Lightweight Framework for Large Language Models via Mixed-Precision Quantization-Aware Distillation

May 7, 2026

EdgeRazor is a framework, newly released on arXiv, that combines mixed-precision quantization-aware training with knowledge distillation for edge-deployed LLMs. It targets sub-4-bit regimes, where post-training quantization (PTQ) typically collapses. Unlike prior quantization-aware distillation (QAD) methods, which require manual feature selection and teacher-specific data, EdgeRazor automates the choice of distillation targets using mixed-precision granularity. Edge ML engineers compressing LLMs for embedded or mobile inference should evaluate it against GPTQ and QLoRA baselines.
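For readers new to the two ingredients named above, the sketch below shows the generic idea in plain Python: weights are quantized and dequantized ("fake quantization") at a per-layer bit-width, and the quantized student is scored against a full-precision teacher with a temperature-softened KL loss. It is not EdgeRazor's algorithm. The layer names, bit assignments, weights, and helper functions are all hypothetical, and a real pipeline would backpropagate this loss through the quantizer rather than only evaluate it.

```python
import math

def fake_quantize(weights, bits):
    """Symmetric uniform quantize-dequantize of a list of floats (bits >= 2)."""
    if bits < 2:
        raise ValueError("this sketch needs at least 2 bits")
    qmax = 2 ** (bits - 1) - 1              # largest integer level, e.g. 7 for 4-bit
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    scale = max_abs / qmax
    # Round to the nearest level, clamp to [-qmax, qmax], then map back to floats.
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]

def softmax(logits, temperature=1.0):
    scaled = [z / temperature for z in logits]
    m = max(scaled)                          # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def linear(rows, x):
    """Matrix-vector product; each row produces one output logit."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in rows]

# Hypothetical mixed-precision plan: one bit-width per named layer.
bit_plan = {"attn_out": 4, "mlp_out": 2}

# Toy full-precision teacher weights (illustrative values only).
teacher = {
    "attn_out": [[0.9, -0.35, 0.12], [0.5, 0.2, -0.7]],
    "mlp_out": [[0.3, 0.8, -0.1], [-0.6, 0.05, 0.4]],
}

# The student shares the teacher's weights, quantized per layer according to the plan.
student = {
    name: [fake_quantize(row, bit_plan[name]) for row in rows]
    for name, rows in teacher.items()
}

x = [1.0, -0.5, 0.25]
for name in teacher:
    loss = kd_loss(linear(teacher[name], x), linear(student[name], x))
    print(f"{name}: {bit_plan[name]}-bit distillation loss = {loss:.6f}")
```

In this toy setup the loss acts as a per-layer signal of how far the quantized student has drifted from the teacher, which is the quantity that quantization-aware distillation minimizes during training.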

cs.LG, cs.AI

SOURCE

https://arxiv.org/abs/2605.04062
