[arXiv] score: 0.36

ADMM-Q: An Improved Hessian-based Weight Quantizer for Post-Training Quantization of Large Language Models

May 13, 2026

ADMM-Q is a combinatorial, ADMM-based post-training quantization algorithm for layer-wise weight quantization in LLMs. It aims to outperform GPTQ and RTN at sub-4-bit precision, where those methods degrade significantly. Its Hessian-guided optimization framework addresses the combinatorial structure of quantization directly. Practitioners compressing large models for edge deployment or lower inference cost should evaluate it against GPTQ baselines.
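
For orientation, the sketch below shows the kind of baseline and error measure the summary refers to. It is not the ADMM-Q algorithm, which the summary does not describe. It assumes the layer-wise objective common in GPTQ-style work, where the error of a quantized weight row q against the original row w is (w - q)^T H (w - q), with H = X X^T built from calibration inputs. The function names (`rtn_quantize`, `hessian_proxy`, `layer_error`), the symmetric uniform grid, and the toy numbers are all illustrative assumptions.

```python
def rtn_quantize(weights, bits):
    """Round-to-nearest (RTN) baseline on a symmetric uniform grid.

    Assumption for illustration: one scale per row, integer levels in
    [-(2**(bits-1) - 1), 2**(bits-1) - 1].
    """
    if bits < 2:
        raise ValueError("this sketch needs at least 2 bits")
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return [0.0 for _ in weights]
    scale = max_abs / qmax
    # Python's round() rounds exact halves to the nearest even integer.
    return [scale * max(-qmax, min(qmax, round(w / scale))) for w in weights]


def hessian_proxy(samples):
    """H = X X^T, accumulated over calibration input vectors."""
    d = len(samples[0])
    return [[sum(x[i] * x[j] for x in samples) for j in range(d)]
            for i in range(d)]


def layer_error(w, q, hessian):
    """Layer-wise reconstruction error (w - q)^T H (w - q)."""
    e = [a - b for a, b in zip(w, q)]
    n = len(e)
    return sum(e[i] * hessian[i][j] * e[j] for i in range(n) for j in range(n))


if __name__ == "__main__":
    w = [0.9, -0.4, 0.1]                      # toy weight row
    calib = [[1.0, 0.0, 0.0],                 # toy calibration inputs
             [0.0, 1.0, 0.0],
             [0.0, 0.0, 1.0]]
    H = hessian_proxy(calib)
    for bits in (2, 4):
        q = rtn_quantize(w, bits)
        print(bits, q, layer_error(w, q, H))
```

On this toy row the 2-bit RTN error is much larger than the 4-bit error, which is the low-bit degradation the summary mentions. A Hessian-guided quantizer would choose q to reduce `layer_error` directly instead of rounding each weight independently.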

cs.LG

SOURCE

https://arxiv.org/abs/2605.11222
