[arXiv] score: 0.37
W4A4 HiFloat4 Quantization for Wan2.2 Video Generation Model
May 27, 2026
Tail-Aware HiFloat4 applies W4A4 (4-bit weight, 4-bit activation) post-training quantization to the linear layers of Wan2.2's transformer. It handles activation outliers with tail-aware percentile calibration, keeps boundary modules in high precision, and leaves HiFloat4 arithmetic unchanged.
cs.AI
HOW THIS AFFECTS YOU
● builder: You can reference this pipeline to deploy Wan2.2 at W4A4 precision with reduced memory and compute cost while maintaining sampling quality via selective high-precision boundary modules.
● researcher: Worth watching as a concrete PTQ recipe for large video diffusion transformers in the HiFloat4 format. Its outlier-handling calibration technique is applicable to other W4A4 quantization efforts.
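
To make the percentile-calibration idea in the summary concrete, here is a minimal sketch of choosing a clipping threshold from a high percentile of activation magnitudes instead of the maximum. This is not the paper's method. The summary does not specify the HiFloat4 number format or the exact tail-aware procedure, so the code substitutes a symmetric 4-bit integer grid as a stand-in, and the helper names (`calibrate_clip`, `fake_quant_4bit`), the 99.9 percentile, and the synthetic data are all illustrative assumptions.

```python
import random


def percentile(values, q):
    """Linearly interpolated q-th percentile (0 <= q <= 100) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1.0 - frac) + ordered[hi] * frac


def calibrate_clip(activations, q=99.9):
    """Hypothetical helper: clipping threshold = q-th percentile of |x|."""
    return percentile([abs(x) for x in activations], q)


def fake_quant_4bit(x, clip):
    """Quantize-dequantize on a symmetric 4-bit integer grid (levels -7..7).

    Assumption: this grid is only a stand-in; it is NOT the HiFloat4 format.
    """
    if clip <= 0:
        return 0.0
    scale = clip / 7.0
    level = max(-7, min(7, round(x / scale)))
    return level * scale


def mse(samples, clip):
    """Mean squared quantization error of `samples` under threshold `clip`."""
    return sum((x - fake_quant_4bit(x, clip)) ** 2 for x in samples) / len(samples)


# Synthetic calibration data: a Gaussian bulk plus a few large outliers.
rng = random.Random(0)
bulk = [rng.gauss(0.0, 1.0) for _ in range(10000)]
outliers = [20.0, -20.0, 20.0, -20.0, 20.0]
activations = bulk + outliers

clip_max = max(abs(x) for x in activations)   # max-abs calibration
clip_pct = calibrate_clip(activations, 99.9)  # percentile calibration

# Error on the bulk only: the percentile threshold spends the 15 levels
# on the dense region instead of stretching them to cover the outliers.
bulk_mse_max = mse(bulk, clip_max)
bulk_mse_pct = mse(bulk, clip_pct)

print(f"clip (max-abs)    = {clip_max:.3f}, bulk MSE = {bulk_mse_max:.4f}")
print(f"clip (percentile) = {clip_pct:.3f}, bulk MSE = {bulk_mse_pct:.4f}")
```

The trade-off is that values beyond the percentile threshold are clipped, so the rare tail incurs larger error in exchange for finer resolution on the bulk. How a real pipeline balances the two depends on the model and on the target format.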