[arXiv] score: 0.52
Polynomial Context-Truncation Sensitivity in Autoregressive Language Models: Sequential Wyner-Ziv Bounds for KV Cache Compression
May 26, 2026
Context-truncation sensitivity in autoregressive LMs (0.5B–3B parameters, two model families) follows a power law rather than an exponential decay. Under this polynomial-sensitivity assumption, sliding-window KV cache schemes achieve the rate-distortion-optimal per-token memory bound derived from sequential Wyner-Ziv coding theory.
cs.IT, cs.AI, cs.LG, math.IT
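One way to picture the power-law versus exponential distinction in the summary is to compare two straight-line fits: a power law is linear in log-log space, while an exponential decay is linear in semi-log space. The sketch below is illustrative only and is not taken from the paper. The names `linear_fit` and `compare_decay_models`, the synthetic curve, and the constants 3.0 and 0.7 are all assumptions, and "sensitivity" here stands in for whatever truncation-sensitivity measure the paper actually uses.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return a, b, 1.0 - ss_res / ss_tot


def compare_decay_models(windows, sensitivity):
    """Fit both decay models to (window length, sensitivity) pairs."""
    log_s = [math.log(s) for s in sensitivity]
    log_w = [math.log(w) for w in windows]
    # Power law: log s = a - alpha * log w (straight line in log-log space).
    _, slope_pl, r2_pl = linear_fit(log_w, log_s)
    # Exponential: log s = a - lam * w (straight line in semi-log space).
    _, slope_ex, r2_ex = linear_fit(list(windows), log_s)
    return {
        "alpha": -slope_pl,
        "r2_power_law": r2_pl,
        "lambda": -slope_ex,
        "r2_exponential": r2_ex,
    }


# Synthetic data (assumed, not from the paper): s(w) = 3.0 * w^(-0.7).
windows = [2 ** i for i in range(4, 12)]  # window lengths 16 .. 2048
sensitivity = [3.0 * w ** -0.7 for w in windows]

result = compare_decay_models(windows, sensitivity)
print(result)
```

On this synthetic curve the log-log fit recovers the exponent and has the higher R², which is the kind of evidence that would favor a polynomial over an exponential sensitivity model. Real measurements would be noisy, so a comparison like this would be a first check rather than a conclusive test.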