[r/LocalLLaMA] score: 0.11

Qwen 3.6 35B Outperforms 27B When KV Cache Quality Is Maintained

June 4, 2026

A practitioner's hands-on comparison finds that Qwen 3.6 35B at the IQ4NXL quant outperforms the 27B model on long-context tasks when the KV cache is held at Q8, while the 27B degrades significantly on context overflow. The finding suggests that the KV cache quantization strategy matters as much as model size selection for local inference.

discussion

HOW THIS AFFECTS YOU

● builder: If you're running Qwen 3.6 locally, the KV cache quantization level has a measurable impact on effective intelligence at long contexts; the 35B with a Q8 KV cache may outperform the nominally smarter 27B.
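
To get a feel for what holding the KV cache at Q8 costs in memory, here is a rough sizing sketch. It is not from the original post. The bytes-per-element figures follow the ggml block layouts for `f16`, `q8_0`, and `q4_0`. The model dimensions in `EXAMPLE` are hypothetical placeholders, not Qwen 3.6's actual configuration. The formula also assumes every layer keeps a full standard-attention KV cache, which may not hold for the real models.

```python
# Bytes per element for llama.cpp-style cache types.
# q8_0 packs 32 int8 values plus one fp16 scale (34 bytes per 32 elements);
# q4_0 packs 32 4-bit values plus one fp16 scale (18 bytes per 32 elements).
BYTES_PER_ELEMENT = {"f16": 2.0, "q8_0": 34 / 32, "q4_0": 18 / 32}


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, cache_type):
    """Estimate KV cache size: one K and one V vector per layer per token."""
    elements = 2 * n_layers * n_kv_heads * head_dim * n_ctx
    return int(elements * BYTES_PER_ELEMENT[cache_type])


# Hypothetical dimensions for illustration only, NOT Qwen 3.6's real config.
EXAMPLE = dict(n_layers=48, n_kv_heads=4, head_dim=128, n_ctx=65536)

if __name__ == "__main__":
    for cache_type in BYTES_PER_ELEMENT:
        size_gib = kv_cache_bytes(cache_type=cache_type, **EXAMPLE) / 2**30
        print(f"{cache_type}: {size_gib:.2f} GiB")
```

Under these placeholder dimensions, Q8 takes roughly half the memory of an f16 cache and roughly twice that of Q4. That extra memory is the price of the quality the post reports. If the runtime is llama.cpp, the cache type is chosen with the `--cache-type-k` and `--cache-type-v` options.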

SOURCE

https://www.reddit.com/r/LocalLLaMA/comments/1twyoqe/you_guys_were_right_qwen_36_35b_is_goodand_kv/
