[r/LocalLLaMA]score: 0.09
Qwen3.6 Q6 Quant Enables Viable Local Coding Agents on Dual 3090s
May 27, 2026
A user reports Qwen3.6 at Q6 quantization running via llama.cpp on dual RTX 3090s achieves 20–50 tokens/second with MTP enabled, delivering quality comparable to paid APIs for coding agent tasks.
discussion
HOW THIS AFFECTS YOU
●
builderYou can run a competitive local coding agent on dual consumer GPUs using Qwen3.6 Q6 + llama.cpp with MTP, potentially replacing paid API calls for latency-sensitive or privacy-sensitive workflows.