[r/LocalLLaMA]score: 0.10
Dual RTX 3060 Runs Qwen3.6-27B at 30–50 t/s for ~$400
May 26, 2026
A dual RTX 3060 setup (24GB VRAM total, ~$400) runs Qwen3.6-27B at 30–50 tokens/sec decode with MTP, outperforming a single RX 7900 XTX on prefill stability despite older PCIe 3.0 x8 bandwidth per slot.
discussion
HOW THIS AFFECTS YOU
●
builderYou can run a capable 27B model locally at usable inference speeds for under $400 in GPU hardware using consumer Nvidia cards, lowering the barrier for on-device development.
●
researcherUseful data point on multi-GPU consumer inference tradeoffs — dual 3060s beat a higher-end single AMD card on prefill consistency for this model size.