[r/LocalLLaMA] score: 0.23
Qwen3.6-35B-A3B and 9B are officially on the public Terminal-Bench 2.0 leaderboard!
May 16, 2026
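Qwen3.6-35B-A3B scored 24.6% on Terminal-Bench 2.0, surpassing Gemini 2.5 Pro on Gemini CLI (19.6%) and Qwen3-Coder-480B (23.9%). That's a 35B MoE with only about 3B active parameters edging out a 480B model by 0.7 points on hard agentic tasks. Note that Qwen3-Coder-480B is itself an MoE (35B active), not a dense model, so the fair comparison is 3B active vs. 35B active. The 9B variant scored 9.2%, which suggests sub-10B models are starting to register on rigorous agentic benchmarks, though that is still well behind the larger models.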
discussion