Claude Opus 4.8 leads SWE-rebench with a 56.5% success rate
July 1, 2026
The SWE-rebench leaderboard update shows Claude Opus 4.8 xhigh achieving 56.5% on software engineering tasks, followed by GLM-5.2 at 51.1%. Local models like Qwen3.6-27B show competitive performance for self-hosted coding agents.
HOW THIS AFFECTS YOU
● Builder: You can use these benchmark scores to select models for autonomous coding agents.
● Researcher: These results provide updated performance baselines for software engineering agent evaluations.