● builder: Cost-vs-score data helps you pick models for reasoning-heavy tasks: gpt-5.4-nano scores 68.2 at $0.01, while gpt-5.5 medium scores 95.4 at $0.10.
● researcher: Provides a complex multi-step reasoning benchmark with cost-performance tradeoffs across 15 models, including several not yet widely evaluated.
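
One way to read the two quoted data points is as a rough efficiency comparison. The sketch below is illustrative only: it uses just the two figures quoted above, the cost unit is not stated in the source, and the helper names `points_per_dollar` and `marginal_cost_per_point` are hypothetical, not part of the benchmark.

```python
# Figures quoted in the summary; the cost unit (per task, per run, ...) is not
# stated in the source, so it is treated here as an opaque dollar amount.
models = {
    "gpt-5.4-nano": {"cost_usd": 0.01, "score": 68.2},
    "gpt-5.5 medium": {"cost_usd": 0.10, "score": 95.4},
}


def points_per_dollar(entry):
    """Score divided by cost: a crude efficiency ratio (hypothetical metric)."""
    return entry["score"] / entry["cost_usd"]


def marginal_cost_per_point(cheap, pricey):
    """Extra dollars paid per extra score point when moving to the pricier model."""
    extra_cost = pricey["cost_usd"] - cheap["cost_usd"]
    extra_score = pricey["score"] - cheap["score"]
    return extra_cost / extra_score


nano = models["gpt-5.4-nano"]
medium = models["gpt-5.5 medium"]

for name, entry in models.items():
    print(f"{name}: {points_per_dollar(entry):.0f} points per dollar")

print(f"upgrade cost: ${marginal_cost_per_point(nano, medium):.4f} per extra point")
```

On these two figures alone, the cheaper model gives more score per dollar, while the pricier one buys 27.2 extra points for ten times the cost. Which matters more depends on how much accuracy the task needs.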