
OpenAI CRO on Evals Crisis, Scaling Laws, and Long-Horizon AI

June 25, 2026

OpenAI Chief Research Officer Mark Chen discusses why evals are in crisis and how benchmark-maxing undermines progress. He also covers OpenAI's compute allocation strategy and research roadmap priorities, including multimodal reasoning and long-horizon task completion.

HOW THIS AFFECTS YOU

● Researcher: Chen's framing of the evals crisis and the benchmark-maxing problem is worth tracking, as it signals where OpenAI sees measurement methodology breaking down.

● Founder: Worth watching because OpenAI's stated research bets on long-horizon real-world tasks signal where the next capability jumps, and the competitive threats that come with them, are likely to emerge.

