[X]score: 0.30

1. Agreed w @scaling01 that Mythos appears to be better than GPT 5.5 on many metrics. 2. Mythos is definitely a major wakeup call wrt security, and will…

May 23, 2026

Unverified benchmark claims circulating on Twitter show a model called Mythos scoring 77.8% on SWE-bench Pro and 56.8% on HLE, versus 58.6% and 41.4% respectively for GPT-5.5, along with cybersecurity risks reportedly flagged by UK AISI. No architecture, origin, or release details are confirmed. If accurate, the gaps are substantial (19.2 percentage points on SWE-bench Pro and 15.4 on HLE), but the provenance of the figures has not been established.

SOURCE

https://x.com/GaryMarcus/status/2058264901194518596#m
