Software engineering

| Claim | Source |
|---|---|
| 50-million-line Ruby migration completed in one day (team scoped two months) | Stripe, via announcement |
| Highest frontier score on FrontierCode — at medium effort (chart, chart 2) | Cognition |
| "State of the art model on CursorBench" | Michael Truell, Cursor |
| Long-horizon autonomy "exceeded previous benchmarks" | Mario Rodriguez, GitHub |
| Highest on ViBench end-to-end vibe-coding, "nearly saturating base use cases" | Michele Catasta, Vibe |

Knowledge work & research

| Claim | Source |
|---|---|
| First past 90% on core analytics benchmark — 10 points over Opus; top of Finance Benchmark | Izzy Miller, Hebbia |
| "Aced trading-analysis evaluations nearly across the board" | IMC |
| Strongest on frontier physics at a third of the reasoning tokens; 36 hours ≈ GPT-5.5's four days | Matthew Pines, Notation Capital |

Vision, memory, autonomy

| Claim | Source |
|---|---|
| Completed Pokémon FireRed vision-only with a minimal harness | Announcement |
| Rebuilds web-app source from screenshots; extracts precise values from scientific figures | Announcement |
| With file memory in Slay the Spire: improved 3× more than Opus 4.8; reached final act 3× more often | Announcement |

Mythos research results

Run with safeguards lifted, by vetted partners; context is on the Mythos page:

| Claim | Source |
|---|---|
| ~10× acceleration on aspects of drug design; 9 of 14 protein targets yielded strong candidates (chart) | Announcement |
| Hypotheses preferred ~80% over Opus-class in blinded comparisons; one corroborated independently | Announcement |
| Genomics model beat a recent Science publication at 1/100th the size, from a week-long autonomous run | Announcement |
| Beat dedicated protein language models on AAV prediction (chart) | Announcement |

Reading the wall: everything above is launch-day material (Anthropic's own numbers and partner quotes from the announcement). Independent third-party evals get their own section here as they publish. If a claim has no receipt, it doesn't go up.