The first independent Fable 5 numbers arrive

Day two brought the first non-Anthropic data points. BenchLM places Fable 5 at #2 of 123 models on its provisional leaderboard (96/100 overall), with top-tier coding and agentic-tool-use scores — and a notably weaker #18 placement on multimodal/grounded tasks (79), the first independent wrinkle in the vision story. Reported SWE-Bench Verified: 95.0%.

Digital Today reports sustained 12-hour autonomous runs — consistent with the long-horizon claims, and the kind of duration that matters more than any single score for agentic work. Tom's Hardware and VentureBeat round out the mainstream technical coverage.

Caveats apply: provisional leaderboards move, and methodologies vary. These go on the evidence wall as independent-but-early. The week-long hands-on below remains the most thorough third-party test published so far.
