Read the dollar figures as API-equivalent value, not money spent
Everything below ran on the flat-rate subscription. The $177.04 is what this work would have cost at pay-per-token API rates — the local cost panel's way of measuring intensity. No credits were bought; per network rule #1, none ever are. It's a yardstick for how heavy the run was, not a bill.
01 Session at a glance
Wall-clock (9h 56m) is roughly double API time (4h 52m) — the signature of long, mostly-idle background sessions waiting on subagents rather than continuous compute.
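The "roughly double" claim can be checked with a few lines of arithmetic. This is a minimal sketch using only the two durations quoted above; the helper name `to_minutes` is illustrative and not part of any tool.

```python
def to_minutes(hours: int, minutes: int) -> int:
    """Convert an h/m duration to whole minutes."""
    return hours * 60 + minutes

wall_clock_min = to_minutes(9, 56)   # wall-clock duration from the report
api_time_min = to_minutes(4, 52)     # API time from the report

ratio = wall_clock_min / api_time_min
idle_min = wall_clock_min - api_time_min
idle_share = idle_min / wall_clock_min

print(f"wall-clock / API time: {ratio:.2f}x")
print(f"time not spent in API calls: {idle_min // 60}h {idle_min % 60}m ({idle_share:.0%})")
```

On these figures the ratio comes out at about 2.04, so a little over half of the wall-clock time was spent outside API calls.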
02 Where the cost went — by model
One number explains most of the run: Fable 5 was 84% of the spend. It did the bespoke design work it's built for — but it did it 78 times' worth, at roughly twice Opus's per-token rate.
| Model | Input | Output | Cache read | Cache write | Value | Share |
|---|---|---|---|---|---|---|
| Fable 5 | 100.7k | 1.2M | 42.8M | 3.5M | $148.15 | 84% |
| Opus 4.8 | 17.3k | 367.4k | 22.5M | 963.3k | $27.30 | 15% |
| Haiku 4.5 | 1.5k | 21.8k | 1.4M | 1.1M | $1.59 | 1% |
| Total | 119.5k | ~1.6M | 66.7M | 5.6M | $177.04 | 100% |
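The per-model share follows directly from the Value column. A short sketch of that arithmetic, using only the dollar figures in the table (the variable names are illustrative):

```python
# API-equivalent value per model in USD, copied from the table above.
value_usd = {"Fable 5": 148.15, "Opus 4.8": 27.30, "Haiku 4.5": 1.59}

total = round(sum(value_usd.values()), 2)

# Share of the total for each model, as a percentage to one decimal place.
share_pct = {model: round(100 * v / total, 1) for model, v in value_usd.items()}

for model, pct in share_pct.items():
    print(f"{model:<10} ${value_usd[model]:>7.2f}  {pct:>5.1f}%")
print(f"{'Total':<10} ${total:>7.2f}")
```

The three values sum to the $177.04 total, and Fable's share rounds to the 84% quoted in the text.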
Caching worked — that's not where the money went
Fable read 42.8M tokens from cache against only 100.7k of fresh input: counting just those two, about 99.8% of the input it read came from cache (the 3.5M of cache writes sit in their own column). The prompt cache did its job. The cost came from output volume (1.2M generated tokens of bespoke HTML/CSS) and the sheer number of large-context requests at Fable's premium rate. The lesson isn't "cache better"; it's "generate less premium output by templating the design once instead of 78 times."
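How high the hit rate looks depends on what goes in the denominator. The sketch below computes it two ways from Fable's row in the table. Which definition the cost panel itself uses is not stated in the source, so both are shown as assumptions.

```python
# Fable 5 token counts from the by-model table.
fresh_input = 100_700
cache_read = 42_800_000
cache_write = 3_500_000

# Definition A (assumed): cache reads versus reads plus fresh, uncached input.
hit_vs_fresh = cache_read / (cache_read + fresh_input)

# Definition B (assumed): also count tokens written to the cache as misses.
hit_vs_all_input = cache_read / (cache_read + fresh_input + cache_write)

print(f"reads vs fresh input only:     {hit_vs_fresh:.1%}")
print(f"reads vs fresh + cache writes: {hit_vs_all_input:.1%}")
```

Under either definition the large majority of input tokens were served from cache, which is consistent with the point that output volume, not caching, drove the cost.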
03 Limit consumption
That last gauge is the quiet indictment: 0% Sonnet usage. The cheap, abundant capacity that should have done the videos pages, sources pages, AF files and page-rewrites was never touched — all of it went to Fable instead. The weekly cap hit 42% on a fraction of the job precisely because the work was loaded onto the most expensive model and none onto the cheapest.
04 What the tooling itself flagged
The cost panel's own "what's contributing to your limits usage" diagnostics (last 24h) read like a checklist of this run's mistakes — independent confirmation of the post-mortem's root causes:
Longer sessions cost more even when cached. Each per-site agent carried a huge context. → /compact mid-task, /clear between sites.
The overnight background run. Continuous usage adds up fast — fine if intentional, costly if unbounded.
All sessions share one limit. The four concurrent design waves drained the cap together — exactly what blew the session limit mid-run.
Each subagent runs its own requests. 78 site-agents is a lot of independent request streams. → spawn deliberately; give simple subagents a cheaper model.
general-purpose subagents. The default subagent type ran frequently at full model cost. → configure cheaper models / tighter prompts for routine subagents.
05 Skills & subagents
| Source | % of usage |
|---|---|
| Subagent: general-purpose | 21% |
| Skill: /claude-api | 4% |
| Skill: /frontend-design | 3% |
| Plugin: frontend-design | 3% |
| Skill: /schedule | 2% |
| Skill: /youtube-publish | 1% |
Approximate, based on local sessions on ccmidwbee only — does not include other devices or claude.ai. These are independent characteristics of usage, not a clean breakdown that sums to 100%.
06 The one-line reading
$177 of API-equivalent value, 84% of it on Fable, 0% on Sonnet, 78% in >150k-context sessions, 55% with four sessions racing the same cap — for one-eighth of the job. The numbers say the same thing the post-mortem does: right model for craft, wrong model for bulk, no budget gate, too much parallelism. Fix the model mix and the cap stops being the bottleneck.
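One way to picture the missing budget gate and the parallelism limit is a small admission check that runs before each site-agent starts. This is a hypothetical sketch, not a feature of any existing tool: the class name `BudgetGate`, the cap, and the concurrency limit are all assumptions chosen for illustration.

```python
class BudgetGate:
    """Hypothetical gate: admit a job only if budget and a parallel slot remain."""

    def __init__(self, cap_usd: float, max_parallel: int):
        self.cap_usd = cap_usd            # assumed API-equivalent budget for the run
        self.max_parallel = max_parallel  # assumed limit on concurrent sessions
        self.committed_usd = 0.0          # estimates for running jobs plus actuals for finished ones
        self.running = 0

    def try_start(self, estimated_usd: float) -> bool:
        """Reserve budget and a slot, or refuse without changing state."""
        if self.running >= self.max_parallel:
            return False
        if self.committed_usd + estimated_usd > self.cap_usd:
            return False
        self.running += 1
        self.committed_usd += estimated_usd
        return True

    def finish(self, estimated_usd: float, actual_usd: float) -> None:
        """Release the slot and replace the estimate with the actual cost."""
        self.running -= 1
        self.committed_usd += actual_usd - estimated_usd


gate = BudgetGate(cap_usd=40.0, max_parallel=2)
first = gate.try_start(15.0)    # admitted
second = gate.try_start(15.0)   # admitted
third = gate.try_start(5.0)     # refused: both parallel slots are in use
gate.finish(15.0, 15.0)
fourth = gate.try_start(15.0)   # refused: 30 + 15 would exceed the 40 cap
fifth = gate.try_start(10.0)    # admitted: 30 + 10 fits the cap
```

The design choice is to refuse work up front rather than discover the overrun afterwards, so an unbounded overnight run stops once the cap is reached.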
Spend didn't follow revenue potential
The deeper miss the dollar figures expose: every site drew the same premium Fable treatment regardless of what it can earn. A 22-rev ham-radio hobby site (austinrepeater.com) cost roughly the same ~$15 of Fable as a 91-rev real-estate site (austinhomesearches.com) — but only one of them earns that spend back. The fix is to let the rev score route the model: bespoke Fable for high-rev sites, cheap Sonnet/Haiku template stamps for low-rev ones. The full revenue-sequenced scorecard lives in the post-mortem & playbook.
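Routing by rev score can be sketched as a single threshold check. The source does not say where "high-rev" begins, so the cutoff of 60 below is a hypothetical value, and the `Site` type, `route_model` function, and route labels are illustrative names only.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the report does not define where "high-rev" starts.
HIGH_REV_THRESHOLD = 60


@dataclass(frozen=True)
class Site:
    domain: str
    rev_score: int


def route_model(site: Site, threshold: int = HIGH_REV_THRESHOLD) -> str:
    """Pick bespoke Fable work for high-rev sites, a cheap template stamp otherwise."""
    if site.rev_score >= threshold:
        return "fable-bespoke"
    return "sonnet-haiku-template"


# The two example sites named in the text above.
sites = [
    Site("austinrepeater.com", 22),
    Site("austinhomesearches.com", 91),
]
routes = {site.domain: route_model(site) for site in sites}

for domain, route in routes.items():
    print(f"{domain}: {route}")
```

With this cutoff the 22-rev hobby site gets the template route and the 91-rev real-estate site keeps the bespoke treatment. A real policy would take its threshold from the revenue-sequenced scorecard.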