FableGuide · Field Notes · The Big Run

Usage & Cost Ledger

The measured numbers behind the post-mortem — where the tokens actually went, what each model cost, and the local diagnostics that name the four habits driving the burn.

Read the dollar figures as API-equivalent value, not money spent

Everything below ran on the flat-rate subscription. The $177.04 is what this work would have cost at pay-per-token API rates — the local cost panel's way of measuring intensity. No credits were bought; per network rule #1, none ever are. It's a yardstick for how heavy the run was, not a bill.

01 Session at a glance

$177.04: total API-equivalent value
4h 52m: API time (9h 56m wall-clock)
15,834: lines added · 6,153 removed
~1.6M: output tokens across all models

Wall-clock (9h 56m) is roughly double API time (4h 52m) — the signature of long, mostly-idle background sessions waiting on subagents rather than continuous compute.
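
The "roughly double" reading can be checked from the two durations above. This is a minimal sketch; the variable names are illustrative and nothing here comes from the cost panel itself.

```python
from datetime import timedelta

# Durations as reported in the session summary.
api_time = timedelta(hours=4, minutes=52)
wall_clock = timedelta(hours=9, minutes=56)

# Dividing one timedelta by another yields a plain float.
ratio = wall_clock / api_time
outside_api_share = 1 - api_time / wall_clock  # share of wall-clock not spent in API calls

print(f"wall-clock / API time: {ratio:.2f}x")
print(f"wall-clock outside API calls: {outside_api_share:.0%}")
```

About half of the elapsed time was spent outside API calls, which is consistent with sessions waiting on subagents.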

02 Where the cost went — by model

One number explains most of the run: Fable 5 was 84% of the spend. It did the bespoke design work it's built for — but it did it 78 times' worth, at roughly twice Opus's per-token rate.

Model      Input    Output   Cache read  Cache write  Value    Share
Fable 5    100.7k   1.2M     42.8M       3.5M         $148.15  84%
Opus 4.8   17.3k    367.4k   22.5M       963.3k       $27.30   15%
Haiku 4.5  1.5k     21.8k    1.4M        1.1M         $1.59    1%
Total      119.5k   ~1.6M    66.7M       5.6M         $177.04  100%
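
The per-model share of spend follows directly from the Value column. A short sketch, using only the dollar figures in the table; the dictionary and variable names are illustrative.

```python
# API-equivalent value per model, taken from the table above.
values = {"Fable 5": 148.15, "Opus 4.8": 27.30, "Haiku 4.5": 1.59}

total = sum(values.values())
shares = {model: value / total for model, value in values.items()}

for model, share in shares.items():
    print(f"{model}: {share:.1%} of ${total:.2f}")
```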

Caching worked — that's not where the money went

Fable read 42.8M tokens from cache against only 100.7k of fresh input — a cache hit rate well over 99%. The prompt cache did its job. The cost came from output volume (1.2M generated tokens of bespoke HTML/CSS) and the sheer number of large-context requests at Fable's premium rate. The lesson isn't "cache better" — it's "generate less premium output by templating the design once instead of 78 times."
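
The hit rate quoted above compares cache reads with fresh input only. The sketch below reproduces that figure from the Fable row and, as an assumption about an alternative definition, also shows the rate if cache writes are counted as non-cached input. Neither formula is taken from the cost panel.

```python
# Fable 5 token counts from the table (rounded as displayed there).
fresh_input = 100_700
cache_read = 42_800_000
cache_write = 3_500_000

# Definition used in the text: cached reads versus fresh input only.
hit_rate_vs_fresh = cache_read / (cache_read + fresh_input)

# Assumed stricter definition: cache writes also count as misses.
hit_rate_incl_writes = cache_read / (cache_read + fresh_input + cache_write)

print(f"reads vs fresh input: {hit_rate_vs_fresh:.2%}")
print(f"counting cache writes as misses: {hit_rate_incl_writes:.1%}")
```

Under either definition the large majority of input tokens came from cache, so the conclusion about output volume stands.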

03 Limit consumption

Current session: 9% used
resets 5:59am (America/Chicago); after the downshift to Opus, the fresh session sat nearly empty
Current week, all models: 42% used
resets Jun 11, 4:59am; ~10 sites' worth of work consumed nearly half the weekly cap
Current week, Sonnet only: 0% used
untouched; the entire mechanical tier the run should have used went unspent

That last gauge is the quiet indictment: 0% Sonnet usage. The cheap, abundant capacity that should have done the videos pages, sources pages, AF files and page-rewrites was never touched; all of that work went to Fable instead. The weekly cap hit 42% on a fraction of the job precisely because the work was loaded onto the most expensive model and none onto the cheap mechanical tier.
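
One way to picture the split this gauge argues for is a lookup from task kind to model tier. The sketch below is illustrative only: the task labels and the `tier_for` helper are hypothetical names, and the tier strings are plain labels, not real model identifiers.

```python
# Hypothetical labels for the mechanical work named above.
MECHANICAL_TASKS = {"videos-page", "sources-page", "af-file", "page-rewrite"}

def tier_for(task_kind: str) -> str:
    """Send mechanical tasks to the cheap tier and everything else to the premium one."""
    return "sonnet" if task_kind in MECHANICAL_TASKS else "fable"

for kind in ("page-rewrite", "bespoke-design"):
    print(f"{kind} -> {tier_for(kind)}")
```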

04 What the tooling itself flagged

The cost panel's own "what's contributing to your limits usage" diagnostics (last 24h) read like a checklist of this run's mistakes — independent confirmation of the post-mortem's root causes:

78% of usage was at >150k context

Longer sessions cost more even when cached. Each per-site agent carried a huge context. → /compact mid-task, /clear between sites.

72% came from sessions active 8+ hours

The overnight background run. Continuous usage adds up fast — fine if intentional, costly if unbounded.

55% while 4+ sessions ran in parallel

All sessions share one limit. The four concurrent design waves drained the cap together — exactly what blew the session limit mid-run.

32% came from subagent-heavy sessions

Each subagent runs its own requests. 78 site-agents is a lot of independent request streams. → spawn deliberately; give simple subagents a cheaper model.

21% came from general-purpose subagents

The default subagent type ran frequently at full model cost. → configure cheaper models / tighter prompts for routine subagents.
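
The parallelism finding suggests capping how many per-site jobs run at once. The following is a minimal sketch of that idea using `asyncio.Semaphore`. The cap of 2, the `run_site` coroutine, and the sleep that stands in for real work are all assumptions for illustration, not how the run was actually orchestrated.

```python
import asyncio

MAX_PARALLEL = 2  # hypothetical cap; the run described above used four concurrent waves

async def run_site(name: str, sem: asyncio.Semaphore, state: dict) -> None:
    async with sem:  # blocks while MAX_PARALLEL jobs are already running
        state["running"] += 1
        state["peak"] = max(state["peak"], state["running"])
        await asyncio.sleep(0.01)  # stand-in for the real per-site work
        state["running"] -= 1

async def main() -> int:
    sem = asyncio.Semaphore(MAX_PARALLEL)
    state = {"running": 0, "peak": 0}
    await asyncio.gather(*(run_site(f"site-{i}", sem, state) for i in range(8)))
    return state["peak"]

peak = asyncio.run(main())
print(f"peak concurrent jobs: {peak}")
```

The jobs still all complete, but no more than `MAX_PARALLEL` of them draw on the shared limit at the same moment.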

05 Skills & subagents

Source                      % of usage
Subagent: general-purpose   21%
Skill: /claude-api          4%
Skill: /frontend-design     3%
Plugin: frontend-design     3%
Skill: /schedule            2%
Skill: /youtube-publish     1%

Approximate, based on local sessions on ccmidwbee only — does not include other devices or claude.ai. These are independent characteristics of usage, not a clean breakdown that sums to 100%.

06 The one-line reading

$177 of API-equivalent value, 84% of it on Fable, 0% on Sonnet, 78% in >150k-context sessions, 55% with four sessions racing the same cap — for one-eighth of the job. The numbers say the same thing the post-mortem does: right model for craft, wrong model for bulk, no budget gate, too much parallelism. Fix the model mix and the cap stops being the bottleneck.
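
The missing budget gate could be as simple as a running total that refuses new work once a cap would be exceeded. This sketch is an assumption about what such a gate might look like: the `BudgetGate` class and the $50 cap are hypothetical, and the $15 estimate is the rough per-site Fable figure quoted elsewhere in this ledger.

```python
class BudgetGate:
    """Approve work only while estimated spend stays within a fixed cap."""

    def __init__(self, cap_usd: float) -> None:
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def try_spend(self, estimate_usd: float) -> bool:
        # Reject the job if it would push the running total past the cap.
        if self.spent_usd + estimate_usd > self.cap_usd:
            return False
        self.spent_usd += estimate_usd
        return True

gate = BudgetGate(cap_usd=50.0)  # hypothetical cap
approved = [gate.try_spend(15.0) for _ in range(5)]  # five sites at ~$15 each
print(approved, gate.spent_usd)
```

With these numbers the first three sites are approved and the last two are held back for review instead of running unattended.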

Spend didn't follow revenue potential

The deeper miss the dollar figures expose: every site drew the same premium Fable treatment regardless of what it can earn. A 22-rev ham-radio hobby site (austinrepeater.com) cost roughly the same ~$15 of Fable as a 91-rev real-estate site (austinhomesearches.com) — but only one of them earns that spend back. The fix is to let the rev score route the model: bespoke Fable for high-rev sites, cheap Sonnet/Haiku template stamps for low-rev ones. The full revenue-sequenced scorecard lives in the post-mortem & playbook.
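
Routing by rev score can be sketched as a single threshold check. The cutoff of 60 below is a hypothetical value chosen only so the two example sites land on opposite sides. The `Site` type, the `route` function, and the treatment labels are likewise illustrative and not part of the playbook.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the ledger gives no threshold, only the 22 and 91 examples.
HIGH_REV_THRESHOLD = 60

@dataclass(frozen=True)
class Site:
    domain: str
    rev_score: int

def route(site: Site) -> str:
    """Pick a treatment label from the site's rev score."""
    if site.rev_score >= HIGH_REV_THRESHOLD:
        return "fable-bespoke"
    return "sonnet-template"

sites = [Site("austinrepeater.com", 22), Site("austinhomesearches.com", 91)]
plan = {site.domain: route(site) for site in sites}
for domain, treatment in plan.items():
    print(f"{domain}: {treatment}")
```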