Task budgets — The Fable Cookbook

The shape

Send the beta header task-budgets-2026-03-13 and set task_budget under output_config. The budget covers the whole agentic loop: thinking, tool calls, and final output combined.

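import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
long_agentic_task = "..."  # placeholder: your own task prompt goes here

response = client.beta.messages.create(
    betas=["task-budgets-2026-03-13"],  # opt in to the task-budgets beta
    model="claude-fable-5",
    max_tokens=64000,  # hard cap on this one response
    thinking={"type": "adaptive"},
    output_config={
        "effort": "high",  # per-step depth
        "task_budget": {"type": "tokens", "total": 128000},  # whole-run budget
    },
    messages=[{"role": "user", "content": long_agentic_task}],
)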

Budget vs. ceiling

	`task_budget`	`max_tokens`
Model aware of it	Yes — sees a running countdown	No
Enforcement	Suggestion — model self-moderates	Hard per-response cap
Scope	The whole task loop	One response
Behavior at the limit	Prioritizes and wraps up gracefully	Truncates mid-thought

Use both: a generous task_budget so Fable paces itself, and max_tokens as the hard backstop.
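
One way to keep the two limits together is a small helper that builds the request arguments in one place. This is a minimal sketch, not part of any SDK: `budgeted_request` and `MIN_TASK_BUDGET` are hypothetical names, the parameter values mirror the call shown under The shape, and the 20,000 floor is the minimum this recipe states under Choosing the number.

```python
MIN_TASK_BUDGET = 20_000  # floor stated in this recipe; smaller budgets are rejected


def budgeted_request(task, task_budget, max_tokens=64_000, effort="high"):
    """Build keyword arguments carrying both a task budget and a hard cap.

    Hypothetical helper: it only assembles a dict and makes no API call.
    """
    if task_budget < MIN_TASK_BUDGET:
        raise ValueError(
            f"task_budget must be at least {MIN_TASK_BUDGET}, got {task_budget}"
        )
    return {
        "betas": ["task-budgets-2026-03-13"],
        "model": "claude-fable-5",
        "max_tokens": max_tokens,  # hard per-response backstop
        "thinking": {"type": "adaptive"},
        "output_config": {
            "effort": effort,
            # soft, model-visible budget for the whole loop
            "task_budget": {"type": "tokens", "total": task_budget},
        },
        "messages": [{"role": "user", "content": task}],
    }


kwargs = budgeted_request("Refactor the billing module.", task_budget=128_000)
print(kwargs["output_config"]["task_budget"])
```

The resulting dict can be passed along as client.beta.messages.create(**kwargs). Failing early on a too-small budget saves a round trip for a request that would error anyway.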

Choosing the number

Minimum is 20,000 — below that the request errors.
Generous for open-ended work, tighter for latency-sensitive runs. Too tight and the model completes the task less thoroughly, citing the budget as its constraint.
Don't guess in production: run the workload once unbudgeted, measure with response.usage, then set the budget from data.
Per-step depth is still effort — the budget shapes the run's total spend, not how hard each step thinks.
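
The measure-then-set advice above can be sketched as a small sizing helper. Assumptions are labeled: `suggest_task_budget` is a hypothetical function, the 25% headroom and the rounding to the nearest 1,000 are illustrative choices rather than API recommendations, and how you total a run's tokens from response.usage (for example, summing over every response in the loop) is left to your own measurement code.

```python
import math

MIN_TASK_BUDGET = 20_000  # floor stated in this recipe


def suggest_task_budget(measured_totals, headroom=1.25, round_to=1_000):
    """Suggest a budget from token totals measured on unbudgeted runs.

    measured_totals: one integer per run, the tokens that run spent.
    headroom and round_to are illustrative assumptions; tune them to taste.
    """
    if not measured_totals:
        raise ValueError("need at least one measured run")
    padded = max(measured_totals) * headroom  # size for the costliest run seen
    rounded = math.ceil(padded / round_to) * round_to
    return max(MIN_TASK_BUDGET, rounded)  # never go below the documented floor


print(suggest_task_budget([90_000, 102_400]))  # 128000
print(suggest_task_budget([5_000]))  # 20000
```

Sizing from the costliest observed run keeps the budget generous for open-ended work. For latency-sensitive runs, a lower headroom or a typical run instead of the maximum may be the better trade.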

Moral: tell the traveler how far the inn is, and they'll pace the horse themselves.

Part of a bigger loop: this recipe is one piece of running Fable 5 unattended. The full system — constitution, separation of powers, earned-trust ledgers, budgets, and injection defense — is in Build an autonomous agent with Fable 5, the nine-layer field guide.
