Anthropic shipped Claude Fable 5 on June 9, 2026, and it landed with a price tag that forces a real decision: the Claude Fable 5 vs Opus 4.8 question is, at its core, a money question. Fable 5 costs exactly twice as much per token as Opus 4.8. Input runs $10 per million tokens against Opus 4.8’s $5, and output runs $50 per million against $25. So before you read a single benchmark claim, the math is fixed: same vendor, same Messages API, and a 2x premium for the newer model. The interesting part is figuring out when that premium pays for itself, and when you are lighting money on fire. If you want the full background on the older model first, our guide to Claude Opus 4.8 covers what it is and where it sits in the lineup.
TL;DR
Claude Fable 5 and Opus 4.8 are the same family. Fable 5 costs exactly 2x Opus 4.8 per token ($10/$50 vs $5/$25). For most chat, code generation, and retrieval work, Opus 4.8 is the smarter buy. Reach for Fable 5 when you need very long-horizon autonomous work that stays coherent across millions of tokens. Otherwise, save the money.

Claude Fable 5 vs Opus 4.8 at a glance
Here is the side-by-side so you can see the tradeoff in one place.
| Dimension | Claude Fable 5 | Claude Opus 4.8 |
|---|---|---|
| API model id | claude-fable-5 | claude-opus-4-8 |
| Input price (per 1M tokens) | $10.00 | $5.00 |
| Output price (per 1M tokens) | $50.00 | $25.00 |
| Relative cost | 2x Opus 4.8 | Baseline |
| Context | Operates across millions of tokens; no published fixed number | 1M-token context window |
| Thinking and effort | Adaptive thinking | Adaptive thinking + effort (low/medium/high/xhigh/max) |
| Positioning | A Mythos-class model made safe for general use; most capable Anthropic has made generally available | Highly capable; was Anthropic’s most capable generally available model before Fable 5 |
| Best for | Very long-horizon autonomous work, huge migrations, multi-hour agents | Most chat, codegen, RAG, and interactive workloads |
A note on context: Anthropic has not published an exact context-window number for Fable 5. The company describes it as staying focused across millions of tokens, so treat its long-context behavior as a qualitative strength rather than a spec you can pin down. Opus 4.8, by contrast, has a documented 1M-token window. If your decision hinges on a precise context figure, Opus 4.8 is the one with a number you can quote in a design doc; Anthropic’s model overview docs list the published specs for the lineup. For a plain-language intro to the new model, our explainer on what Claude Fable 5 is covers the basics, and our breakdown of Opus 4.8 pricing goes deeper on the cost side.
Price: Fable 5 costs exactly twice as much
This is the cleanest fact in the whole comparison, so start here.
Fable 5 charges $10 per million input tokens and $50 per million output tokens. Opus 4.8 charges $5 per million input and $25 per million output. Two times the input rate, two times the output rate. There is no asterisk, no tiered discount that changes the ratio, and no long-context surcharge that tilts it. Fable 5 is 2x Opus 4.8 across the board. You can confirm the current rates on Anthropic’s pricing page.
Per 1,000 tokens, that works out to:
- Fable 5: $0.010 input, $0.050 output
- Opus 4.8: $0.005 input, $0.025 output
Those numbers look tiny in isolation, which is exactly how budgets get away from teams. The ratio is what matters once volume shows up.
Run a realistic monthly example. Say a production feature processes 200 million input tokens and generates 40 million output tokens in a month. On Opus 4.8, that is 200 x $5 + 40 x $25, or $1,000 + $1,000 = $2,000. On Fable 5, the same workload is 200 x $10 + 40 x $50, or $2,000 + $2,000 = $4,000. Same tokens, same work, $2,000 versus $4,000. The premium scales linearly with usage, so a workload that runs all day, every day, doubles its model bill the moment you swap the model string.
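The monthly arithmetic above can be sketched as a small calculator. This is a minimal sketch, assuming the per-million-token rates quoted in this article; the `RATES` table and the `monthly_cost` helper are illustrative names, not part of any SDK.

```python
# Per-million-token rates in USD, as quoted in this article.
# Assumption: check Anthropic's pricing page before relying on these values.
RATES = {
    "claude-fable-5": {"input": 10.00, "output": 50.00},
    "claude-opus-4-8": {"input": 5.00, "output": 25.00},
}


def monthly_cost(model: str, input_millions: float, output_millions: float) -> float:
    """Return the monthly bill in USD for a workload measured in millions of tokens."""
    rate = RATES[model]
    return input_millions * rate["input"] + output_millions * rate["output"]


if __name__ == "__main__":
    # The example workload: 200M input tokens and 40M output tokens per month.
    for model in RATES:
        print(model, monthly_cost(model, 200, 40))
```

Plugging in your own monthly token counts gives the size of the line item you would be doubling.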
That framing matters because the upgrade decision is not “is Fable 5 better.” It almost always is. The decision is “is Fable 5 better by enough to justify doubling this specific line item.” For a low-volume internal tool, a doubled bill might be noise. For a high-volume customer-facing endpoint, it can be the difference between a healthy margin and a red one. Price the workload, not the model. If you want a deeper cost workup on the cheaper option, see our Opus 4.8 pricing analysis; for the new model’s rate card specifically, our Claude Fable 5 pricing guide has the details.
Capability: where Fable 5 pulls ahead
Fable 5 is not a marketing reskin of Opus 4.8. It is a genuinely more capable model, and the gap shows up most clearly in long, autonomous work.
In its Claude Fable 5 announcement, Anthropic describes Fable 5 as a Mythos-class model made safe for general use, the most capable model the company has made generally available. It is built for very long-horizon autonomous work and stays focused across millions of tokens. That last phrase is the whole pitch. Plenty of models can write a good function or answer a question. Far fewer can run for hours, hold a coherent plan across an enormous amount of context, and still be making good decisions at the end.
The clearest demonstration is the Stripe result. Fable 5 performed a 50-million-line Ruby codebase migration for Stripe in a single day, work the team estimated would have taken two months or more. That is not a benchmark abstraction. It is a real migration of a real codebase at a scale where the bottleneck is sustained coherence, not raw per-token quality. A model that drifts or loses the thread after a few hundred thousand tokens cannot do that job at any price. This is the workload Fable 5 was built for.
Memory amplifies the gap. In a Slay the Spire test, giving Fable 5 persistent file memory produced a 3x improvement over Opus 4.8. The lesson generalizes well beyond a card game: when a task spans many steps and the model can write notes to itself and read them back, Fable 5 compounds that memory into markedly better outcomes over long runs. If your agent maintains a scratchpad, a plan file, or a structured memory store across a long session, that is exactly the setup where Fable 5’s long-horizon strength turns into measurable wins.
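To make the scratchpad idea concrete, here is a minimal sketch of a file-backed note store an agent loop could write to and read from between steps. The `FileMemory` class, its method names, and the JSON Lines layout are all hypothetical choices for illustration. This is not the setup used in the Slay the Spire test and not an Anthropic API.

```python
import json
from pathlib import Path
from typing import List, Optional


class FileMemory:
    """Hypothetical append-only note store backed by a JSON Lines file."""

    def __init__(self, path: str) -> None:
        self.path = Path(path)

    def append_note(self, step: int, note: str) -> None:
        # One JSON object per line keeps writes cheap and the file easy to inspect.
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps({"step": step, "note": note}) + "\n")

    def read_notes(self, last_n: Optional[int] = None) -> List[dict]:
        # Returns an empty list before the first note has been written.
        if not self.path.exists():
            return []
        with self.path.open(encoding="utf-8") as f:
            notes = [json.loads(line) for line in f if line.strip()]
        if last_n is None:
            return notes
        return notes[-last_n:] if last_n > 0 else []
```

An agent loop would call `append_note` after each step and feed the result of `read_notes(last_n=...)` back into the next prompt, so the plan survives even when earlier turns fall out of the conversation.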
On benchmarks, Fable 5 reached state-of-the-art placements on nearly every evaluation cited at launch. It topped FrontierCode and FrontierBench from Cognition, CursorBench, and Hebbia’s Finance Benchmark, among others. Anthropic has not released public scores for these, so treat the placements as directional rather than as numbers to quote. The pattern is consistent: Fable 5 leads on coding, agentic, and finance-flavored evaluations. The takeaway is not a leaderboard delta; it is that the model lands at or near the top of the kinds of hard, multi-step tasks that map to expensive real-world work.
One more thing worth knowing for a fair comparison: Fable 5 ships with safeguards that route certain sensitive queries (cybersecurity, biology and chemistry, and model-distillation requests) to Opus 4.8 instead of answering directly. Anthropic says this triggers in under 5% of sessions, so most workloads will never see it, but it is a real behavioral difference. For a head-to-head with other vendors, our Opus 4.8 versus GPT-5.5 and Gemini 3.5 comparison and the matching Fable 5 versus GPT-5.5 and Gemini 3.5 piece put each model against the broader field.
Where Opus 4.8 is the smarter buy
Here is the part the launch announcements tend to skip: for a large share of production work, Opus 4.8 is the better economic choice, and it is not close.
Opus 4.8 was Anthropic’s most capable generally available model before Fable 5 arrived. It did not get worse the day Fable 5 shipped. It is still a strong, frontier-tier model with a documented 1M-token context window, adaptive thinking, and the full effort range from low through max. For most chat interfaces, most code generation, and most retrieval-augmented generation, Opus 4.8 produces excellent results at half the per-token cost. If the task fits comfortably in a million tokens and resolves in one turn or a short loop, you are very likely paying double for headroom you will not use when you pick Fable 5.
Three workload shapes lean hard toward Opus 4.8:
- Interactive chat and assistants, where each turn is short, latency matters, and the model rarely needs to sustain a multi-hour plan.
- Code generation and review at the function, file, or pull-request level, where the context is bounded and the task completes quickly.
- RAG and document Q&A, where you stuff relevant context into a 1M-token window and ask a focused question. The window is the asset here, and Opus 4.8 has a documented one.
There is also a quietly persuasive argument hiding in Fable 5’s own design. When Fable 5 hits one of its safeguard categories, it routes the query to Opus 4.8. The newer model literally falls back to the older one for the sensitive cases. That is a strong signal that Opus 4.8 is trusted, capable, and good enough to stand in for the flagship on real traffic. If it is the safety net under the most expensive model Anthropic sells, it is more than adequate for the bulk of your everyday requests.
The cost-sensitive default, then, is straightforward: start on Opus 4.8, measure, and upgrade only the specific workloads that prove they need long-horizon autonomy. If even Opus 4.8 is more than a workload needs, Claude Sonnet 4.6 sits below it at $3 per million input tokens and $15 per million output tokens, and handles high-volume, simpler tasks at a fraction of the cost. For setup details on the cheaper Claude tiers, our Opus 4.8 API guide walks through the calls.
A decision framework: which should you choose?
Skip the vibes and route by workload. These rules cover most real cases.
- Short, single-turn tasks (chat, classification, extraction, a quick code snippet): Use Opus 4.8. The 2x premium buys you nothing here because the task never exercises Fable 5’s long-horizon advantage.
- Bounded code generation and review (a function, a file, a PR): Use Opus 4.8. Strong results, half the cost, and the context fits.
- RAG, document Q&A, and analysis inside a 1M-token window: Use Opus 4.8. The documented million-token window is the feature you are paying for, and Opus 4.8 has it.
- Multi-hour autonomous agents that must stay coherent across a very long run: Use Fable 5. This is the workload it was built for, and the coherence gap justifies the price.
- Huge migrations and refactors that span an enormous codebase in one go: Use Fable 5. The Stripe 50-million-line migration is the template. At that scale, sustained focus is the bottleneck, and Fable 5 clears it.
- Long-running agents with persistent memory: Use Fable 5. The 3x memory result says the compounding payoff over long sessions is real.
- Cost is the hard constraint: Use Opus 4.8, or drop to Sonnet 4.6 for high-volume simple work. Reserve Fable 5 for the few jobs that genuinely need it.
The meta-rule: default to Opus 4.8, then promote individual workloads to Fable 5 only when they demonstrate a need for long-horizon autonomy. Doubling cost across the board because one job benefits is the most common way teams overspend on a flagship model.
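The rules above can be sketched as a small routing function. This is one possible encoding, not an official policy: the `Workload` fields and `choose_model` are hypothetical names, and letting a hard cost constraint override the long-horizon rules is an assumption about how you would rank them.

```python
from dataclasses import dataclass

FABLE = "claude-fable-5"
OPUS = "claude-opus-4-8"


@dataclass
class Workload:
    # Hypothetical descriptors; set the ones that match your job.
    multi_hour_agent: bool = False
    huge_migration: bool = False
    long_run_with_memory: bool = False
    cost_is_hard_constraint: bool = False


def choose_model(w: Workload) -> str:
    """Default to Opus 4.8; promote only long-horizon workloads to Fable 5."""
    # Assumption: a hard cost ceiling wins over every other signal.
    if w.cost_is_hard_constraint:
        return OPUS
    if w.multi_hour_agent or w.huge_migration or w.long_run_with_memory:
        return FABLE
    # Chat, bounded codegen, and RAG inside a 1M-token window all land here.
    return OPUS
```

Because every field defaults to `False`, a workload nobody has classified yet stays on the cheaper model, which matches the meta-rule.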
Switching between them in code
The good news for anyone weighing this decision: switching is trivial. Both models live behind the same Messages API. There is no SDK migration, no new auth flow, no reshaped request body. You change the model id string and nothing else: use claude-fable-5 for the new model and claude-opus-4-8 for the cheaper one.
import anthropic

# Reads the API key from the ANTHROPIC_API_KEY environment variable
client = anthropic.Anthropic()

# Cheaper, frontier-tier default
response = client.messages.create(
    model="claude-opus-4-8",  # swap to "claude-fable-5" for the flagship
    max_tokens=16000,
    thinking={"type": "adaptive"},
    messages=[{"role": "user", "content": "Summarize this design doc and list open questions."}],
)

# Print only the text blocks from the response
for block in response.content:
    if block.type == "text":
        print(block.text)
Because the only difference is the model string, you can route per request. Send everyday traffic to claude-opus-4-8, and flip the string to claude-fable-5 for the handful of jobs that need long-horizon autonomy, all from the same client and the same code path. That makes a default-cheap, upgrade-on-demand strategy easy to implement: a single config value or a one-line conditional decides which model handles a given request. For the full request surface on the older model, see our Opus 4.8 API walkthrough; the matching Fable 5 API guide covers the new model.
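The per-request routing described above might look like the sketch below. It only builds the keyword arguments, so it runs without network access. The `build_request` helper, the `long_horizon` flag, and the `CLAUDE_MODEL_OVERRIDE` environment variable are all hypothetical names chosen for illustration; the request fields mirror the example earlier in this section.

```python
import os

DEFAULT_MODEL = "claude-opus-4-8"
FLAGSHIP_MODEL = "claude-fable-5"


def build_request(prompt: str, long_horizon: bool = False) -> dict:
    """Build keyword arguments for client.messages.create()."""
    # Hypothetical env var: lets ops pin a model without a code change.
    override = os.environ.get("CLAUDE_MODEL_OVERRIDE")
    model = override or (FLAGSHIP_MODEL if long_horizon else DEFAULT_MODEL)
    return {
        "model": model,
        "max_tokens": 16000,
        "thinking": {"type": "adaptive"},
        "messages": [{"role": "user", "content": prompt}],
    }


# Usage (needs the anthropic package and an API key):
#   import anthropic
#   client = anthropic.Anthropic()
#   response = client.messages.create(**build_request("Plan the migration.", long_horizon=True))
```

Keeping the model choice in one function means the default-cheap policy lives in a single place rather than being scattered across call sites.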
Compare them yourself with Apidog
Pricing tables and benchmark claims only get you so far. The honest way to settle the Claude Fable 5 vs Opus 4.8 question for your workload is to fire the same prompt at both model ids and look at what comes back. That is a job Apidog handles cleanly.
Set up one request against the Anthropic Messages API, then duplicate it and change only the model field: claude-fable-5 in one, claude-opus-4-8 in the other. Send both with a prompt that actually resembles your production traffic, not a toy question. Then compare the two responses side by side: which answer is more correct, which is more complete, and whether the quality gap is large enough to matter for your use case.
Apidog also surfaces the numbers that drive the cost decision. You can watch the latency on each call and read the token usage straight from each response, including the input and output counts that determine what you will actually pay. Put the usage from both models next to the quality difference, and the 2x premium stops being abstract. You can see, for your real prompt, whether Fable 5’s output is worth the extra tokens and dollars or whether Opus 4.8 already does the job. Save the two requests as a small collection and you have a repeatable A/B harness you can rerun whenever your prompts change or a new model lands. If you want to try it, download Apidog and build the two requests in a few minutes. It is a faster path to a confident answer than reading one more spec sheet, and Apidog keeps the whole comparison in one place.
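If you would rather script the same comparison, the sketch below records latency and token usage for one prompt against both model ids. It is a minimal sketch under stated assumptions: `compare_models` is a hypothetical helper, and `send` is a callable you supply that is assumed to return a dict with `text`, `input_tokens`, and `output_tokens` keys.

```python
import time
from typing import Callable, Dict, List

MODELS = ["claude-fable-5", "claude-opus-4-8"]


def compare_models(prompt: str, send: Callable[[str, str], Dict]) -> List[Dict]:
    """Send one prompt to each model and record latency and token usage."""
    results = []
    for model in MODELS:
        start = time.perf_counter()
        reply = send(model, prompt)  # hypothetical sender supplied by the caller
        elapsed = time.perf_counter() - start
        results.append({
            "model": model,
            "latency_s": elapsed,
            "input_tokens": reply["input_tokens"],
            "output_tokens": reply["output_tokens"],
            "text": reply["text"],
        })
    return results
```

With the Anthropic Python SDK, `send` could wrap `client.messages.create` and read the counts from `response.usage.input_tokens` and `response.usage.output_tokens`. Injecting the sender keeps the harness testable with a stub before you spend real tokens.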




