What Is Claude Sonnet 5? Features, Benchmarks, and Pricing

Claude Sonnet 5 is Anthropic’s newest mid-tier model, released on June 30, 2026. Anthropic calls it “the best combination of speed and intelligence” and “the most agentic Sonnet model yet.” The short version: it gets close to Opus 4.8 on agentic and tool-use tasks while costing far less. This guide covers what Sonnet 5 is, its full specs, the launch benchmarks, pricing, availability, and who should use it. If you plan to call the model over HTTP, you can test those requests in Apidog as you go.

Every section here links to a focused deep dive, so treat this page as the map and follow the links when you need detail on the API, pricing, or a head-to-head with Opus 4.8.

What Claude Sonnet 5 is

Claude Sonnet 5 is the successor to Claude Sonnet 4.6. Its API model ID is the exact string claude-sonnet-5, with no date suffix. It sits in the Sonnet tier, which Anthropic positions between the smaller Haiku models and the larger Opus and Fable models.

The headline story is about value. On tasks where the model uses tools, runs in a loop, or acts as an agent, Sonnet 5 lands within a few points of Opus 4.8. On pure reasoning with nothing to lean on, Opus still leads. So Sonnet 5 is the model you reach for when you want strong agentic performance without paying Opus prices.

It is also close to a drop-in upgrade for Sonnet 4.6. You change the model ID, then review three behavior changes and one tokenizer change. We cover those below and in the dedicated Sonnet 5 vs Sonnet 4.6 comparison.

Full specs

Here is what you get with claude-sonnet-5:

Spec	Value
Context window	1,000,000 tokens (1M)
Max output	128,000 tokens (128K)
Adaptive thinking	On by default
Effort parameter	low / medium / high / xhigh
Vision, prompt caching, tool use, batch, structured outputs	Supported
Priority Tier	Not available
Zero data retention (ZDR)	Supported for orgs with a ZDR agreement

A few notes on these:

The 1M context window is both the default and the maximum. There is no smaller context variant to choose from.
Adaptive thinking is on by default. This is a change from Sonnet 4.6, where sending no thinking field meant no thinking happened at all.
The effort parameter controls how much the model thinks and spends. You set it to low, medium, high, or xhigh depending on how hard you want the model to work.
The feature set matches Sonnet 4.6, with one exception: Priority Tier is not available on Sonnet 5.

For the full request and response shape, see the step-by-step API guide and Anthropic’s models overview.

The three behavior changes and the new tokenizer

If you are moving from Sonnet 4.6, three things changed at the API level. Miss them and your requests can return a 400 error or behave differently than before.

Adaptive thinking is on by default. Requests with no thinking field now run with adaptive thinking. To turn it off, send thinking: {type: "disabled"}. Because max_tokens caps the total output (thinking tokens plus response text), revisit max_tokens for workloads that used to run without thinking.
Manual extended thinking is removed. Sending thinking: {type: "enabled", budget_tokens: N} now returns a 400 error. Use adaptive thinking and the effort parameter instead.
Non-default sampling parameters are rejected. Setting temperature, top_p, or top_k to a non-default value returns a 400 error. Remove them when you migrate, and steer behavior through system-prompt instructions instead.

Assistant-message prefilling is still not supported and returns a 400, same as on Sonnet 4.6. Use structured outputs or system-prompt instructions to shape the response.
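The three changes and the prefill rule can be folded into a small pre-flight check that runs before you send anything. The sketch below is one way to do that, not an official migration tool. It assumes your request body is a plain dict in Messages API shape, and the names migrate_request and SAMPLING_KEYS are hypothetical.

```python
from copy import deepcopy

# Sampling parameters that return a 400 when set to a non-default value.
SAMPLING_KEYS = ("temperature", "top_p", "top_k")


def migrate_request(old: dict) -> tuple[dict, list[str]]:
    """Return a Sonnet 5 copy of a Sonnet 4.6 request body, plus review notes."""
    new = deepcopy(old)  # never mutate the caller's request
    notes: list[str] = []

    new["model"] = "claude-sonnet-5"

    # Drop sampling parameters entirely; steer via the system prompt instead.
    for key in SAMPLING_KEYS:
        if key in new:
            del new[key]
            notes.append(f"removed {key}")

    # Manual extended thinking (type "enabled" with budget_tokens) is rejected.
    thinking = new.get("thinking")
    if isinstance(thinking, dict) and thinking.get("type") == "enabled":
        del new["thinking"]
        notes.append("removed manual thinking; adaptive thinking now applies")
    elif thinking is None:
        notes.append("no thinking field; adaptive thinking is on, revisit max_tokens")

    # A trailing assistant message is a prefill, which is rejected. Flag it only.
    messages = new.get("messages", [])
    if messages and messages[-1].get("role") == "assistant":
        notes.append("last message is an assistant prefill; remove it")

    return new, notes
```

The helper only flags a prefill rather than deleting it, because dropping a message silently can change what the prompt means. Treat the returned notes as a review checklist, not as proof the request is valid.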

There is one more change that is easy to miss because it does not touch the API shape. Sonnet 5 uses a new tokenizer. The same input text produces roughly 30% more tokens than on Sonnet 4.6, about 1.3 times as many. Nothing about your request, response, or streaming code changes. But anything you measure or budget in tokens shifts:

The usage fields and token-counting results are higher for the same text. Recount against Sonnet 5 instead of reusing your 4.6 numbers.
The 1M window holds less text on average, since each token now covers less text.
max_tokens budgets sized near your expected output may now truncate. Revisit them.
The per-request cost of equivalent text can rise even though the per-token price has not moved.
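To get a feel for how far those budgets move, you can run the arithmetic before recounting for real. This is a rough sketch that assumes the 30% average applies evenly to your text, which it may not. The function names are made up for illustration, and a proper token count against Sonnet 5 should replace the estimate once you have one.

```python
EXTRA_TOKEN_PERCENT = 30      # the article's rough average; measure your own text
CONTEXT_WINDOW = 1_000_000    # Sonnet 5 context window, in tokens
MAX_OUTPUT = 128_000          # Sonnet 5 max output, in tokens


def estimate_sonnet5_tokens(tokens_on_46: int, extra_percent: int = EXTRA_TOKEN_PERCENT) -> int:
    """Estimate the Sonnet 5 count for text measured on Sonnet 4.6 (rounded up)."""
    # Integer ceiling division avoids float rounding surprises.
    return (tokens_on_46 * (100 + extra_percent) + 99) // 100


def largest_46_count_that_fits(extra_percent: int = EXTRA_TOKEN_PERCENT) -> int:
    """Largest Sonnet 4.6 token count whose estimate still fits the 1M window."""
    return CONTEXT_WINDOW * 100 // (100 + extra_percent)


def scaled_max_tokens(old_max_tokens: int, extra_percent: int = EXTRA_TOKEN_PERCENT) -> int:
    """Scale an old max_tokens budget for the tokenizer only, capped at max output.

    Thinking tokens also count against max_tokens, so add headroom for them separately.
    """
    return min(estimate_sonnet5_tokens(old_max_tokens, extra_percent), MAX_OUTPUT)
```

Under the default 30% assumption, 1,000 tokens on Sonnet 4.6 becomes an estimated 1,300 on Sonnet 5, and a 4,096-token max_tokens budget scales to 5,325.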

The what’s new page documents each of these, and the token counting docs show how to measure them.

Benchmark snapshot

The numbers below are Anthropic’s reported figures from launch. Launch-day writeups repeat the same figures, but none of that is independent testing, so treat them as vendor-reported results.

Benchmark	Sonnet 5	Opus 4.8	Sonnet 4.6
SWE-bench Pro (agentic coding)	63.2%	69.2%	58.1%
Terminal-Bench 2.1	80.4%	82.7%	Not reported
OSWorld-Verified (computer use)	81.2%	83.4%	78.5%

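The pattern is mostly consistent. On Terminal-Bench 2.1 and OSWorld-Verified, Sonnet 5 lands within about 2 points of Opus 4.8 (2.3 and 2.2 points behind). SWE-bench Pro is the exception in this table, with Opus ahead by 6 points. On pure reasoning with nothing to lean on, which the table does not cover, Opus leads by roughly 6 points. Sonnet 5 is stronger on agentic and tool tasks than on pure reasoning.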

Against its predecessor, Sonnet 5 improves clearly: SWE-bench Pro rises from 58.1% to 63.2%, and OSWorld-Verified climbs from 78.5% to 81.2%.
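If you want to check the point gaps yourself, the arithmetic is short. The snippet below copies the scores from the table and computes each difference. The dictionary layout and the point_gap helper are only illustrative.

```python
# Scores copied from the benchmark table (percent). None means "not reported".
SCORES = {
    "SWE-bench Pro": {"sonnet_5": 63.2, "opus_4_8": 69.2, "sonnet_4_6": 58.1},
    "Terminal-Bench 2.1": {"sonnet_5": 80.4, "opus_4_8": 82.7, "sonnet_4_6": None},
    "OSWorld-Verified": {"sonnet_5": 81.2, "opus_4_8": 83.4, "sonnet_4_6": 78.5},
}


def point_gap(higher, lower):
    """Difference in percentage points, or None if either score is missing."""
    if higher is None or lower is None:
        return None
    return round(higher - lower, 1)  # one decimal, matching the table


for name, s in SCORES.items():
    opus_lead = point_gap(s["opus_4_8"], s["sonnet_5"])
    gain = point_gap(s["sonnet_5"], s["sonnet_4_6"])
    print(f"{name}: Opus 4.8 lead {opus_lead}, gain over Sonnet 4.6 {gain}")
```

Run as written, it reports Opus 4.8 leads of 6.0, 2.3, and 2.2 points, and Sonnet 5 gains over Sonnet 4.6 of 5.1 and 2.7 points where both scores are reported.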

The full breakdown, including what these benchmarks miss, lives in the Sonnet 5 benchmarks deep dive. You can also check Anthropic’s transparency hub for the underlying figures.

Pricing

Sonnet 5 keeps the same per-token rate as Sonnet 4.6, and it launched with an introductory discount.

Pricing	Input (per M tokens)	Output (per M tokens)
Introductory (through Aug 31, 2026)	$2	$10
Standard (from Sep 1, 2026)	$3	$15

The introductory rate of $2 per million input and $10 per million output runs through August 31, 2026. After that, it moves to the standard $3 per million input and $15 per million output, which matches Sonnet 4.6.

There is one catch worth planning around. Because the new tokenizer produces about 30% more tokens for the same text, the cost of an equivalent request can be higher than on Sonnet 4.6 even though the per-token rate is identical. Do not assume flat parity. Model your real workloads with token counting before you commit a budget.
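A quick calculation makes the catch concrete. The sketch below uses the rates from the pricing table and assumes, for illustration only, a monthly workload that measured 10M input and 2M output tokens on Sonnet 4.6, with both sides growing by the same 30%. It ignores thinking tokens, batch, and caching, and the plan labels are invented for this example.

```python
# USD per million tokens as (input, output), taken from the pricing table.
RATES = {
    "sonnet-4.6": (3, 15),
    "sonnet-5-intro": (2, 10),
    "sonnet-5-standard": (3, 15),
}


def cost_usd(input_tokens: int, output_tokens: int, plan: str) -> float:
    """Cost of a workload at the given plan's per-million-token rates."""
    in_rate, out_rate = RATES[plan]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


def inflate(tokens: int, extra_percent: int = 30) -> int:
    """Assumed Sonnet 5 token count for text measured on Sonnet 4.6."""
    return tokens * (100 + extra_percent) // 100


old_in, old_out = 10_000_000, 2_000_000          # hypothetical 4.6 workload
new_in, new_out = inflate(old_in), inflate(old_out)

print(cost_usd(old_in, old_out, "sonnet-4.6"))         # 60.0
print(cost_usd(new_in, new_out, "sonnet-5-standard"))  # 78.0
print(cost_usd(new_in, new_out, "sonnet-5-intro"))     # 52.0
```

Under these assumptions, the same text costs 30% more at the standard rate than it did on Sonnet 4.6, while the introductory rate still comes in lower. Replace the 30% figure with a measured ratio for your own prompts before budgeting.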

For context, Opus 4.8 costs $5 per million input and $25 per million output, and Fable 5 costs $10 per million input and $50 per million output. Sonnet 5 sits well below both. For batch and prompt-caching rates, check Anthropic’s pricing page rather than any number quoted secondhand. The full pricing breakdown walks through a worked example.

Availability

Sonnet 5 is available across Anthropic’s own products and the major cloud platforms:

Claude API: available to all customers.
Claude apps: the default model for Free and Pro, and also available to Max, Team, and Enterprise.
Claude Code: available.
AWS: available through Claude in Amazon Bedrock and the Claude Platform on AWS. It is not available on the legacy Bedrock InvokeModel or Converse path.
Google Cloud: available on Vertex AI.
Microsoft Foundry: in preview.

Since Sonnet 5 is the default on the free Claude plan, most people can try it without paying anything. The free access guide covers the honest free paths and their limits.

Safety summary

Anthropic’s system card reports a lower overall rate of undesirable behaviors than Sonnet 4.6, and the model is safer in agentic contexts. It shows lower hallucination and sycophancy than Sonnet 4.6, and it is better at refusing malicious requests and resisting prompt injection.

Sonnet 5 is also the first Sonnet-tier model with real-time cybersecurity safeguards. Requests that touch prohibited or high-risk cyber topics may be refused. A refusal comes back as a successful HTTP 200 with stop_reason: "refusal", not an error, so handle that stop reason in your code.
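Because a refusal arrives as a normal 200 response, checking the HTTP status alone will not catch it. Here is a minimal sketch of handling it, assuming you have already parsed the response JSON into a dict with the usual stop_reason and content fields. ModelRefusal and text_or_raise are hypothetical names.

```python
class ModelRefusal(Exception):
    """Raised when the model declined the request (stop_reason == "refusal")."""


def text_or_raise(response: dict) -> str:
    """Return the concatenated text blocks, or raise if the model refused."""
    if response.get("stop_reason") == "refusal":
        raise ModelRefusal("model refused the request")
    # Keep only text blocks; other block types (such as thinking) are skipped.
    return "".join(
        block.get("text", "")
        for block in response.get("content", [])
        if block.get("type") == "text"
    )
```

Raising a dedicated exception keeps refusals out of the normal success path, so callers have to decide explicitly whether to retry, rephrase, or surface the refusal to the user.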

To be fair about the tradeoffs: Sonnet 5 shows higher misaligned-behavior rates than Opus 4.8 and Mythos Preview on Anthropic’s automated behavioral audit, and it has lower cyber-capability than the Opus models. Neither Sonnet model could develop a working exploit, scoring 0.0% on that measure.

Test the Sonnet 5 API with Apidog

When you call Sonnet 5, you are hitting an HTTP API with auth headers, JSON request and response bodies, rate limits, and errors. That is exactly the kind of thing Apidog is built to handle. Apidog is an all-in-one API development and testing platform, so you can send Sonnet 5 requests, save them as a reusable collection, and manage your keys per environment.

A practical setup looks like this:

Create a request to the Anthropic Messages endpoint and store your API key as an environment variable, not in the request body.
Save the request in a collection so your team can reuse it.
Add an assertion to check the response shape, for example that stop_reason is present so a refusal result does not slip through unnoticed.
Mock the endpoint when you want to build against a stable response without spending tokens.

Here is the Messages API shape you would send:

curl https://api.anthropic.com/v1/messages \
  --header "x-api-key: $ANTHROPIC_API_KEY" \
  --header "anthropic-version: 2023-06-01" \
  --header "content-type: application/json" \
  --data '{
    "model": "claude-sonnet-5",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Summarize this changelog entry in one sentence."}
    ]
  }'

Because adaptive thinking is on by default, that request runs with thinking unless you add thinking: {type: "disabled"}. Remember that max_tokens caps thinking plus response text together, so give it enough room. Once your request works, save it and add a test so you catch regressions when you swap models later. If you want to follow along, download Apidog and import the request. The full API walkthrough has the complete flow, including the Python SDK version.
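For comparison, here is roughly the same request built in Python with only the standard library. It is a sketch rather than the SDK version: build_request is a hypothetical helper, the disable_thinking flag is just a convenience for adding the thinking field described above, and the function builds the request without sending it so you can inspect it first.

```python
import json
import urllib.request

API_URL = "https://api.anthropic.com/v1/messages"


def build_request(api_key: str, prompt: str, max_tokens: int = 1024,
                  disable_thinking: bool = False) -> urllib.request.Request:
    """Build (but do not send) a Messages API request for claude-sonnet-5."""
    body = {
        "model": "claude-sonnet-5",
        "max_tokens": max_tokens,  # caps thinking plus response text together
        "messages": [{"role": "user", "content": prompt}],
    }
    if disable_thinking:
        # Adaptive thinking is on by default; this turns it off.
        body["thinking"] = {"type": "disabled"}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
        method="POST",
    )

# To send it, pass the result to urllib.request.urlopen(...) and parse the JSON body.
```

Read the API key from an environment variable rather than hard-coding it, the same way the curl example uses $ANTHROPIC_API_KEY.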

Who Sonnet 5 is for

Sonnet 5 is a good default in a lot of situations:

You build agents or tool-heavy workflows. This is where Sonnet 5 shines and stays close to Opus 4.8.
You run high volume and care about cost. The price gap versus Opus is large, and the introductory rate makes it larger through August.
You want a drop-in upgrade from Sonnet 4.6. Swap the model ID, review the three behavior changes, and re-measure your tokens.
You code in Claude Code or an editor. Sonnet 5 is a strong agentic coding default. See how to use it in Claude Code.

Reach for Opus 4.8 instead when you need the hardest pure reasoning, long-horizon autonomy, or the absolute highest quality and the extra cost is worth it. The Sonnet 5 vs Opus 4.8 comparison lays out that decision in detail. For background on the Opus tier itself, see what Claude Opus 4.8 is.

FAQ

Is Claude Sonnet 5 better than Opus 4.8? It depends on the task. On agentic and tool-use benchmarks, Sonnet 5 lands within about 1 to 3 points of Opus 4.8 at a much lower price. On pure reasoning, Opus 4.8 leads by roughly 6 points. Pick Sonnet 5 for agents and high volume, and Opus 4.8 for the hardest reasoning. The head-to-head comparison breaks it down.

What is the model ID for Claude Sonnet 5? The API model ID is claude-sonnet-5, with no date suffix. Set that string as the model value in your request.

How much does Claude Sonnet 5 cost? The introductory rate is $2 per million input tokens and $10 per million output tokens through August 31, 2026. After that it moves to the standard $3 per million input and $15 per million output. Note that the new tokenizer produces about 30% more tokens for the same text, so an equivalent request can cost more even at the same per-token rate.

Can I use Claude Sonnet 5 for free? Yes. Sonnet 5 is the default model on the free Claude plan at claude.ai and in the Claude Code free tier, subject to usage limits. See the free access guide for the honest paths and their caps.

Do I need to change my code to upgrade from Sonnet 4.6? Mostly you change the model ID. Then review three things: adaptive thinking is now on by default so revisit max_tokens, the budget_tokens extended-thinking field now returns a 400, and non-default sampling parameters now return a 400. Re-measure your token counts because of the new tokenizer.
