Anthropic launches Claude Sonnet 5 as a cheaper way to run agents

As shipping agentic capabilities becomes table stakes among foundation model companies, Anthropic is releasing Claude Sonnet 5, a more powerful and agentic version of the lab’s midsize model.

“It can make plans, use tools like browsers and terminals, and run autonomously at a level that, just a few months ago, required larger and more expensive models,” Anthropic said in a blog post.

That framing mirrors what OpenAI and Google have said about their own recent releases. OpenAI’s GPT-5.6 Sol was launched in preview last week, and it is also the firm’s most agentic model yet, allowing users to split work across subagents for longer autonomous tasks. Google’s Gemini 3.5 Flash, which launched in May, was pitched as a shift from a conversational chatbot to an agentic tool that plans, builds, and iterates on real work with minimal human input.

Sonnet 5’s pitch confirms that agentic capability is now the baseline expectation at every price tier. The differentiator will no longer be who can do agentic work best, but who can do it most cheaply and most reliably without human oversight.

Sonnet 5 promises performance close to that of Opus 4.8, but at a much lower cost. Starting Tuesday, Claude Sonnet 5 will be the default model for free and Pro plans and will be available on every subscription.

At launch, Sonnet 5 is priced at $2 per million input tokens and $10 per million output tokens through August 31, after which the price will jump to $3 per million input tokens and $15 per million output tokens. That makes Sonnet 5 cheaper than Opus 4.8, as well as OpenAI’s GPT-5.5 and Google’s Gemini 3.1 Pro. (It’s still more expensive than Gemini 3.5 Flash.)
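To make the pricing change concrete, here is a minimal sketch of the arithmetic, using only the per-million-token rates quoted above. The `Pricing` class, the `estimate_cost` helper, and the sample workload of 50 million input tokens and 10 million output tokens are illustrative assumptions, not part of any Anthropic SDK or published usage figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Price in US dollars per one million tokens (hypothetical helper type)."""
    input_per_million: float
    output_per_million: float


# Rates as reported in the article.
LAUNCH = Pricing(input_per_million=2.0, output_per_million=10.0)     # through August 31
STANDARD = Pricing(input_per_million=3.0, output_per_million=15.0)   # after August 31


def estimate_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated dollar cost of a workload at the given rates."""
    input_cost = (input_tokens / 1_000_000) * pricing.input_per_million
    output_cost = (output_tokens / 1_000_000) * pricing.output_per_million
    return input_cost + output_cost


if __name__ == "__main__":
    # Assumed monthly workload, chosen only for illustration.
    input_tokens = 50_000_000
    output_tokens = 10_000_000

    launch_cost = estimate_cost(LAUNCH, input_tokens, output_tokens)
    standard_cost = estimate_cost(STANDARD, input_tokens, output_tokens)

    print(f"Launch pricing:   ${launch_cost:,.2f}")
    print(f"Standard pricing: ${standard_cost:,.2f}")
    print(f"Increase:         {(standard_cost / launch_cost - 1):.0%}")
```

Because both rates rise by the same proportion, any mix of input and output tokens would cost 50% more once the launch pricing ends.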

The new model also shows significant improvements over its predecessor, Sonnet 4.6, released in February, on agentic tasks such as reasoning, tool use, software coding, and knowledge work, according to Anthropic.

For example, on one agentic coding benchmark, Sonnet 5 scores 63.2%, compared to Opus 4.8’s 69.2% and Sonnet 4.6’s 58.1%. On a knowledge work benchmark, Sonnet 5 slightly outperforms Opus 4.8, a model known for handling the hardest problems, such as subtle judgment calls and deep research.

“Opus 4.8 is still the model of choice for higher accuracy on these tasks, but Sonnet 5 provides developers with lower-priced options that are of much higher quality than what was previously available,” Anthropic says. “Between Sonnet 5 and Opus 4.8, users can adjust the effort level to find the right balance of cost and performance.”

According to testers cited in the blog post, Sonnet 5 also excels at finishing complex tasks where previous model versions would have stopped short and “checks its own output without explicitly being asked.”

“We handed Claude Sonnet 5 a two-part job — update Salesforce account tiers, send a launch announcement to enterprise contacts — and it finished end to end,” Daniel Shepard, a senior engineer at Zapier, said in a statement. “That used to stall halfway. For day-to-day automation, it’s a no-brainer.”

On safety, Sonnet 5 also demonstrates a lower rate of “undesirable behaviors” like cooperation with misuse and deception than its predecessor, making it safer to use in agentic contexts. It’s better at refusing malicious requests and sidestepping hijack attempts in prompt-injection attacks. It also hallucinates and engages in sycophantic behavior at a lower rate than Sonnet 4.6.

That said, it’s not on the same level as Opus 4.8 and Claude Mythos Preview when it comes to misaligned behavior. “Evaluations also show that it has a much lower ability to perform dangerous cybersecurity tasks than our current Opus models,” reads the blog post.

Lovable co-founder Fabian Hedin said in a statement that Claude Sonnet 5 “refuses unsafe requests cleanly and consistently.”

“At Lovable, we’re putting powerful tools in the hands of millions of builders,” Hedin said. “A model that knows when to say no is just as important as one that knows how to build.”

Updated to correct that the price of output tokens is $15 per million output tokens after August 31.


Rebecca Bellan is a senior reporter at TechCrunch where she covers the business, policy, and emerging trends shaping artificial intelligence. Her work has also appeared in Forbes, Bloomberg, The Atlantic, The Daily Beast, and other publications.

You can contact or verify outreach from Rebecca by emailing rebecca.bellan@techcrunch.com or via encrypted message at rebeccabellan.491 on Signal.


