Anthropic’s safety warnings may have just backfired — the government has pulled the plug on its most powerful AI

Published by AIDaily Newsroom
5 min read
Author at original source: Connie Loizos

Anthropic isn't hiding its frustration. "We disagree that the finding of a narrow potential jailbreak should be cause for recalling a commercial model deployed to hundreds of millions of people," the company wrote in a blog post.

The U.S. government on Friday ordered Anthropic to immediately shut off access to two of its most powerful AI models, Claude Fable 5 and Claude Mythos 5, citing national security concerns. Anthropic announced on X that it has complied, but it made clear it thinks the government got this one wrong.

The directive, which Anthropic said it received on Friday at 5:21 pm ET, forces the company to disable both models for all users worldwide — not just the foreign nationals the government’s export control order was nominally aimed at. Access to Anthropic’s other models isn’t affected.

Why does any of this matter? Mythos is Anthropic’s most capable AI model, one the company previewed in early April and has kept tightly restricted ever since because of what Anthropic described as its exceptional ability to find security vulnerabilities in software. According to Anthropic, Mythos identified flaws in every major operating system and web browser it tested, so rather than release it broadly, the company launched a controlled program called Project Glasswing, sharing it with roughly 50 vetted organizations, including Amazon, Apple, Google, Microsoft, and CrowdStrike, to use for defensive cybersecurity work.

Fable 5, released just three days ago, was Anthropic’s answer to the obvious commercial pressure: a version of Mythos fitted with guardrails that block responses in high-risk areas like cybersecurity and biology, making it safe enough for general release, the company argued. It was immediately the most capable AI model available to the public, according to benchmark tests from Vals AI, a company that tracks AI tech performance.

The government’s directive is framed as an export control action, restricting foreign national access to the models. But in a lengthy blog post, Anthropic says its understanding is that the underlying concern is a claimed jailbreak of Fable 5. So far, the company says, the government has provided only verbal evidence of a “potential narrow, non-universal jailbreak,” one that, as Anthropic describes it, amounts to prompting the model to read a specific codebase and identify software flaws. And by the way, adds the company, it’s a “level of capability” that’s already widely available in other publicly accessible models, including OpenAI’s GPT-5.5. It’s also used routinely by cybersecurity professionals for defensive purposes, says Anthropic.

Anthropic’s broader argument is that its strongest safeguards operate through independent classifier systems that function separately from the model itself, meaning that even if someone convinces Fable to keep talking past a refusal, the underlying protections against the most dangerous outputs remain in place.

Clearly, none of that was enough to stop the government from acting, and Anthropic isn’t hiding its frustration. “We disagree that the finding of a narrow potential jailbreak should be cause for recalling a commercial model deployed to hundreds of millions of people,” the company wrote. “If this standard was applied across the industry, we believe it would essentially halt all new model deployments for all frontier model providers.”

Anthropic is widely expected to pursue an IPO this year and has staked much of its public identity on being the safety-conscious alternative to its rivals. The irony isn’t lost on observers that the very caution Anthropic displayed in restricting Mythos — which it promoted as a model so dangerous it couldn’t be released publicly — has now apparently attracted exactly the kind of government scrutiny that could disrupt its business most.

OpenAI’s Sam Altman must be enjoying this, at least. In April, he told podcaster Ashlee Vance that Anthropic’s handling of Mythos amounted to “fear-based marketing.” “It is clearly incredible marketing to say, ‘We have built a bomb. We were about to drop it on your head. We will sell you a bomb shelter for $100 million,’” Altman said. Altman, whose company is also widely expected to pursue an IPO as soon as possible, didn’t predict a government shutdown, but he identified something that has come back to bite Anthropic for now: when you spend months telling the world your AI is uniquely dangerous, the world, the U.S. government included, tends to listen.

Key points

  • The suspension of Anthropic’s AI models highlights the need for balance between innovation and security in regulating emerging technologies.
  • Brazil can draw on Anthropic’s situation to develop a regulatory framework that promotes innovation without compromising national security.
  • Anthropic’s response suggests that government concerns may rest on outdated perceptions of AI capabilities.

Editorial analysis

The U.S. government’s decision to suspend access to Anthropic’s Claude Fable 5 and Claude Mythos 5 models raises crucial questions about the regulation of artificial intelligence, especially at a time when Brazil is working to establish its own technology guidelines. The measure reflects growing national security concern about AI models with advanced vulnerability-finding capabilities, which may have direct implications for Brazilian companies that use or develop similar technologies. Brazil, which seeks to foster an environment of innovation, may find itself pressured to adopt stricter regulations to ensure cybersecurity and data protection, which could slow the pace of AI development in the country.

In addition, Anthropic’s response suggests a mismatch between the government’s concerns and the reality of how AI is used in the market. The claim that the alleged jailbreak is narrow and non-universal, and that the capability it exposes is already available in other widely used AI models such as OpenAI’s GPT-5.5, indicates that regulation may be relying on outdated or exaggerated perceptions. For the Brazilian technology ecosystem, this can serve as a warning about the need for closer dialogue among developers, regulators, and the cybersecurity community, so that policies are based on evidence rather than assumptions.

What to watch next is how Anthropic and other AI companies respond to this situation. A possible appeal or a reassessment of the government’s decision could open room for discussion about transparency in AI regulation. For Brazil, following these developments can provide valuable insight into how to build a regulatory framework that not only protects national security but also promotes innovation and competitiveness in the technology sector. The situation also underscores the importance of a clear understanding of the capabilities and limitations of AI models, in order to avoid hasty decisions that could stifle technological progress.

Finally, Anthropic’s situation can be seen as a reflection of the tension between innovation and security. As AI technologies continue to evolve, it will be essential for both regulators and companies to find a balance that allows technological advancement without compromising public safety. This is a lesson Brazil can take into account as it develops its own AI policies, so that it does not miss the opportunity to become a technology leader in Latin America.

Original source:

TechCrunch AI

About this article

This article was curated and published by AIDaily as part of our editorial coverage of developments in artificial intelligence. The content is based on the cited original source, enriched with context and editorial analysis. Automated tools may assist with translation and initial structuring, but the decision to publish, the fact-checking, and the contextual framing remain an editorial responsibility.