AI Startups

Patronus AI lands $50M to build ‘digital worlds’ that stress-test AI agents

Published by AIDaily Editorial Team
Original source author: Marina Temkin

Agent-testing startup Patronus AI, founded by former Meta AI researchers, is experiencing nearly insatiable demand, its investor says.


AI agents are becoming more sophisticated. They are evolving from answering questions to autonomously executing complex, multi-step tasks.

But before these agents can be trusted to book trips or conduct financial analysis on behalf of users, model providers and the startups building such agents want to ensure that they perform reliably across a vast range of scenarios.

AI labs often use benchmarks to show off their models’ prowess, but a high score, even on an agent-oriented benchmark, doesn’t actually prove that an AI can accomplish various complex, real-world jobs correctly.

Patronus AI, a startup founded in 2023 by former Meta AI researchers Anand Kannappan and Rebecca Qian, is helping model makers and companies fine-tune models to do just that by building simulated digital environments in which to evaluate the agents’ performance.

The San Francisco-based startup must be solving an important problem. Virtually every frontier AI lab and many emerging startups are now customers, according to Glenn Solomon, a managing director at Notable Capital, who describes demand for the company’s simulated environments as nearly insatiable.

Patronus’ revenue has grown 15-fold over the past year, fueling significant investor interest. On Thursday, the company announced a $50 million Series B round led by Greenfield Partners, with participation from Notable Capital, Lightspeed, Datadog, and Samsung. The round brings the company’s total funding to $70 million.

Patronus uses what it calls “digital world models” to create replicas of websites and internal systems. In these environments, agents are stress-tested after training using reinforcement learning, which iteratively rewards successful task completion and penalizes errors.
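To make the reward-and-penalty idea concrete, here is a minimal sketch of a simulated environment that scores an agent's actions. It is an illustration only, not Patronus's implementation: the `MockTicketSite` class, the `run_episode` function, and the reward values are all hypothetical names and numbers chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class MockTicketSite:
    """Hypothetical replica of an internal ticketing system (assumption, not a real API)."""
    tickets: dict = field(default_factory=lambda: {"T-1": "open", "T-2": "open"})

    def close_ticket(self, ticket_id: str) -> bool:
        # Closing an unknown or already-closed ticket counts as an error.
        if self.tickets.get(ticket_id) != "open":
            return False
        self.tickets[ticket_id] = "closed"
        return True


def run_episode(actions, success_reward=1.0, error_penalty=-0.5):
    """Replay the ticket ids an agent tries to close and score the episode.

    Each failed action adds error_penalty; success_reward is added only if
    every ticket in the simulated site ends up closed.
    """
    env = MockTicketSite()
    total = 0.0
    for ticket_id in actions:
        if not env.close_ticket(ticket_id):
            total += error_penalty
    if all(state == "closed" for state in env.tickets.values()):
        total += success_reward
    return total, env.tickets
```

In this sketch, an agent that closes both tickets cleanly earns the full reward, while one that acts on a nonexistent ticket is penalized and leaves the task unfinished. A real training loop would feed scores like these back into the model's updates.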

AI labs see great value in these digital simulations because they give agents a chance to try different, sometimes unpredictable, scenarios. The company compares its approach to how Waymo trained autonomous cars by first building synthetic worlds to test vehicles against rare hazards, such as severe weather or a child running after a ball.

The difference with AI agents is that they tend to take shortcuts, which means they fail to complete the task correctly. “Patronus is really good at spotting the hacks and making sure they are holding the models accountable,” Solomon said.
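One common way to catch such shortcuts, sketched below under stated assumptions, is to check the final state of the simulated system instead of trusting the agent's own report that it finished. The `verify_transfer` function and the account names are hypothetical and are not drawn from Patronus's product.

```python
def verify_transfer(initial, final, agent_claims_done, src, dst, amount):
    """Compare the environment's final balances with the expected outcome.

    Returns "verified" if the balances match the requested transfer,
    "shortcut_or_error" if the agent claims success but the state disagrees,
    and "incomplete" otherwise.
    """
    expected = dict(initial)
    expected[src] -= amount
    expected[dst] += amount
    state_ok = final == expected
    if state_ok:
        return "verified"
    if agent_claims_done:
        return "shortcut_or_error"
    return "incomplete"


# Hypothetical starting balances in a simulated finance environment.
start = {"checking": 100, "savings": 0}

# The agent reports success but never moved the money.
print(verify_transfer(start, {"checking": 100, "savings": 0}, True, "checking", "savings", 40))
```

The design choice here is that the verdict depends only on observable state, so an agent cannot earn credit by declaring a task done.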

Patronus is currently providing its simulated digital worlds for software engineering and finance, but these are just the start, according to Kannappan.

“Today we’re very focused on the problems that are verifiable, so the problems that you can immediately check and verify, but there are a ton more areas that are very non-verifiable or very hard to verify,” he said.

Just because these processes are verifiable doesn’t mean they are simple. “We want to be able to actually create the environment in which you can operate an agent that can run for 10 hours or 10 days or 10 weeks,” Kannappan said.

As for rivals, Patronus believes it is primarily competing against the internal teams AI labs have already built to evaluate agent behavior. While human-data firms like Mercor and Surge help model makers with reinforcement learning, Patronus operates differently by evaluating how agents behave without any human involvement.


Marina Temkin is a venture capital and startups reporter at TechCrunch. Prior to joining TechCrunch, she wrote about VC for PitchBook and Venture Capital Journal. Earlier in her career, Marina was a financial analyst and earned a CFA charterholder designation.



Key takeaways

  • Patronus AI highlights the growing demand for testing environments for AI agents, reflecting a critical need in the development of reliable technologies.
  • Patronus's 15-fold revenue growth over the past year may inspire Brazilian startups to invest in similar solutions, especially in sectors like agribusiness and finance.
  • The approach of simulating digital worlds could become a trend in AI development, increasing the quality and safety of models.

Editorial analysis

The $50 million funding raised by Patronus AI highlights the growing demand for solutions that ensure the reliability of AI agents in complex scenarios. For the Brazilian tech sector, this represents an opportunity for learning and adaptation, especially considering that Brazil hosts an expanding AI startup ecosystem. The need for simulated testing environments, like those offered by Patronus, could inspire local initiatives to develop similar solutions focused on specific sectors of the Brazilian economy, such as agribusiness and finance.

Moreover, Patronus's approach, which utilizes digital world models to stress-test AI agents, may indicate a new trend in AI development: creating testing environments that simulate real-world complexity. This could lead to an increase in the quality of developed models, which in turn may accelerate AI adoption in critical sectors of the economy. Brazilian startups operating in areas such as healthcare and logistics could greatly benefit from such innovations.

Patronus's 15-fold revenue growth over the past year is a clear signal that the market recognizes the importance of ensuring the effectiveness and safety of AI agents. For Brazil, this could mean increased investment in research and development, as well as greater collaboration between universities and companies to create solutions that meet local demands. What to watch next is how Brazilian startups respond to this demand for testing environments and whether they can build their own versions of these solutions.

Finally, the participation of renowned investors, such as Samsung and Lightspeed, could open doors for international partnerships and collaborations that benefit the Brazilian ecosystem. The exchange of knowledge and technology between companies from different countries could accelerate the development of AI solutions that meet the specific needs of the Brazilian market, fostering a more robust and innovative environment for the tech sector in the country.


Original source:

TechCrunch AI

About this article

This article was curated and published by AIDaily as part of our editorial coverage of artificial intelligence developments. The content is based on the original source cited above, enriched with editorial context and analysis. Automated tools may assist with translation and initial structuring, but publication decisions, factual review, and contextual framing remain editorial responsibilities.
