Google DeepMind is worried about what happens when millions of agents start to interact

Google DeepMind is funding research into the potential dangers of situations where millions of different AI agents interact with each other online. According to Rohin Shah, who directs the company’s AGI safety and alignment research, the mass-market arrival of agents that can carry out tasks without human oversight and follow instructions given to them by other agents creates a whole new class of risk.

In an effort to address this, Google DeepMind, which made agent-based tools a centerpiece of Google I/O last month, has teamed up with several other organizations to announce a $10 million funding pot for researchers to study the behavior of multi-agent systems and come up with ways to prevent unsafe scenarios. Joining Google DeepMind are Schmidt Sciences, a philanthropic foundation set up by Eric and Wendy Schmidt; ARIA, the UK government’s moonshot agency; the Cooperative AI Foundation, a UK-based nonprofit research outfit; and Google’s charitable arm, Google.org.

I asked Shah and James Fox, who leads the Science of Trustworthy AI program at Schmidt Sciences, what they hope to achieve with that $10 million. It’s no small sum, but it’s dwarfed by the budgets commanded by Google DeepMind’s own research teams.

The aim is to kick-start research outside tech companies, says Shah: “The strength of academia is that it can look really quite far into the future and do the kind of work that isn’t top of mind at industry labs.”

“The main issue is that there just isn’t really a field of research for multi-agent safety yet,” he adds. “And we would like there to be.”

The concern is that as more and more AI agents get deployed and begin working together, we could hit a tipping point where imagined scenarios become real. “We see this with humanity, too,” says Shah. “Our institutions can accomplish things that no individual human can.”

Shah thinks we have a few more months to go before agents are deployed throughout the economy in numbers that make potential risks a real concern. He wants to get ahead of that moment.

Risky business

What risks are we talking about, exactly? The possibilities that Shah and Fox have in mind mostly boil down to supercharged versions of bad things that happen on the internet already: scams, prompt injections (where an AI agent is fed malicious instructions, turning it into a self-guiding piece of malware), other forms of cyberattack. We look at what humans do now and ask what the agent version of that would be, says Shah.

“We’ve got this digital commons that is integral to how society works, and you really want to ensure that this doesn’t descend into just absolute anarchy,” says Fox.

(I asked Shah if they were considering any worst-case scenarios more on the doomer end of the spectrum, such as widespread economic collapse. “Certainly not if we’re talking by the end of the year,” he said. That’s only six months away! He laughed. “Okay, a while after that.”)

Shah and Fox both think that the only way to understand what might happen when large numbers of multi-agent systems interact with each other is to run realistic simulations. They want researchers to drop AI agents into sandboxes and study what they do.

You can’t predict what’s going to happen by studying single agents, or even small groups of agents, in isolation. You can’t assume that AI agents underpinned by LLMs will always act rationally, says Fox. And the complexity comes from having huge numbers of interactions at once.

Some researchers, including a team at Google DeepMind, have argued that artificial general intelligence (if possible at all) could come not from a single super-smart model but from a kind of agent hive mind, where the capabilities of the whole add up to more than the sum of its parts.

Lack of trust

Google DeepMind is not the only top AI firm warning about the risks of the technology it is building. A couple of weeks ago, Anthropic published guidelines for deploying AI agents based on an approach to cybersecurity known as zero trust, which starts with the assumption that a computer system is vulnerable, an agent is an attacker, and a breach will happen.

Refael Angel, cofounder and CTO of Akeyless, a cybersecurity firm based in Tel Aviv, agrees that understanding the new risks introduced by agent-based systems is crucial. Every approach to security in the past has assumed that the machine in question was software written by a human, doing fixed things on fixed paths, says Angel: “An agent breaks all of those assumptions. It reasons, it improvises, and it can be hijacked by a single sentence buried in a document it was asked to read.”

Angel welcomes this new funding. “No single lab should author the safety standards everyone else has to trust,” he says. But he cautions that safety researchers can overlook boring problems that are already here in favor of more exotic hypothetical ones.

And yet, Fox notes, risks that were hypothetical a few years ago are now very real: “The future’s come more quickly than perhaps expected.”

