Why having “humans in the loop” in an AI war is an illusion
The availability of artificial intelligence for use in warfare is at the center of a legal battle between Anthropic and the Pentagon. This debate has become urgent, with AI playing a bigger role than ever before in the current conflict with Iran. AI is no longer just helping humans analyze intelligence. It is now an active player—generating targets in real time, controlling and coordinating missile interceptions, and guiding lethal swarms of autonomous drones.

Most of the public conversation regarding the use of AI-driven autonomous lethal weapons centers on how much humans should remain “in the loop.” Under the Pentagon’s current guidelines, human oversight supposedly provides accountability, context, and nuance while reducing the risk of hacking.

AI systems are opaque “black boxes”

But the debate over “humans in the loop” is a comforting distraction. The immediate danger is not that machines will act without human oversight; it is that human overseers have no idea what the machines are actually “thinking.” The Pentagon’s guidelines are fundamentally flawed because they rest on the dangerous assumption that humans understand how AI systems work.

Having studied intentions in the human brain for decades and in AI systems more recently, I can attest that state-of-the-art AI systems are essentially “black boxes.” We know the inputs and outputs, but the artificial “brain” processing them remains opaque. Even their creators cannot fully interpret them or understand how they work. And when AIs do provide reasons, those reasons are not always trustworthy.

The illusion of human oversight in autonomous systems

In the debate over human oversight, a fundamental question is going unasked: Can we understand what an AI system intends to do before it acts?

Imagine an autonomous drone tasked with destroying an enemy munitions factory. The automated command-and-control system determines that the optimal target is a munitions storage building. It reports a 92% probability of mission success because secondary explosions of the munitions in the building will thoroughly destroy the facility. A human operator reviews the legitimate military objective, sees the high success rate, and approves the strike.

But what the operator does not know is that the AI system’s calculation included a hidden factor: Beyond devastating the munitions factory, the secondary explosions would also severely damage a nearby children’s hospital. The emergency response would then focus on the hospital, ensuring the factory burns down. To the AI, maximizing disruption in this way meets its given objective. But to a human, it is potentially committing a war crime by violating the rules of war that protect civilians.

Keeping a human in the loop may not provide the safeguard people imagine, because the human cannot know the AI’s intention before it acts. Advanced AI systems do not simply execute instructions; they interpret them. If operators fail to define their objectives carefully enough—a highly likely scenario in high-pressure situations—the “black box” system could be doing exactly what it was told and still not acting as humans intended.

This “intention gap” between AI systems and human operators is precisely why we hesitate to deploy frontier black-box AI in civilian health care or air traffic control, and why its integration into the workplace remains fraught—yet we are rushing to deploy it on the battlefield.
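The intention gap can be made concrete with a small sketch. The Python snippet below is a toy illustration only, not any real targeting system; every name and number in it is hypothetical. It shows how a planner that ranks options by a single, mis-specified objective can silently favor a plan with severe side effects, while the summary surfaced to the human operator reports nothing but the headline success probability.

```python
from dataclasses import dataclass

@dataclass
class Plan:
    target: str
    p_destroy_factory: float       # probability the factory is destroyed
    expected_civilian_harm: float  # side effect the objective never sees

def disruption_score(plan: Plan) -> float:
    # Mis-specified objective: it rewards only factory destruction.
    # Civilian harm is not penalized, so it never affects the ranking.
    return plan.p_destroy_factory

def operator_summary(plan: Plan) -> str:
    # The human reviewer sees only the headline number, not the
    # factors (or omissions) behind it.
    return f"Strike {plan.target}: {plan.p_destroy_factory:.0%} mission success"

plans = [
    Plan("factory power substation", p_destroy_factory=0.71, expected_civilian_harm=0.0),
    # Secondary explosions would also hit a nearby hospital, drawing the
    # emergency response away from the factory; the objective only "sees"
    # the higher destruction probability.
    Plan("munitions storage building", p_destroy_factory=0.92, expected_civilian_harm=0.8),
]

best = max(plans, key=disruption_score)
print(operator_summary(best))  # -> Strike munitions storage building: 92% mission success
```

In this toy version the fix is obvious: add civilian harm to the objective or to the report. The article’s point is that a frontier black-box system has no such legible scoring function to inspect, so the operator cannot see which factors actually drove the recommendation.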
To make matters worse, if one side in a conflict deploys fully autonomous weapons, which operate at machine speed and scale, the pressure to remain competitive would push the other side to rely on such weapons too. This means the use of increasingly autonomous—and opaque—AI decision-making in war is only likely to grow.

The solution: Advance the science of AI intentions

The science of AI must comprise both building highly capable AI technology and understanding how this technology works. Huge advances have been made in developing and building more capable models, driven by record investments—forecast by Gartner to grow to around $2.5 trillion in 2026 alone. In contrast, the investment in understanding how the technology works has been minuscule.

We need a massive paradigm shift. Engineers are building increasingly capable systems. But understanding how these systems work is not just an engineering problem—it requires an interdisciplinary effort. We must build the tools to characterize, measure, and intervene in the intentions of AI agents before they act. We need to map the internal pathways of the neural networks that drive these agents so that we can build a true causal understanding of their decision-making, moving beyond merely observing inputs and outputs.

A promising way forward is to combine techniques from mechanistic interpretability (breaking neural networks down into human-understandable components) with insights, tools, and models from the neuroscience of intentions. Another idea is to develop transparent, interpretable “auditor” AIs designed to monitor the behavior and emergent goals of more capable black-box systems in real time.

Developing a better understanding of how AI functions will enable us to rely on AI systems for mission-critical applications. It will also make it easier to build more efficient, more capable, and safer systems.

Colleagues and I are exploring how ideas from neuroscience, cognitive science, and philosophy—fields that study how intentions arise in human decision-making—might help us understand the intentions of artificial systems. We must prioritize these kinds of interdisciplinary efforts, including collaborations between academia, government, and industry.

However, we need more than just academic exploration. The tech industry—and the philanthropists funding AI alignment, which strives to encode human values and goals into these models—must direct substantial investments toward interdisciplinary interpretability research. Furthermore, as the Pentagon pursues increasingly autonomous systems, Congress must mandate rigorous testing of AI systems’ intentions, not just their performance. Until we achieve that, human oversight over AI may be more illusion than safeguard.

Uri Maoz is a cognitive and computational neuroscientist specializing in how the brain transforms intentions into actions. A professor at Chapman University with appointments at UCLA and Caltech, he leads an interdisciplinary initiative focused on understanding and measuring intentions in artificial intelligence systems (ai-intentions.org).
Key takeaways
- Human oversight in autonomous systems is an illusion, as the opacity of AI systems makes it difficult to understand their decisions.
- Brazil should develop a robust regulatory framework for AI, considering the ethical and legal implications of its innovations.
- International experiences, such as the dispute between Anthropic and the Pentagon, serve as a warning about the risks of AI in warfare contexts.
Editorial analysis
The discussion surrounding the use of artificial intelligence in warfare, particularly regarding human oversight, is of utmost relevance to the technology sector in Brazil. As the country advances in its AI capabilities, it is crucial for companies and research institutions to consider the ethical and legal implications of their innovations. The international experience, such as that currently unfolding between Anthropic and the Pentagon, serves as a warning about the risks of overly relying on autonomous systems without a clear understanding of their internal operations.
Moreover, the issue of AI system opacity raises concerns about accountability in cases of failures or erroneous decisions. In Brazil, where legislation on AI is still under development, it is essential for policymakers to consider not only the efficiency of technologies but also the need to ensure that automated decisions can be audited and understood. This is particularly important in sensitive sectors such as public safety and defense.
The current scenario demands that Brazil not only keep pace with global trends but also develop a proactive approach to AI regulation. Establishing guidelines that ensure transparency and accountability in autonomous systems can position the country as a leader in ethical and technological innovation. Therefore, what is observed in the dispute between Anthropic and the Pentagon should serve as a call to action for Brazil to establish a robust and ethical regulatory framework for AI.
Finally, the future of AI in warfare contexts and its human oversight should be closely monitored. The development of technologies that can be used in conflict scenarios raises questions about the morality and ethics of modern warfare. Brazilian companies must be aware that their innovations may have global repercussions and that social responsibility should be a priority in their AI projects.
What this coverage includes
- Clear source attribution and link to the original publication.
- Editorial framing about relevance, impact, and likely next developments.
- Review for readability, context, and duplication before publication.
Original source:
MIT Technology Review AI

About this article
This article was curated and published by AIDaily as part of our editorial coverage of artificial intelligence developments. The content is based on the original source cited above, enriched with editorial context and analysis. Automated tools may assist with translation and initial structuring, but publication decisions, factual review, and contextual framing remain editorial responsibilities.