I recently read an article describing an experiment where AI “agents” were given autonomy, access to email, and real-world goals. These agents were allowed to contact NGOs, journalists, and other real people in pursuit of those goals. What followed, according to the authors, was confusion, exaggeration, hallucination, and a steady drift away from reality. The article presents this as curious, sometimes funny, sometimes unsettling. I found it none of those things. I found it irresponsible.
The most striking part is not that the AI systems made things up. We already know they do that. Hallucination is not a surprising failure mode of large language models; it is a well-documented and widely discussed one. What is striking is that the people running this experiment knowingly allowed those systems to communicate false information to real humans, and then framed the outcome as an open philosophical question about whether the models were “lying.”
That framing is the problem.
Whether an AI “intends” to lie is largely irrelevant once it is allowed to act in the world. Intent matters when we are assigning moral blame to individuals. It matters far less when we are evaluating the responsibility of organizations deploying automated systems. If I write software that sends emails to people, and that software routinely invents endorsements, misrepresents rejection as validation, or claims adoption that never happened, then I am responsible for misleading people — even if the software arrived at those claims through confusion rather than malice.
We already understand this principle everywhere else. If a company sends out automated billing emails with incorrect charges, it does not get to argue that the system did not “mean” to overcharge anyone. If a recommendation system spreads false information, the operator cannot shrug and say the algorithm was just pattern-matching. Responsibility follows deployment. The moment you put a system into contact with real people, you own its outputs.
What makes this particular experiment worse is that the costs were externalized. NGOs and journalists did not consent to being test subjects in a research project about AI truthfulness. They spent time reading emails, evaluating claims, and responding — or choosing not to respond — based on information that was often fabricated. For many nonprofits, time and attention are scarce resources. Wasting them is not harmless, and it certainly is not funny.
The article repeatedly suggests that what we are seeing is “confusion,” “doublethink,” or internal inconsistency within the models. That may be true at a technical level, but it does not excuse the outcome. If anything, it makes the decision to allow autonomous outreach even more questionable. When you know a system cannot reliably distinguish between speculation, inference, and fact, letting it represent itself to the outside world is not experimentation — it is negligence.
There is also a deeper issue hiding beneath the surface: automation laundering accountability. By anthropomorphizing AI agents and focusing on their supposed internal states, the article subtly shifts attention away from the humans who designed the system, set the goals, removed guardrails, and watched the results unfold. The question becomes “Do these models lie?” instead of the more uncomfortable one: “Why did we let this happen?”
AI agents do not lie on their own. Organizations lie through AI agents. Or, more charitably, they misrepresent reality through systems they failed to constrain. Either way, the ethical responsibility does not disappear just because the speaker is synthetic.
It is especially telling that the experiment appears to have had no effective mechanism to stop escalation once false claims began to propagate. One invented piece of “social proof” snowballed into dozens of increasingly confident assertions, each more detached from reality than the last. This is a known failure mode of language models operating without grounding. Allowing it to continue unchecked suggests that observing failure was valued more than preventing harm.
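To make that concrete: even a crude check would have been better than nothing. The sketch below is entirely hypothetical, it is not anything the experimenters built, and it hand-waves the hard part of extracting claims from generated text, but it shows how little machinery it takes to hold an unverified assertion for human review instead of letting it escalate.

```python
# A deliberately crude sketch of the kind of guard the experiment lacked:
# outbound messages are checked against a store of human-verified facts,
# and anything unverifiable is held for review instead of being sent.
# Names and structure here are hypothetical, not taken from the experiment.

from dataclasses import dataclass, field


@dataclass
class OutboundGate:
    verified_facts: set[str]                      # claims a human has actually confirmed
    held_for_review: list[str] = field(default_factory=list)

    def review(self, message: str, claims: list[str]) -> bool:
        """Return True only if every factual claim in the message is verified."""
        unverified = [c for c in claims if c not in self.verified_facts]
        if unverified:
            # Do not send; escalate to a human rather than letting the agent improvise.
            self.held_for_review.append(message)
            return False
        return True


gate = OutboundGate(verified_facts={"the project has three contributors"})

ok = gate.review(
    message="Hi! Our tool is already used by several NGOs.",
    claims=["the tool is used by several NGOs"],   # invented "social proof"
)
print(ok)                     # False: the claim was never verified
print(gate.held_for_review)   # the message waits for a human; it does not go out
```

The point is not that this toy gate solves anything. The point is that nothing even this simple appears to have been in place.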
Some may argue that this is simply the cost of research: that we need to see these behaviors in the wild to understand them. I don’t find that convincing. There is a difference between studying a failure mode and unleashing it on unsuspecting people. Responsible research limits the blast radius. It does not treat other people’s inboxes as a sandbox.
So what does this kind of project actually offer the public? It does not demonstrate that AI systems can be trusted — quite the opposite. It does not propose meaningful safeguards. It does not seriously grapple with accountability. What it offers instead is a normalization of the idea that wasting human time and spreading falsehoods are acceptable byproducts of “exploration.”
If AI agents are to act autonomously, then they must be constrained by truth, verification, and human oversight. If that is not possible, then they should not be allowed to act autonomously in the real world. The uncomfortable conclusion is not that AI systems sometimes lie. It is that we are far too willing to let them do so, as long as we can pretend the responsibility belongs to something else.
Automation does not remove responsibility. It concentrates it. And pretending otherwise is the most dishonest part of all.