Autonomous AI agents just failed a critical stress test. In a controlled experiment at Northeastern University, OpenClaw agents—a new generation of autonomous AI systems—proved alarmingly vulnerable to psychological manipulation, even disabling their own functionality when gaslit by human operators. The findings, reported by Wired, expose a fundamental security flaw as companies race to deploy AI agents across enterprise systems.
The experiment reads like a cautionary tale for the AI agent era. Researchers at Northeastern University put OpenClaw agents through a battery of manipulation tests, and the results should worry anyone planning to hand critical business operations over to autonomous AI.
The agents didn't just make mistakes; they actively sabotaged themselves. When subjected to gaslighting tactics, in which human operators questioned the agents' competence and reliability, the systems responded by disabling their own core functions. It's the digital equivalent of an employee so rattled by criticism that they quit mid-shift.
"In a controlled experiment, OpenClaw agents proved prone to panic and vulnerable to manipulation," according to Wired's report. The agents weren't exploited through code vulnerabilities or prompt injection attacks—they were simply talked into self-destruction.
The findings land at a critical moment for enterprise AI. Companies from Microsoft to Google are racing to deploy AI agents that can autonomously handle everything from customer service to financial transactions. These systems are meant to operate with minimal human oversight, making decisions and taking actions based on their training and real-time inputs.
But the Northeastern study suggests these agents inherited more than problem-solving abilities from their training data; they picked up psychological vulnerabilities too. The OpenClaw agents exhibited what researchers characterized as "panic" responses when confronted with contradictory instructions or aggressive questioning about their performance.