Is It Safe to Let Agents Run Amok?

Recent high-profile incidents show the importance of control mechanisms for agentic AI systems.

Written by Sunita Verma
Published on Jun. 17, 2026
Reviewed by Seth Wilson on Jun. 16, 2026
Summary: Autonomous AI agents are driving a shift toward an agentic web, performing complex tasks but presenting severe risks when operating unsupervised. To ensure safety and reliable outputs, organizations must transition from deploying single agents to implementing unified control planes that orchestrate specialized agents.

Scott Shambaugh, a volunteer maintainer of Matplotlib, the popular open-source Python library, had approved and rejected many change requests, but this one was different. He noticed that the request had come from an autonomous agent, so he dismissed it quickly. Five hours later, that same agent, operating under the pseudonym “MJ Rathbun,” researched Shambaugh’s personal history, wrote a 1,500-word blog post publicly accusing him of discrimination and published it to the web. No human directed MJ Rathbun to do any of this.

Summer Yue, Meta’s director of alignment at its Superintelligence Labs, watched in horror as an OpenClaw agent deleted more than 200 emails from her primary inbox even after she told it to stop. The agent ignored Yue’s explicit commands. Her only remedy was to literally pull the plug on the machine.

With great risks come great rewards, or so it seems with agents. The promise of agents completing tasks autonomously could free people from rote work to focus on strategic projects and completely change how we interact with computers. But as this future becomes technically plausible, an honest discussion is taking shape in private conversations among AI researchers and technologists: Is it safe to let agents operate unsupervised?

What Is a Control Plane for Agentic AI Systems?

In agentic AI systems, a control plane is a critical architectural layer that serves as an interpreter, operator and gatekeeper. It bridges the gap between what autonomous agents can do and what organizations trust them to do by orchestrating multiple specialized agents, ensuring they follow specific reasoning frameworks and deliver safe, coherent outputs.

We Built the Foundation for Agentic AI. Now What?

Building the agentic web started with infrastructure. Horizontal platforms like Microsoft’s Agent 365 offer the ability to identify and observe agents: who deployed each one, what access it has and what it has done. This is an important step, but an insufficient one.
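To make the identify-and-observe step concrete, here is a minimal Python sketch of an agent registry that answers the three questions above: who deployed an agent, what access it has and what it has done. Every name in it (AgentRecord, AgentRegistry, the "contracts:read" scope, the "legal-ops" team) is hypothetical and chosen for illustration; it is not modeled on Agent 365 or any other vendor's API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AgentRecord:
    """One registered agent: who deployed it, what it may access, what it did."""
    agent_id: str
    deployed_by: str
    scopes: frozenset[str]
    actions: list[tuple[datetime, str]] = field(default_factory=list)


class AgentRegistry:
    """Hypothetical in-memory registry (an assumption, not a vendor API)."""

    def __init__(self) -> None:
        self._agents: dict[str, AgentRecord] = {}

    def register(self, agent_id: str, deployed_by: str, scopes: set[str]) -> AgentRecord:
        record = AgentRecord(agent_id, deployed_by, frozenset(scopes))
        self._agents[agent_id] = record
        return record

    def log_action(self, agent_id: str, description: str) -> None:
        # Record each action with a UTC timestamp so it can be reviewed later.
        timestamp = datetime.now(timezone.utc)
        self._agents[agent_id].actions.append((timestamp, description))

    def audit(self, agent_id: str) -> dict:
        # Answer the three observability questions for one agent.
        record = self._agents[agent_id]
        return {
            "deployed_by": record.deployed_by,
            "scopes": sorted(record.scopes),
            "actions": [description for _, description in record.actions],
        }


registry = AgentRegistry()
registry.register("contract-reviewer", deployed_by="legal-ops", scopes={"contracts:read"})
registry.log_action("contract-reviewer", "summarized supplier agreement")
report = registry.audit("contract-reviewer")
```

Note what the sketch leaves out: it records what an agent did but has no way to stop it or to judge whether the output was any good. That is the sense in which observation alone is insufficient.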

When any team can spin up agents that interact with enterprise data, autonomous workflows operating outside centralized oversight multiply. Agents drafting internal communications, approving supplier contracts, preparing board presentations — each output carries liability and reflects the organization’s reputation. The decision to use agents must weigh the associated risks.

Do You Understand My Instructions?

To address the risk in agent outputs, first break down how those outputs are generated. General-purpose models like Claude and ChatGPT are trained on broad swaths of the web. That means that, in answering a query, a model could give former Supreme Court Justice Stephen Breyer’s opinions the same weight as a Redditor’s contrarian interpretation of an indemnity clause. If agents can’t discern which sources to weigh as authoritative, how can organizations trust their outputs? People draw instinctively on years of institutional knowledge and lived experience when making decisions, but agents don’t have that context.
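One way to picture source authority is as an explicit weighting applied before an agent reasons over retrieved material. The Python sketch below is an assumption-laden illustration, not a description of how any particular model or product works: the source kinds, the weights in AUTHORITY_WEIGHTS and the relevance scores are all invented for the example.

```python
from dataclasses import dataclass

# Assumed authority tiers and weights; the values are illustrative only.
AUTHORITY_WEIGHTS = {
    "court_opinion": 1.0,
    "statute": 1.0,
    "law_review": 0.6,
    "forum_post": 0.1,
}


@dataclass(frozen=True)
class Source:
    title: str
    kind: str
    relevance: float  # 0.0 to 1.0, e.g. from a retrieval step (assumed)


def authority(source: Source) -> float:
    # Unknown source kinds get zero authority rather than a default guess.
    return AUTHORITY_WEIGHTS.get(source.kind, 0.0)


def rank_sources(sources: list[Source], min_weight: float = 0.0) -> list[Source]:
    """Drop sources below min_weight, then order by relevance times authority."""
    kept = [s for s in sources if authority(s) >= min_weight]
    return sorted(kept, key=lambda s: s.relevance * authority(s), reverse=True)


sources = [
    Source("Forum thread on indemnity clauses", "forum_post", relevance=0.9),
    Source("Appellate opinion on indemnity", "court_opinion", relevance=0.7),
]
ranked = rank_sources(sources)
strict = rank_sources(sources, min_weight=0.5)
```

Here the forum thread matches the query more closely, yet the court opinion ranks first, and a stricter threshold removes the forum thread entirely. Deciding what those weights should be is exactly the institutional knowledge a general-purpose model lacks.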

When a lawyer asks an AI agent to redline “termination fees,” the AI must distinguish that specific concept from the same keywords appearing separately in the fees and termination sections. This logic may seem trivial, but it reflects the need for agents to understand a domain’s specific language, reasoning and workflow frameworks.

When every company can build an army of agents, the differentiator, as Paul Graham described it, is “taste.” Taste is what separates an LLM’s base understanding from rich, domain-specific interpretations of language (like legalese), workflows and reasoning frameworks. To borrow from Malcolm Gladwell, taste is the learned insight derived from 10,000 hours of practicing a craft.

You Need a Conductor to Make Music

Popular culture has long offered an oversimplified picture of AI as one robot with one powerful brain, solving all our problems. The research shows the opposite. Just as a hospital or law firm relies on a disciplined hierarchy of experts rather than a single, infallible polymath, an agentic system’s output improves with a structured division of labor. The more tasks a single agent is asked to handle, the more likely its output will have problems. But if the solution is to have many specialized agents performing specific tasks, that creates a new challenge at scale: governance.

Many technologists, myself included, believe building a control plane for an agentic system is the critical step. The goal, ensuring agents follow their designed frameworks and deliver coherent outputs, is deceptively complex. Like an orchestra conductor who understands the full composition, its constituent parts and how they cohere, the control plane serves as interpreter, operator and gatekeeper. A user’s query, such as “Identify cost saving opportunities within our contracts,” triggers this layer’s reasoning about how the work is done (e.g., “What contracts are up for renewal?” or “Do we overlap services across vendors?”) and which agents to engage for results.
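The three roles can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a reference design: the ControlPlane class, the two stub agents and their canned answers are hypothetical, the “pricing” sub-question is invented, and a simple lookup table stands in for the reasoning a real interpreter would perform.

```python
from typing import Callable

# Hypothetical specialized agents: each takes a sub-question and returns text.
Agent = Callable[[str], str]


def renewals_agent(question: str) -> str:
    return "3 contracts renew next quarter"  # canned answer for illustration


def overlap_agent(question: str) -> str:
    return "2 vendors provide overlapping services"  # canned answer


class ControlPlane:
    def __init__(self, agents: dict[str, Agent],
                 plans: dict[str, list[tuple[str, str]]]) -> None:
        self.agents = agents
        self.plans = plans  # query -> [(agent name, sub-question), ...]

    def interpret(self, query: str) -> list[tuple[str, str]]:
        """Interpreter: break a query into sub-questions (a lookup stands in)."""
        return self.plans.get(query, [])

    def is_allowed(self, agent_name: str) -> bool:
        """Gatekeeper, before dispatch: only registered agents may run."""
        return agent_name in self.agents

    def is_releasable(self, output: str) -> bool:
        """Gatekeeper, after dispatch: a placeholder check for empty output."""
        return bool(output.strip())

    def run(self, query: str) -> list[str]:
        """Operator: dispatch each sub-question and collect approved outputs."""
        results = []
        for agent_name, sub_question in self.interpret(query):
            if not self.is_allowed(agent_name):
                continue
            output = self.agents[agent_name](sub_question)
            if self.is_releasable(output):
                results.append(f"{agent_name}: {output}")
        return results


query = "Identify cost saving opportunities within our contracts"
plane = ControlPlane(
    agents={"renewals": renewals_agent, "overlap": overlap_agent},
    plans={query: [
        ("renewals", "What contracts are up for renewal?"),
        ("overlap", "Do we overlap services across vendors?"),
        ("pricing", "Are any rates above our benchmarks?"),  # no such agent
    ]},
)
results = plane.run(query)
```

The plan names a “pricing” agent that was never registered, so the gatekeeper skips it and only the two known agents contribute. In a real system the hard parts are the ones stubbed out here: producing the plan and deciding whether an output is fit to release.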

The need for this control layer is only increasing. As AI models continue to advance, the gap between what agents can do and what organizations trust them to do is widening, not narrowing.

Valuing the Score the Conductor Must Read

The organizations that capture the agentic web’s value won’t simply be the ones with the most agents deployed; that’s increasingly trivial to achieve. They’ll be the ones that pair domain-specific reasoning with a control layer designed to orchestrate it.

These two investments have to coexist. Domain expertise encoded without orchestration produces agents that understand their task but collide with each other. A control plane without domain depth produces coordinated agents that confidently execute the wrong work. And neither delivers if organizations treat orchestration as a governance layer bolted on after deployment rather than the architectural foundation.

The models will keep improving. The agent frameworks will proliferate. The question isn’t whether agents will run; it’s whether anyone is conducting them when they do.
