AI-Generated Code Is Creating a New Kind of Safety Risk

The danger of AI coding tools isn’t producing bad code. It’s the false confidence they instill.

Written by Eric Ries
Published on Jun. 02, 2026
Reviewed by Seth Wilson | Jun 01, 2026
Summary: Vibe coding practices create a dangerous perception gap. A METR study found developers using AI tools felt 24 percent faster but were actually 19 percent slower. This false confidence leads to skipped reviews and a loss of mental ownership over complex, unread architectures.

In February, a security firm discovered that Moltbook, a social network for AI agents, had exposed its entire database on the internet. Anyone could read 1.5 million authentication tokens, 35,000 email addresses and every private message on the platform. The vulnerability was trivial, just a misconfigured database setting that any competent developer would have caught. But Moltbook didn’t have developers. The founder built the entire platform by describing what he wanted and letting an AI write every line. He never even read the code. Nobody did.

Moltbook is essentially a toy, so the breach was embarrassing, not dangerous. But the practice that produced it is spreading to systems where failure is catastrophic.

The industry calls it vibe coding: You describe what you want in plain English, and an AI generates it. A majority of developers now admit to using AI-generated code they don’t fully understand, a practice that’s accelerating. And what makes it dangerous is not the quality of the code, but what it does to the person writing it.

I co-founded an AI research lab. I use AI tools to write code every day. I am not against AI-assisted development. And I’m telling you that vibe coding is creating a generation of operators who cannot accurately assess their own understanding of the systems they’re building. The bugs will improve, but the false confidence won’t.

Is AI-Generated Code Safe?

AI-generated code poses a significant safety risk because it triggers dark flow, a state of momentum where developers rapidly ship features without reading the code or making architectural choices. This creates a dangerous perception gap where developers operate with false confidence, accepting tighter deadlines and skipping critical security reviews for systems they don’t fully understand and can’t accurately debug.


The Dark Flow Problem

I know because I’ve felt the pull myself. My colleague Rachel Thomas has a name for it: dark flow. You describe a feature. Working code appears. You describe the next feature. More code. It works too. You feel the momentum, and you feel efficient. The research organization METR measured this sequence of events in a randomized controlled trial, finding that experienced developers using AI tools estimated they were 24 percent faster. But they were actually 19 percent slower — a 43-point gap between what they felt and what really happened. 
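The 43-point figure is simple arithmetic on the two numbers quoted above. A minimal sketch, assuming "19 percent slower" can be treated as a speedup of minus 19 so that both values sit on the same scale; the function and variable names are illustrative and do not come from the METR study:

```python
# Figures as quoted in this article from the METR trial.
perceived_speedup_pct = 24    # developers estimated they were 24 percent faster
measured_speedup_pct = -19    # they were measured as 19 percent slower


def perception_gap(perceived_pct: float, measured_pct: float) -> float:
    """Return the gap, in percentage points, between felt and measured change.

    Assumption: "slower" is expressed as a negative speedup, so the gap is
    the plain difference between the two percentages.
    """
    return perceived_pct - measured_pct


gap = perception_gap(perceived_speedup_pct, measured_speedup_pct)
print(f"Perception gap: {gap} percentage points")
```

The gap is wide because the two numbers have opposite signs: the estimate was not just too optimistic, it pointed in the wrong direction.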

Other studies report productivity gains, and those studies may be right. But the perception gap METR documented — people who cannot accurately judge their own performance — is consistent across nearly all of them. Even after seeing the data, many developers in the study didn't believe it applied to them.

That perception gap is a safety problem. A developer who believes they’re 24 percent faster will take on more ambitious projects, accept tighter deadlines and skip reviews they would normally have done at their perceived slower pace. They’re making risk calculations based on a capability they don’t have, and they can’t be corrected because the feeling is more convincing than the evidence.

Dark flow does something else that matters for safety. In the momentum of generating feature after feature, the developer stops making architectural choices. The AI suggests an approach and because it feels efficient, they take it. The next step it suggests feels right too, and on it goes, each suggestion pulling the project toward whatever the model finds statistically likely. By the time the developer compares what they've built to what they intended, the codebase is the AI’s architecture, not theirs. They have lost their mental model of the system without noticing it happen.

A security expert will reasonably object that software has always been beyond full human understanding. The XZ Utils backdoor was planted by a human in a library millions of systems depended on, and nobody caught it for nearly three years. Developers have relied on opaque dependencies for decades. What makes this different is the boundary of ignorance. A developer who calls XZ Utils knows they didn’t write it. They have a clear line between code they understand and code they’re trusting someone else to understand.

A vibe coder has no such line. The AI-generated code is in their codebase, merged into their project and attributed to their commits. They built the application, watching it come together as dark flow gave them the experience of understanding it feature by feature. But they never read the code or traced the logic, so they don’t comprehend why the AI made the choices it made. They don’t know where their understanding ends and the AI’s begins. The XZ Utils developer knew they were trusting a black box, but the vibe coder doesn’t even know they’re in one.


The Costs of AI Failure

AI-generated code is improving rapidly. Month by month, the models produce fewer vulnerabilities, handle more edge cases, and write more secure defaults. If the code gets good enough, does the human-understanding problem become irrelevant? After all, we took aspirin for almost 100 years before we understood how it worked. Maybe we don’t need to understand the code as long as it works.

But aspirin’s failure modes are known and bounded. Its side effects, interactions and possible allergic reactions have been documented over a century of use. Aspirin also doesn’t modify itself, interact with adversaries, or make the patient believe they understand pharmacology.

Code does all three. It changes with every prompt, and its failure modes are combinatorial and adversarial. And you can’t judge the results the way you can judge aspirin’s since code that passes every test can still harbor security vulnerabilities, race conditions and edge cases that surface only under adversarial conditions or unusual load. Each AI-generated codebase is novel. Its failure modes haven’t been mapped by a century of observation and will only be discovered in production.

Dark flow makes this worse, not better, as the code improves. The more reliable the AI, the deeper the developer’s trust, the less they review, and the longer they go before encountering a failure they can’t diagnose. Improving code quality without fixing the human-factors problem extends the period of false confidence. 


AI Is Facing Its Chernobyl Moment

The story of Chernobyl is an apt parallel. The RBMK reactor at Chernobyl was believed to be reliable, so its operators felt comfortable disabling the safety systems, including an automatic shutdown system, in order to perform a test.

It turned out that the reactor had a design flaw. But the design flaw alone wouldn’t have caused the disaster. It took operators who were confident they understood a machine they’d never actually operated in the conditions they were testing. They were unable to diagnose what went wrong when the system behaved in ways they didn’t anticipate.

The human-factors problem will not fix itself. A developer in dark flow using a model that produces flawless code 99 percent of the time is dangerous, precisely because of their overconfidence. They may find themselves in a production crisis on a system they can’t debug at two in the morning using code they’ve never read.
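To see why 99 percent is less reassuring than it sounds, consider a rough back-of-the-envelope sketch. It rests on two assumptions that are not in the argument above: that "99 percent of the time" applies to each generated change separately, and that changes succeed or fail independently of one another. The change counts are hypothetical.

```python
def chance_of_at_least_one_flaw(per_change_success: float, changes: int) -> float:
    """Probability that at least one of `changes` generated changes is flawed.

    Assumption: each change is flawless with probability `per_change_success`,
    independently of every other change. Real codebases are messier than this.
    """
    return 1.0 - per_change_success ** changes


# Hypothetical numbers of unreviewed AI-generated changes merged into a project.
for changes in (10, 100, 500):
    risk = chance_of_at_least_one_flaw(0.99, changes)
    print(f"{changes:>3} changes: {risk:.1%} chance of at least one flaw")
```

Under those assumptions, a project built from a hundred unread changes is more likely than not to contain at least one flaw, and the developer has no way of knowing which change it is in.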

We still have a choice — not about whether to use AI in development, but about whether we build tools that grow the developer’s understanding alongside the codebase or tools that let them ship systems they’ve never examined. The first approach is harder to sell. It’s slower, and doesn’t produce the thrill of generating an application in a single afternoon.

But it does produce developers who know what they’ve built. And when the system fails, because all systems eventually fail, it ensures the problem will be handled by a team that can figure out why.

Eric Ries is the author of the new book Incorruptible: Why Good Companies Go Bad ... and How Great Companies Stay Great.