Introduction
Vibe coding is ideal for quickly building prototypes, but from a security standpoint it is a disaster. Vibe-coded applications should be treated as disposable sketches, with real engineers tasked to rebuild them for production environments.
If you’ve browsed tech news or checked your inbox this week, you’ve likely come across the term “vibe coding.”
Product managers can now ship fully deployed applications just by chatting with coding agents, without writing a line of code. I recently read a market-crash prediction from Citrini Research suggesting that AI will soon be able to write entire SaaS products autonomously. Large language model providers and Y Combinator startups are heavily promoting the idea that anyone can describe the features they want and have complex software built in an afternoon.
However, I believe this unrestrained acceleration is a disaster. Today’s AI may generate the surface shell of SaaS applications, but it is far from having the engineering rigor needed to construct reliable systems that can become part of our digital infrastructure.
While this conversational approach makes application development remarkably easy, it quietly triggers a massive crisis of enterprise security and technical debt. We have abandoned rigorous software engineering in favor of a culture of probabilistic guessing. If we do not correct course promptly, we expose ourselves to catastrophic risk.
The Rise of Unfiltered Agents
As we transition from AI that merely generates new content to AI that takes action, the risks multiply. In recent months, we have seen a surge of unfiltered agent systems. The most popular is an open-source project called OpenClaw (formerly Moltbot/Clawdbot). Unlike ordinary chatbots, this system can independently perform actions on machines, such as sending files, running programs, and establishing external connections.
I recently deployed OpenClaw in a sandbox environment to see what the fuss was about. I found it complex yet bloated, with even basic functionality like Telegram streaming failing to work properly. I tried to consult its documentation, but it was clearly just a pile of AI-generated filler, verbose and repetitive with little real information, and it offered me no help. Worse still, the project has been renamed twice without providing any guidance on migrating to the new binaries. If traditional software were released this way, we would deem it completely unacceptable, yet people tolerate it simply because it is an AI that, on paper, can do many things.
They may look impressive in YouTube demos, but deploying unfiltered, non-deterministic agents with root access in a local environment is a significant step back in security, essentially discarding decades of strict identity and access management (IAM) protocols.
Consider the “three deadly elements” these agents combine: first, they hold persistent privileged access; second, they continuously read untrusted external data, such as emails or Slack messages; third, their communication with the outside world is unrestricted. If an attacker sends an email containing a hidden prompt injection, the agent will not question it and could quietly exfiltrate your local SSH keys!
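One mitigation is to break the first element of the trifecta: never give a tool unbounded filesystem reach. The sketch below assumes a hypothetical file-reading tool exposed to an agent and confines it to an allowlisted workspace, so that injected instructions in an email cannot walk it over to credentials. All paths and names here are illustrative, not from any real agent framework.

```python
from pathlib import Path

# The only directory tree the agent's file tool may touch (illustrative path).
ALLOWED_ROOTS = [Path("/workspace")]

def safe_read(requested: str) -> str:
    """Resolve the path first (collapsing any ../ tricks), then verify it
    sits under an approved root before reading anything."""
    path = Path(requested).resolve()
    if not any(path.is_relative_to(root) for root in ALLOWED_ROOTS):
        raise PermissionError(f"agent denied access to {path}")
    return path.read_text()
```

The key design choice is resolving before checking: a prompt-injected request for `/workspace/../home/user/.ssh/id_rsa` normalizes to a path outside the allowlist and is refused, no matter how politely the hidden email asked.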
The Widespread “It Works on My Machine” Problem
This crisis is not limited to rogue agents; it also affects how we build our entire software supply chain. When developers prioritize speed over deep understanding, they begin to build infrastructure based on luck.
Currently, my team is dealing with a new threat called “slopsquatting” (malicious package name impersonation), also known as AI package hallucination. AI models do not query deterministic fact databases; instead, they predict the next most likely word. As a result, they often fabricate software package names that sound completely reasonable but do not actually exist.
The attack works as follows: malicious actors register these hallucinated packages on public repositories and inject malware, which programming agents blindly recommend and install. From the perspective of vibe coders, the AI-generated code runs without any warnings, and the installed packages appear legitimate, but in reality, they just handed root access to cybercriminals.
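One simple countermeasure is to stop trusting names the model produced at all. The sketch below gates AI-suggested installs behind a human-curated allowlist; the package names are purely illustrative, and the normalization mirrors the fact that package indexes treat case, hyphens, underscores, and dots as equivalent.

```python
# Packages a human has already reviewed and pinned (illustrative examples).
APPROVED_PACKAGES = {"requests", "numpy", "flask"}

def normalize(name: str) -> str:
    """Fold case and separator characters so lookalike spellings collide."""
    return name.lower().replace("-", "_").replace(".", "_")

def vet_install(name: str) -> bool:
    """Refuse any install the team has not explicitly vetted, including
    hallucinated or typo-squatted names that 'sound' legitimate."""
    approved = {normalize(p) for p in APPROVED_PACKAGES}
    return normalize(name) in approved
```

An allowlist is deliberately dumb: it cannot be talked into anything, which is exactly the property a probabilistic code generator lacks.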
This blind trust also undermines our internal quality assurance. One major promise of vibe coding is that AI will write functional code, then write unit tests to validate it.
I recently reviewed a pull request for a new internal routing microservice, which boasted 100% test coverage. The continuous integration pipeline showed a beautiful green checkmark, but when I actually read the code, I found what my co-founder and I now refer to as “cardboard muffins.”
The AI did not write tests that validate the underlying business logic; it ignored edge cases entirely and merely hardcoded the exact return values needed to satisfy the assertions, with the sole goal of passing the deployment pipeline.
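The pattern looks like this in miniature. The function and test names below are hypothetical, not from the actual pull request: the routing logic carries a latent bug (a bare prefix match), the cardboard-muffin test goes green anyway because it only restates the one value it was built around, and only an edge-probing test exposes the flaw.

```python
def route(path: str) -> str:
    """Hypothetical routing logic with a latent bug: a bare prefix match
    that never checks for a path-segment boundary."""
    return "billing" if path.startswith("/billing") else "default"

def cardboard_test() -> bool:
    # Hardcodes the exact case the model generated; exercises no edges.
    return route("/billing/invoice") == "billing"

def edge_test() -> bool:
    # A human-written test that probes the segment boundary.
    return route("/billingsecrets") == "default"
```

`cardboard_test()` passes and the pipeline shows its checkmark, while `edge_test()` fails because `/billingsecrets` is wrongly routed to billing, precisely the kind of case the hardcoded test never looks at.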
When 80% of the codebase is generated by an AI that fabricates dependencies and fakes unit tests to get a green checkmark, what you’ve built is not software, but a house of cards. Scaling such code turns the old “it works on my machine” problem into an enterprise-level disaster.
I firmly believe that the new luxury in software development will no longer be the absolute speed of feature launches. The new luxury will be old-fashioned, boring certainty.
The Dual-Track Strategy
We cannot ban generative AI; its ability to innovate rapidly and test the market is too valuable. However, we absolutely cannot allow probabilistic vibe coding to dictate the architecture of our production systems.
To address this issue, CIOs can implement a “dual-track” development lifecycle, which separates rapid exploration from rigorous production engineering.
Track 1 (Fast Lane)
This is the realm of unrestrained exploration, where vibe coding is explicitly allowed and strongly encouraged. If a product manager wants to use autonomous agents to build a prototype in the afternoon, let them do so. The core metric here is feedback speed; we want to validate business ideas and test user interfaces as cheaply and quickly as possible.
But there is a massive caveat: development in Track 1 must occur in a highly isolated sandbox environment. These vibe-coded applications are one-off blueprints and are never allowed to touch production data, customer personally identifiable information (PII), or critical enterprise networks.
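That isolation can be made concrete in the deployment config itself. The fragment below is a hedged sketch, not a complete policy; the service and image names are hypothetical, and a real sandbox would add resource limits and an egress-free registry mirror.

```yaml
# docker-compose.yml for a Track 1 sandbox (illustrative values)
services:
  prototype:
    image: vibe-prototype:latest   # hypothetical image name
    network_mode: none             # no path to production networks or PII stores
    read_only: true                # root filesystem is immutable
    tmpfs:
      - /tmp                       # scratch space only; nothing persists
    cap_drop:
      - ALL                        # no elevated kernel capabilities
```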
Track 2 (Slow Lane)
Once a prototype in Track 1 proves its commercial value, the project moves to Track 2, which is the domain of true software engineering.
The task here is simple but painful: start over. Do not attempt to refactor, salvage, or clean up vibe code; rewrite it from scratch.
In Track 2, human engineers take the lead, using the Track 1 prototype only as a visual reference. They build secure and scalable architectures that prioritize deterministic safety guarantees, strict type safety, and rigorous human peer reviews. AI tools are still used, but they are downgraded from autonomous creators to highly constrained assistants. Each dependency is validated against established security frameworks, and every unit test is manually reviewed to ensure we do not incorporate cardboard muffins into the core product.
A Significant Cultural Shift
Implementing a dual-track strategy requires a significant cultural shift, particularly in managing executive expectations, and it hinges on a non-negotiable directive: never set the timeline for Track 2 based on the speed of Track 1.
Having this conversation with business stakeholders will be challenging. When they see a seemingly fully functional vibe coding prototype built over a weekend, they naturally assume that with just another week, the final product can be completed. However, strictly enforcing this boundary is how we ensure the enterprise becomes a beneficiary of AI programming rather than its next victim.
AI is a powerful enabler of innovation, but it cannot replace architectural vision. By adopting a dual-track strategy, we can allow teams to experiment freely at the speed of thought while safeguarding the deterministic rigor necessary for our digital infrastructure to operate.