A founder came to us last quarter with a SaaS he had built using Cursor over a long weekend. Booking platform for service businesses. He had a working demo, three paying customers, and a pitch deck with screenshots of the live app. From the outside, it looked like a real product.
Then his fourth customer signed up and tried to book an appointment that overlapped with an existing one. The system accepted both bookings. He tried to fix it, prompted the AI to “add conflict detection,” and the patch broke the calendar view for every existing user. His three paying customers saw blank screens for six hours before he noticed.
When we audited the codebase, we found no error logging, no input validation on the booking API, authentication tokens that never expired, and business logic scattered across 40 React components with no separation from the UI. The cost to stabilise it was $6,200. Building it properly from the start would have been $4,000.
That is not one story. It is a pattern I see two or three times a month from founders who ship vibe-coded prototypes as if they were finished products.
What vibe coding actually is
The term was coined by Andrej Karpathy in early 2025 and it stuck because it captures something real. You are not writing code. You are directing an AI to write it for you. Tools like Cursor, Bolt, v0, and Replit Agent let you build functional-looking apps from a prompt in under an hour.
For demos, internal tools, and early-stage validation, it is brilliant. You can test an idea before committing serious budget. That is a real advantage.
When vibe coding makes sense
This is not a take-down of AI tools. There are situations where vibe coding is genuinely the right approach:
- Validating a concept before you spend money. You have an idea for a marketplace, a scheduling tool, a dashboard. You want to see if the core interaction works before you commit $4,000–$8,000 to a proper build. Vibe coding gets you a clickable prototype in a weekend. That is worth doing.
- Internal tools that only your team uses. An admin panel, a reporting dashboard, a script that pulls data from three APIs and formats it into a spreadsheet. If the only users are your own team and the stakes of downtime are low, AI-generated code is often good enough.
- Learning how software works. If you are a non-technical founder trying to understand what a database does, how APIs connect, or what authentication involves, prompting an AI and watching the code appear is one of the fastest ways to build intuition.
- Throwaway prototypes for investor conversations. A demo you show on a call and never deploy. No real users, no real data, no real risk.
In all of these cases, the code does not need to be production-grade because it is not going to production. The problems start when it does.
The trouble begins when the prototype looks done and founders treat it like a finished product. Here is the core of it:
Vibe coding externalises the typing. It does not externalise the thinking.
When you prompt an AI to “build a user dashboard with billing,” it builds exactly that. It does not ask: what happens at scale? What if a user submits malformed data? How does this interact with your existing payment records? What is the rollback plan if something fails mid-transaction?
A senior engineer asks those questions before writing a single line of code. The AI does not — unless you already know what to ask, which means you are already doing the engineering work.
Five ways vibe-coded apps break in production
1. No real error handling
AI-generated code assumes the happy path. A user submits a form — it works. A payment goes through — it works. But what happens when the payment gateway times out? When the user submits an empty field? When an external API returns a 500 error?
Vibe-coded apps typically have no structured error handling. Errors either crash the app silently, show raw technical messages to users, or fail without any feedback at all.
The business consequence: Your users see a blank screen and leave. You have no logs telling you what went wrong. You cannot fix what you cannot see. For the booking platform founder above, six hours of downtime meant two of his three customers asked for refunds.
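The fix is not complicated, it is just deliberate. As a minimal sketch in TypeScript (the `safeCharge` wrapper and `doCharge` callback are illustrative, not any real gateway's API; real payment calls are asynchronous and also need timeouts and idempotency), the shape looks like this:

```typescript
// A result type forces every caller to handle both outcomes explicitly.
type ChargeResult =
  | { ok: true; receiptId: string }
  | { ok: false; userMessage: string };

function safeCharge(
  doCharge: () => string, // hypothetical gateway call; may throw
  log: (msg: string) => void, // your structured logger
): ChargeResult {
  try {
    return { ok: true, receiptId: doCharge() };
  } catch (err) {
    // Log the technical detail for yourself...
    log(`charge failed: ${err instanceof Error ? err.message : String(err)}`);
    // ...and show the user something actionable, never a raw stack trace.
    return {
      ok: false,
      userMessage:
        "Payment could not be completed and you have not been charged. Please try again.",
    };
  }
}
```

The point is not the try/catch. It is that every failure leaves a log entry you can act on and a message the user can understand.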
2. Authentication that looks right but is not secure
Auth is one of the most dangerous areas to get wrong. AI tools will scaffold a login system that appears to work — users can sign up, log in, and access their dashboard. But underneath you often find:
- Login tokens that never expire (a stolen token works forever)
- Password reset flows that do not invalidate the old token (the old link still works)
- Access controls enforced only in the interface, not on the server (anyone with basic tools can access other users’ data)
- Sessions that persist indefinitely
None of this is visible in a demo. It surfaces when someone probes the system — and that someone may not be friendly. A single auth vulnerability in a product handling customer data is a liability that can end a startup before it begins.
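For contrast, here is roughly what server-enforced, expiring sessions involve. This is a hand-rolled sketch for illustration only; in practice you would use a vetted library (signed JWTs or a server-side session store) rather than writing this yourself:

```typescript
// Sketch of server-side session checks. All names here are invented
// for illustration; do not hand-roll auth in production.
interface Session {
  userId: string;
  expiresAt: number; // epoch milliseconds
}

const sessions = new Map<string, Session>(); // token -> session

function issueToken(token: string, userId: string, ttlMs: number, now: number): void {
  sessions.set(token, { userId, expiresAt: now + ttlMs });
}

// Every protected endpoint calls this on the SERVER.
// Hiding a button in the UI is not access control.
function authorize(token: string, resourceOwnerId: string, now: number): boolean {
  const session = sessions.get(token);
  if (!session) return false; // unknown token
  if (now > session.expiresAt) {
    sessions.delete(token); // expired token: reject and clean up
    return false;
  }
  return session.userId === resourceOwnerId; // users reach only their own data
}
```

Notice that expiry and ownership are both checked server-side, on every request. That is the property the vibe-coded versions are missing.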
3. Architecture nobody can maintain
The AI writes code that works now, for the specific scenario you described in the prompt. It does not think about how this system will change in three months. It does not separate concerns or consider what happens when you add a second feature that interacts with the first.
The result is spaghetti: business logic tangled into UI components, database queries scattered across the codebase, no clear boundaries between different parts of the system. When you need to change something — add a feature, fix a bug, bring on a developer — you are untangling knots instead of building.
The business consequence: Every change takes three to five times longer than it should. The cost to add a feature to a tangled codebase is not the cost of the feature. It is the cost of the feature plus the cost of understanding and safely modifying everything it touches.
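As one concrete illustration of separating concerns, the booking-conflict rule from the story above can live in a single pure function instead of being re-implemented inside React components. The `Booking` shape and `hasConflict` name are my own for the sketch:

```typescript
// Business rule in one place, with no UI imports.
// Bookings are half-open intervals [start, end) in epoch milliseconds.
interface Booking {
  start: number;
  end: number;
}

function hasConflict(existing: Booking[], candidate: Booking): boolean {
  // Two half-open intervals overlap iff each starts before the other ends.
  return existing.some(
    (b) => candidate.start < b.end && b.start < candidate.end,
  );
}
```

Because the rule lives in one place, the booking API and the calendar view call the same function, and a test can pin its behaviour down without rendering a single component.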
4. No test coverage
AI tools rarely write tests unless you specifically ask for them, and even then the tests are often shallow — they test that the function exists, not that it behaves correctly under different conditions. You end up with a codebase where every change is a gamble. Did that fix break something else? You genuinely do not know until a user reports it.
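Here is the difference in miniature. `isValidDuration` and its 15-minute/8-hour rule are invented for the sketch; the point is what each kind of test actually proves:

```typescript
// Hypothetical rule: bookings must be between 15 minutes and 8 hours.
function isValidDuration(minutes: number): boolean {
  return Number.isFinite(minutes) && minutes >= 15 && minutes <= 480;
}

// Shallow test: proves the function exists, nothing about behaviour.
console.assert(typeof isValidDuration === "function");

// Meaningful tests: exercise boundaries and malformed input.
console.assert(isValidDuration(15) === true);   // lower boundary allowed
console.assert(isValidDuration(480) === true);  // upper boundary allowed
console.assert(isValidDuration(14) === false);  // just below minimum
console.assert(isValidDuration(NaN) === false); // malformed input rejected
```

The shallow test passes no matter what the function does. The boundary tests are the ones that catch a regression before a user does.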
5. Data handling that creates legal exposure
GDPR, data retention policies, what gets logged and where — AI code generators do not think about these. A vibe-coded app might log full request payloads including passwords to a public console, store sensitive user data without encryption, or never delete user data when users request it.
For products targeting UK or EU markets, this is not a nice-to-have concern. It is a legal requirement with fines up to 4% of annual turnover.
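One small piece of the fix is cheap: redact sensitive fields before anything reaches a log. A sketch, assuming a flat payload; the field list is illustrative, and your own schema determines what counts as sensitive:

```typescript
// Fields that must never appear in logs. Illustrative, not exhaustive.
const SENSITIVE_KEYS = new Set(["password", "token", "cardNumber"]);

// Returns a copy of the payload safe to log; never mutates the input.
function redact(payload: Record<string, unknown>): Record<string, unknown> {
  const safe: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(payload)) {
    safe[key] = SENSITIVE_KEYS.has(key) ? "[REDACTED]" : value;
  }
  return safe;
}
```

Then log `redact(req.body)` instead of the raw payload. It does not make you compliant on its own, but it closes the most embarrassing hole.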
The vibe code audit scorecard
If you have built something with AI tools and you are not sure whether it is production-ready, run through these seven questions. Any “no” is a red flag.
- Error recovery. If the payment API times out mid-transaction, does your app handle it gracefully — no duplicate charges, clear user feedback?
- Server-side access control. Are permissions enforced on the server, not just in the interface?
- Code readability. Could a new developer understand your codebase and make a change within one working day?
- Test coverage. Do you have automated tests for your three most critical user flows?
- Data compliance. Do you know exactly what user data you store, where it is stored, and how to delete it on request?
- Monitoring. When something fails, do you receive an alert with enough detail to diagnose the problem?
- Security review. Has anyone besides the AI reviewed the code for security issues?
If you answered “no” to three or more: your app is a prototype, not a product. That does not mean you throw it away. It means you need engineering work before you put real users and real money through it.
If you answered “no” to question 2 or 5: stop and fix these before anything else. These are the issues that create security incidents and legal liability.
Where AI fits in real engineering
I am not anti-AI. We use Cursor, Copilot, and other tools on every project at Buldtech. AI is genuinely useful for:
- Drafting boilerplate code and utility functions
- Writing tests for code you have already structured
- Explaining unfamiliar APIs or patterns
- Speeding up repetitive tasks
The difference is that we review what the AI produces. We understand what it wrote. We catch the assumptions it made. We restructure where it cut corners. We test the edge cases it ignored.
That is the discipline that makes AI tools productive rather than dangerous.
What a proper MVP actually looks like
Good MVP development is not about writing everything from scratch. It is about making smart decisions about what to build, how to structure it, and where to cut corners intentionally — knowing you will return to fix them — versus cutting corners by accident because the AI did not flag the risk.
A production-ready MVP should have:
- Basic but real error handling — users should never see a raw error; you should always know when something fails
- Auth that is actually secure — even if it is simple
- A codebase a developer can read and change — this is what makes iteration fast, not the AI
- Deployment that you control — your code, your servers, your data
- A clear upgrade path — you know what has been deferred and why
None of this requires months of work. A focused engineering team can build a solid MVP in 21 days. The difference between that and a vibe-coded app is not the timeline — it is whether the thing will still be standing when your tenth customer signs up.
The cost of fixing it later
Every founder who comes to us after a vibe-coded app breaks asks the same question: how much to fix it?
The answer is almost always more than building it right would have cost:
| Scenario | Build-Right Cost | Fix-Later Cost | Why |
|---|---|---|---|
| Stabilise auth and error handling | Included in MVP build | $2,000–$4,000 | Retrofitting into tangled code |
| Refactor spaghetti architecture | Included in MVP build | $3,000–$6,000 | Understanding + restructuring |
| Full rewrite after production failure | $4,000–$8,000 | $6,000–$12,000+ | Rewriting under pressure, data migration |
Rewriting from scratch is demoralising and expensive. Cleaning up spaghetti takes longer than writing clean code from the start. This is not because vibe coding is bad — it is because production software has different requirements than a working demo, and you need engineers who know the difference.
Your next step
If you have built something with AI tools and you are not sure whether it is ready for real users, start with the audit scorecard above. It will tell you where you stand.
If you want a second opinion, download the MVP Scope Clarity Checklist — it covers the structural requirements that separate a prototype from a product you can confidently put in front of customers and investors.
And if you want a fixed-price MVP build with a Day-10 demo, code ownership from day one, and a 21-day delivery timeline, book a free strategy call. No pitch, no pressure — just a straight conversation about your idea and what solid engineering would look like for it.