It was the most technically impressive piece of software I have ever shipped. It was also the biggest business failure of my career.
In 2025, the pressure to “do AI” is suffocating. Boards are demanding it, competitors are shipping it and product managers are terrified of being left behind.
I was one of them. I didn't want to just keep up; I wanted to win. So, I led a team to build the ultimate AI search tool. We had the best engineers, the best data and the best intentions.
And when we launched, we hit the market with a resounding thud.
This isn’t a theoretical article about AI strategy. This is a post-mortem of my own mistake. It’s a look at how a smart team can get seduced by technology and forget about the only thing that matters: the business model.
Why Do Great AI Tools Fail?
Even technically perfect AI tools fail when they prioritize “magic” over business viability. To ensure an AI feature succeeds, it must pass three critical tests.
- The Value Test: Does the AI actually remove labor, or does it just create “homework” like prompting and editing for the user?
- The Margin Test: Can the business afford the unit economics? High LLM token costs combined with flat-rate subscriptions can lead to bleeding cash on power users.
- The Retention Test: Is it a painkiller or just a vitamin? A successful tool is one the customer literally cannot do their job without.
The Seductive Appeal of Looking at Magic
We started with a mandate that sounded logical: “Unlock the value of our proprietary data.”
For years, our customers had been dumping documents, notes and logs into our platform. Searching that data was a nightmare, based on old-school keyword matching that failed half the time.
So, we decided to fix it with the sledgehammer of the moment: generative AI.
The engineering team was electric. We spun up a modern RAG (retrieval-augmented generation) pipeline. We used vector databases. We integrated the latest LLMs.
I remember the demo meeting vividly. I typed a complex, natural-language question into the search bar. The spinner whirred for a second, and then, boom. It didn't just find the document; it summarized the answer perfectly.
It felt like magic. We ran the numbers to prove it wasn’t just a feeling. We used nDCG (Normalized Discounted Cumulative Gain) scores to measure relevance:
- Legacy Search: 0.65 (Barely functional)
- New AI Engine: 0.92 (Near perfection)
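For readers unfamiliar with the metric, nDCG rewards putting the most relevant results at the top of the ranking, discounting anything buried further down. Here is a minimal sketch of the calculation; the relevance grades are hypothetical, chosen only to illustrate how a keyword engine and a semantic engine can score differently on the same query:

```python
import math

def dcg(relevances):
    # Discounted cumulative gain: each result's relevance grade is
    # discounted by the log of its rank (rank 1 gets no discount).
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(ranked_relevances):
    # Normalize against the ideal ordering (best results first),
    # so a perfect ranking scores exactly 1.0.
    ideal_dcg = dcg(sorted(ranked_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal_dcg if ideal_dcg else 0.0

# Hypothetical relevance grades (3 = perfect hit, 0 = miss) for the
# top five results each engine returned on a single query.
legacy_results = [1, 0, 3, 0, 2]   # good documents buried down-page
ai_results = [2, 3, 2, 1, 0]       # good documents near the top

print(round(ndcg(legacy_results), 2))
print(round(ndcg(ai_results), 2))
```

In production we averaged this score across a large query set; a single query, as above, is just the building block.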
We high-fived. We thought we had built a moat. In reality, we had just built a very expensive toy.
3 Hard Lessons We Learned
We shipped it. We waited for the usage graph to go up and to the right. Instead, it flatlined.
We failed because we fell in love with the mechanism (the AI) instead of the outcome (the value). Here is exactly where we went wrong.
1. The ‘Wrapper’ Fallacy (Or, Why Users Are Lazy)
We essentially built a wrapper around a database. We thought users would be thrilled to “chat” with their data.
We were wrong.
I sat behind the glass during a user research session post-launch, and what I saw was painful. To use our tool, the user had to:
- Stop what they were doing in their main workflow.
- Open our AI sidebar.
- Type a prompt.
- Wait.
- Copy the answer and paste it back into their work.
We thought we were giving them a superpower. They felt like we were giving them homework.
The Human Truth
Users don’t want to search. They want to be done. By forcing them to prompt the AI, we increased their cognitive load. We built a destination when we should have built a utility that worked silently in the background.
2. The COGS Nightmare (Or, The Math of Ruin)
This was the moment that actually kept me up at night.
Because we were obsessed with that 0.92 accuracy score, we used the most powerful, expensive models available. We didn’t worry about cost; we worried about quality.
Then I saw the bill.
I opened a spreadsheet and modeled out our unit economics, and my stomach dropped.
- The Cost: Between the vector compute and the LLM tokens, a single complex query cost us about $0.08.
- The Price: We charged a flat subscription of $29/user/month.
That $0.08 sounds like pennies until you do the math on a power user. If a customer actually loved our product and used it just 15 times a day, we weren’t making money. Instead, we were bleeding cash.
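The spreadsheet version of that gut-punch fits in a few lines. This sketch uses the figures above ($0.08 per query against a $29 flat subscription) and assumes a 30-day month:

```python
# Back-of-the-envelope unit economics for flat-rate AI pricing.
# Per-query cost covers vector compute plus LLM tokens.
COST_PER_QUERY = 0.08    # dollars
PRICE_PER_MONTH = 29.00  # flat subscription, dollars
DAYS_PER_MONTH = 30

def monthly_margin(queries_per_day):
    # Gross margin per user: subscription revenue minus compute cost.
    cost = COST_PER_QUERY * queries_per_day * DAYS_PER_MONTH
    return PRICE_PER_MONTH - cost

for q in (5, 10, 15, 20):
    print(f"{q:>2} queries/day -> margin ${monthly_margin(q):+.2f}")
```

Break-even lands at roughly 12 queries a day. Past that, every additional query from an engaged user is a direct loss, which is exactly backwards: the happier the customer, the worse the margin.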
We had effectively built a business model where we paid our best customers to bankrupt us. We had built a Ferrari to deliver pizzas, and we were charging for the pizza, not the car.
3. The ‘Vitamin’ Problem
Finally, there was the “Who cares?” test.
We built a copilot. But in 2025, copilots are mostly vitamins. They’re nice to have. They look cool in a sales demo. But when our AI feature went down for maintenance one afternoon, nobody called support.
That silence was the loudest feedback we could have received.
We hadn't built a painkiller, something that stops the business from functioning if it breaks. We had built a novelty.
The Fix: The Product P&L Test
I’m sharing this failure so you don’t have to repeat it. Before you let your team spend six months building a generative AI feature, force yourself to answer these three questions. I call it the Product P&L Test.
1. The Value Test: Did We Remove Labor?
Don’t ask if the AI is smart. Ask if it lets the user go home early.
- The Trap: The AI writes a draft that the user has to spend 10 minutes editing. You just shifted the labor instead of reducing it.
- The Win: The AI automates the task completely, with zero human intervention.
2. The Margin Test: Can We Afford to Win?
Never bundle unlimited AI compute into a flat-rate subscription. You are exposing yourself to unlimited downside risk.
- The Trap: Unlimited AI Access for $29/mo.
- The Win: Usage-based pricing (credits) or strict fair-use caps that protect your margins.
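A credit system does not need to be elaborate to protect margins. Here is a minimal sketch of what metering could look like; all names and numbers are hypothetical, and a real implementation would live in your billing layer:

```python
# Minimal sketch of credit-based metering (hypothetical names/numbers):
# each query debits credits, and once the monthly allowance is spent,
# further queries require purchased credits instead of eating margin.

CREDITS_PER_QUERY = 1
MONTHLY_ALLOWANCE = 300  # credits bundled into the base subscription

class MeteredAccount:
    def __init__(self):
        self.included_left = MONTHLY_ALLOWANCE
        self.purchased = 0  # extra credits bought at usage-based rates

    def buy_credits(self, n):
        self.purchased += n

    def charge_query(self):
        """Debit one query's worth of credits; False means cap reached."""
        if self.included_left >= CREDITS_PER_QUERY:
            self.included_left -= CREDITS_PER_QUERY
            return True
        if self.purchased >= CREDITS_PER_QUERY:
            self.purchased -= CREDITS_PER_QUERY
            return True
        return False  # cap hit: trigger an upsell, don't absorb the cost
```

The design choice that matters is the `False` branch: when the allowance runs out, the product asks for money instead of silently converting your heaviest users into your biggest losses.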
3. The Retention Test: (Is the Product a Painkiller?
This is the brutal one. If you turned this feature off tomorrow, would your customer churn?
- The Trap: “I guess I'll just do it the old way.”
- The Win: “I literally cannot do my job without this.”
Build Products That Solve Problems
In the current economy, capital is expensive. The era of growth at all costs is dead.
As product leaders, we must stop being starry-eyed about technical possibility. We need to become ruthless guardians of business viability.
Don’t build an AI wrapper just because you have the data. Build for margin, build for automation or don’t build at all. Trust me, it’s much better to kill a feature on a whiteboard than to kill it after you’ve already launched.
