Why Your AI Roadmap Shouldn't Have a 12-Month Plan
A team I worked with last quarter spent six weeks building a "smart document classifier" — fine-tuned model, eval harness, custom UI, the whole production pipeline. It shipped on a Tuesday. The following Monday, a new general-purpose model dropped that beat their fine-tune on the same eval, zero-shot, with no infrastructure investment. Their entire Q2 OKR became a wrapper around a one-line API call. The roadmap had committed twelve months earlier to "owning the classification stack." That commitment was wrong before the ink dried.
This is not an isolated story. Industry trackers logged 255 model releases from major labs in Q1 2026 alone, with roughly three meaningful frontier launches per week through March. Costs have collapsed: API pricing is down 97% since GPT-3, and the gap between top providers has narrowed to within statistical noise on most benchmarks. When the underlying substrate changes this fast, a twelve-month feature roadmap is not a plan — it is a list of bets you cannot revisit, made with information that will be stale before you ship the second item.
The temptation is to read this and conclude "stop planning." That is the wrong lesson. The right lesson is that the unit of planning has to change. A roadmap that promises features by date assumes the cost and capability of the building blocks are stable. They are not. What you actually have is a portfolio of capability bets, each with a validation clock and a kill condition. Treating that portfolio like a feature roadmap is a category error, and it shows up as the same three failure modes in team after team.
The Three Ways Long Roadmaps Fail in AI
The first failure is the moat-free feature. You commit to building something that, by the time you ship it, the platform vendor will offer for free as a default. PDF extraction, basic summarization, simple classification, transcription, embedding-based search — every one of these was a defensible product surface in 2023 and is now a checkbox on a model provider's pricing page. If your eighteen-month roadmap has features that depend on the current capability gap, you are racing a curve that is bending against you. The longer your roadmap, the more of it sits below the rising waterline.
The second failure is the wrong-substrate bet. You commit to a technical approach — fine-tuning, a particular agent framework, a specific embedding model, a custom RAG pipeline — and six months later a new model architecture or capability obsoletes the entire stack. Teams that built elaborate retrieval pipelines in 2024 watched long-context models compress half their work into a system prompt. Teams that built complex tool-orchestration layers watched native tool-use capabilities subsume their custom routing. The bet was not on the feature; it was on the substrate, and the substrate moved.
The third failure is the frozen hypothesis. You commit to solving a user problem in a specific way, and in doing so you also lock in your assumptions about what users will tolerate, what latency they will accept, and what UX shape the feature should take. Then real usage data arrives at month four and contradicts every assumption. In a normal product, you would revise the plan. In an "approved twelve-month roadmap," you finish what you committed to, ship something in a form nobody wanted, and call it execution. More than 60% of AI roadmaps written today are functionally obsolete within nine months, and the org's response is usually to ship them anyway, because the alternative requires admitting the plan was wrong.
What Replaces the Roadmap: A Portfolio of Capability Bets
If features-by-date is the wrong unit, what is the right one? The pattern that holds up is treating AI work as a portfolio of small, time-boxed capability bets, each one structured so you can tell quickly whether it is working and kill it cleanly if it is not.
A capability bet is not "build feature X by Q3." It is a hypothesis: we believe that capability Y, applied to user segment Z, produces measurable outcome W. It has three things a roadmap item does not: a falsifiable claim, a fixed time-box (usually 4–8 weeks), and an explicit kill condition that says when you stop. The portfolio metaphor matters because no single bet has to work — the system is designed for some to fail, and the failure of any one bet is a signal, not a setback. This is closer to how research labs run than how product teams traditionally plan, and that is the point: the work has more in common with applied research than with shipping a known-good feature.
Each bet should answer four questions before it starts. What capability are we testing? (Specific: "long-context reasoning over 200k-token contracts," not "AI for legal.") What outcome would prove it works? (Quantitative: "≥80% extraction accuracy on the held-out test set, with median latency under 8s," not "users like it.") What is the budget? (Both calendar time and dollars — say, six engineering weeks plus $20k of inference credit.) What kills it? (A pre-committed threshold: "if accuracy stays below 65% after three iteration cycles, we shut it down regardless of how much we have spent.") Without the kill condition, every bet drifts into a small cult of sunk cost.
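To make the four questions concrete, here is a minimal sketch of a bet written down as data, with a review function that applies the pre-committed thresholds. The names (`CapabilityBet`, `Decision`, `review`) are hypothetical, the numbers are the illustrative ones from the paragraph above, and treating an exhausted budget as a kill is an assumption of this sketch rather than a rule. Read it as one possible encoding, not a prescribed process.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    CONTINUE = "continue"
    GRADUATE = "graduate"
    KILL = "kill"


@dataclass(frozen=True)
class CapabilityBet:
    """A hypothetical record of one bet; all field names are illustrative."""
    capability: str            # What capability are we testing?
    success_accuracy: float    # What outcome would prove it works?
    success_latency_s: float   # Median latency must be under this for success
    budget_weeks: int          # Budget: calendar time
    budget_usd: float          # Budget: dollars
    kill_accuracy: float       # What kills it: accuracy floor
    kill_after_cycles: int     # Cycles allowed before the floor is enforced


def review(bet: CapabilityBet, *, accuracy: float, median_latency_s: float,
           cycles_done: int, weeks_elapsed: int, usd_spent: float) -> Decision:
    """Apply the thresholds agreed before the bet started."""
    # Success: both the accuracy and the latency targets are met.
    if (accuracy >= bet.success_accuracy
            and median_latency_s < bet.success_latency_s):
        return Decision.GRADUATE
    # Kill condition: still below the floor after the agreed number of
    # iteration cycles, regardless of how much has been spent.
    if cycles_done >= bet.kill_after_cycles and accuracy < bet.kill_accuracy:
        return Decision.KILL
    # Assumption of this sketch: running out of time or money without
    # reaching the success criteria also ends the bet.
    if weeks_elapsed >= bet.budget_weeks or usd_spent >= bet.budget_usd:
        return Decision.KILL
    return Decision.CONTINUE


# The example bet from the text, written down before any work starts.
bet = CapabilityBet(
    capability="long-context reasoning over 200k-token contracts",
    success_accuracy=0.80,
    success_latency_s=8.0,
    budget_weeks=6,
    budget_usd=20_000,
    kill_accuracy=0.65,
    kill_after_cycles=3,
)

# A check-in after three cycles with accuracy stuck at 60%.
decision = review(bet, accuracy=0.60, median_latency_s=7.0,
                  cycles_done=3, weeks_elapsed=4, usd_spent=9_000)
print(decision.value)
```

The design choice that matters is that the thresholds live in a frozen record created before the work begins. The review then becomes a lookup against numbers nobody can quietly renegotiate, which is what keeps sunk cost out of the decision.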
