Had a tough conversation with our CFO last week. We came in with a proposal to expand our AI coding assistant licenses across the entire engineering org—120 seats, roughly $180K annually. The engineering team was enthusiastic. We’d been piloting with 30 engineers for six months, and the feedback was positive. Developers liked the tools.
The CFO said no.
Not “maybe later” or “let’s revisit next quarter.” Just no. Her reasoning: “Show me the business impact first. I need to see this affecting our P&L before we scale it.”
That hit different. Because honestly? We couldn’t show her the impact. We had sentiment surveys showing developers were happy. We had anecdotal stories about faster code completion. But when she asked about delivery velocity, cycle time improvement, or revenue impact from features shipped faster… we had nothing concrete.
The reality check I wasn’t ready for
Turns out we’re not alone. I’ve been reading that only 14% of finance chiefs say they’ve seen clear, measurable impact from AI investments. Even more sobering: 95% of generative AI pilots fail to deliver tangible P&L results, according to MIT’s 2025 AI Report.
And now CFOs are deferring 25% of their AI spending into 2027. The era of “show me you’re experimenting” is over. Now it’s “show me measurable impact, this year.”
Did we overhype, or did we just skip validation?
Here’s what bothers me most: I genuinely believe AI coding tools can improve productivity. I’ve seen research claiming engineering teams achieve 39% better R&D efficiency with these tools. The technology works.
But I’m questioning our approach. Did we:
- Rush into deployment because everyone else was doing it?
- Mistake developer satisfaction for business value?
- Treat AI tools like free experiments instead of capital investments?
- Skip the instrumentation needed to measure actual impact?
Looking back, we deployed these tools the same way companies rolled out collaboration software in 2005: install and pray. We didn’t establish baseline metrics. We didn’t define success criteria. We didn’t instrument our delivery pipeline to measure before/after.
We just… turned the tools on and assumed value would materialize.
The uncomfortable question
Was early AI adoption strategic, or was it FOMO-driven?
I keep thinking about this. If our CFO had asked about ROI before we started the pilot, would we have designed it differently? Would we have:
- Measured baseline cycle times and velocity metrics first?
- Defined specific hypotheses about where AI would help?
- Created control groups to isolate the AI impact?
- Connected engineering metrics to customer value and revenue?
Probably yes. But we didn’t do any of that. We got caught up in the narrative that “AI is the future” and “we need to move fast or get left behind.”
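For what it’s worth, that first question never needed fancy tooling. Here’s a minimal sketch of what baseline measurement could have looked like, assuming GitHub-hosted repos and nothing beyond the standard REST API. The org/repo names and token handling are placeholders, not our setup, and “cycle time” is defined here as PR open to merge:

```python
# Baseline PR cycle time from the GitHub REST API.
# Assumptions: a personal access token in GITHUB_TOKEN, and "cycle time"
# defined as created_at -> merged_at. Placeholder org/repo names below.

import os
import statistics
from datetime import datetime

import requests


def merged_pr_cycle_times_hours(owner: str, repo: str, pages: int = 5) -> list[float]:
    """Return hours-to-merge for recently closed PRs that were actually merged."""
    headers = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"}
    hours = []
    for page in range(1, pages + 1):
        resp = requests.get(
            f"https://api.github.com/repos/{owner}/{repo}/pulls",
            headers=headers,
            params={"state": "closed", "per_page": 100, "page": page},
            timeout=30,
        )
        resp.raise_for_status()
        for pr in resp.json():
            if pr.get("merged_at"):  # skip closed-but-unmerged PRs
                opened = datetime.fromisoformat(pr["created_at"].replace("Z", "+00:00"))
                merged = datetime.fromisoformat(pr["merged_at"].replace("Z", "+00:00"))
                hours.append((merged - opened).total_seconds() / 3600)
    return hours


if __name__ == "__main__":
    times = merged_pr_cycle_times_hours("your-org", "your-repo")
    print(f"PRs sampled: {len(times)}")
    print(f"Median hours to merge: {statistics.median(times):.1f}")
```

Run that for a quarter before the pilot and a quarter after, and at least the “faster delivery” claim becomes an argument about numbers instead of vibes.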
And now we’re stuck. We have 30 engineers using tools they like, but we can’t prove those tools justify $180K—or even the $45K we’re currently spending.
What would actually prove ROI to a CFO?
I’m genuinely asking: What metrics would convince a skeptical CFO that AI tools are worth the investment?
The engineering metrics I care about—PR velocity, code quality, developer satisfaction—don’t translate to finance language. She needs to see:
- Revenue enabled by faster feature delivery?
- Costs avoided through efficiency gains? (There’s a sketch of this math after the list.)
- Customer retention improved by better product quality?
- Margin expansion from doing more with same headcount?
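Costs avoided is the translation I understand best, so here’s the back-of-envelope version. The $1,500/seat comes from the $180K / 120-seat proposal above; the hours saved and loaded rate are made-up placeholders to show the shape of the math, not our actuals:

```python
# Back-of-envelope cost avoidance for AI coding assistant licenses.
# Inputs marked as hypothetical are illustrative, not measured values.

def license_roi(
    seats: int,
    cost_per_seat_per_year: float,
    hours_saved_per_dev_per_week: float,
    loaded_hourly_rate: float,
    working_weeks_per_year: int = 46,
) -> dict:
    annual_cost = seats * cost_per_seat_per_year
    annual_hours_saved = seats * hours_saved_per_dev_per_week * working_weeks_per_year
    cost_avoided = annual_hours_saved * loaded_hourly_rate
    return {
        "annual_cost": annual_cost,
        "cost_avoided": cost_avoided,
        "net": cost_avoided - annual_cost,
        "roi_multiple": cost_avoided / annual_cost,
    }


# 120 seats at $1,500/seat matches the $180K proposal; the rest is hypothetical.
print(license_roi(
    seats=120,
    cost_per_seat_per_year=1_500,
    hours_saved_per_dev_per_week=1.0,  # the contested input
    loaded_hourly_rate=100.0,          # hypothetical loaded rate
))
```

The function is trivial; the entire fight is over hours_saved_per_dev_per_week, a number we can’t currently defend.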
But connecting AI coding assistants to those outcomes requires instrumentation and attribution we don’t have. And building that measurement infrastructure might cost more than the tools themselves.
So where does that leave us? Do we:
1. Shut down the pilot and admit we can’t justify it?
2. Invest in measurement infrastructure before scaling?
3. Accept that some innovations can’t be measured in traditional ROI terms?
4. Find better proxies that connect engineering gains to business impact?
Right now, I’m leaning toward option 2. But I’m curious: How are other engineering leaders handling CFO scrutiny on AI investments? What validation approach actually works?
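If we do go down the measurement-first path, the one discipline I’d enforce is writing the hypothesis down before turning anything on. A sketch of the artifact we never created before our pilot; the field names and example values are mine, invented for illustration:

```python
# A hypothesis record for an AI tooling experiment: the artifact we
# never wrote before the pilot. All field values below are examples.

from dataclasses import dataclass


@dataclass(frozen=True)
class ToolingHypothesis:
    claim: str                # what we expect the tool to change
    metric: str               # how we'll measure it
    baseline: float           # pre-pilot value, measured, not guessed
    target: float             # what "success" means, agreed with finance
    window_weeks: int         # how long before we judge it
    finance_translation: str  # the P&L language this maps to


EXAMPLE = ToolingHypothesis(
    claim="AI assistant reduces time from PR open to merge",
    metric="median PR cycle time (hours)",
    baseline=38.0,            # example value, not our real data
    target=30.0,
    window_weeks=12,
    finance_translation="capacity freed, i.e. cost avoided per quarter",
)
```

Cheap to write, and it forces the before/after conversation to happen up front instead of in the CFO’s office.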
Because if 95% of AI pilots are failing, we need better playbooks. The “move fast and figure it out later” approach clearly isn’t working when finance is demanding proof.