Testing the GAINS Framework to Measure AI Maturity - Early Results Are Surprising

Three months ago our CFO dropped a challenge: "Show me that our AI investments are working, or I am cutting the budget."

Our existing metrics were not cutting it. We could show adoption rates, but we could not connect AI usage to business outcomes.

So we found GAINS: Generative AI Impact Net Score.

What Is GAINS

GAINS measures AI maturity across organizations. It attempts to measure AI adoption, identify organizational friction, and connect usage to outcomes.

The framework assigns a score from 0 to 100 representing organizational AI maturity, not just tool adoption.
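
For intuition only, here is a toy sketch of how dimensions like adoption, friction, and outcome linkage could roll up into a 0-100 score. The weights and scaling below are made-up illustrations, not the official GAINS formula.

```python
# Illustrative only: the dimensions, weights, and scaling below are assumptions
# for intuition, not the actual GAINS scoring method.

def gains_style_score(adoption: float, friction: float, outcome_link: float) -> float:
    """Combine three 0-1 dimensions into a 0-100 maturity-style score.

    adoption     -- share of the team actively using AI tools (0-1)
    friction     -- share of work blocked by process/culture issues (0-1, lower is better)
    outcome_link -- how strongly usage maps to measured business outcomes (0-1)
    """
    weights = {"adoption": 0.3, "friction": 0.3, "outcomes": 0.4}  # assumed weights
    score = (
        weights["adoption"] * adoption
        + weights["friction"] * (1.0 - friction)
        + weights["outcomes"] * outcome_link
    )
    return round(100 * score, 1)

# Example: 78% adoption, moderate friction, partial outcome linkage -> ~64.9
print(gains_style_score(adoption=0.78, friction=0.35, outcome_link=0.55))
```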

The 90-Day Pilot

We ran a pilot with our 40-person engineering team, tracking adoption patterns, productivity metrics, and friction points over 12 weeks.

Surprising Findings

High Usage Does Not Equal High Impact

Developers were using Copilot heavily, but velocity was not improving. Why? Code review was the bottleneck: AI-generated code sat in PR queues for 3-5 days.

The fix: we expanded review capacity and implemented async rotations. Suddenly the productivity gains flowed through.

Organizational Friction Is Invisible

The GAINS friction index revealed: 38 percent of developers were unclear on when to use AI, 31 percent said AI output conflicted with coding standards, 28 percent found the review process inadequate, and 24 percent cited a lack of training.

These were process and culture problems, not technical problems.
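
For a sense of the mechanics, here is a minimal sketch of how a multi-select weekly survey could be tallied into percentages like the ones above. The categories mirror what we asked about; the aggregation shown is a simplified illustration, not the exact GAINS procedure.

```python
# Simplified illustration: tally multi-select friction responses into the
# percent of respondents citing each category (totals can exceed 100%).
from collections import Counter

CATEGORIES = [
    "unclear when to use AI",
    "AI conflicts with coding standards",
    "review process inadequate",
    "lack of training",
]

def friction_breakdown(responses: list[list[str]]) -> dict[str, float]:
    """responses: one list of selected friction categories per respondent."""
    counts = Counter(cat for selected in responses for cat in set(selected))
    n = len(responses)
    return {cat: round(100 * counts[cat] / n, 1) for cat in CATEGORIES}

# Example with three respondents.
print(friction_breakdown([
    ["unclear when to use AI", "lack of training"],
    ["review process inadequate"],
    ["unclear when to use AI", "AI conflicts with coding standards"],
]))
```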

Value in Unexpected Places

The biggest gains: onboarding was 40 percent faster, maintenance work was 55 percent faster, and documentation was actually happening.

The Business Case

GAINS Score: 64/100 (baseline 38/100). Adoption improved from 45 percent to 78 percent. Friction decreased 60 percent.

Business Impact: Cycle time reduced 22 percent, onboarding 40 percent faster, maintenance velocity up 55 percent, developer satisfaction up 18 points.

CFO Translation: $400K ARR impact, $120K hiring efficiency, $200K incident prevention, $300K retention savings.

Total: $1.8M value from $380K investment.

CFO approved continued investment.

Challenges

Measurement overhead, attribution complexity, survey fatigue, scoring calibration.

But GAINS gave us a structure for ROI conversations that ad-hoc metrics could not provide.

Key Insight

AI productivity is an organizational capability, not just tool adoption.

GAINS forced us to look at the whole system: tools plus processes plus culture plus skills.

Question

Is anyone else experimenting with structured AI measurement frameworks? GAINS or similar maturity models? Custom frameworks?

We are all figuring this out together. The pressure from CFOs is not going away. Having a structured approach—even imperfect—is better than no approach.

Luis, this is fascinating. The organizational friction piece resonates deeply with challenges I have been observing.

Why GAINS Matters

What excites me about this framework is that it measures organizational readiness, not just tool adoption.

We have rolled out three different AI tools: GitHub Copilot (high adoption, mixed impact), Claude for documentation (low adoption, minimal impact), and AI-powered code review (medium adoption, significant impact).

Looking back, the organizational context mattered more than the tool quality.

Why Copilot had mixed impact: We gave people the tool without changing our code review process. Developers generated code faster but hit the same bottlenecks you described.

Why Claude for docs failed: Our culture did not value documentation. Giving people better tools to do something they did not want to do did not help.

Why AI code review worked: We combined the tool with process changes and training. We made code review faster AND better, which aligned with what the team valued.

The Friction Index Is Gold

Your finding that 38 percent of developers were unclear when to use AI versus writing manually is exactly what I am seeing.

Our retrospectives reveal similar themes: "I do not know which tasks are good fits for AI." "I am afraid AI suggestions will break our coding standards." "I feel like I am cheating when I use AI."

These are cultural and educational barriers, not technical ones.

Our Parallel Approach

We have not used GAINS specifically, but we have been tracking something similar: Adoption times Effectiveness equals Realized Value.

You can have 100 percent adoption, but if effectiveness is low (due to friction, process misalignment, or skill gaps), realized value is still low.

Both dimensions matter.
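
To make the framing concrete, here is a tiny sketch. The 0-1 scales and example figures are purely illustrative assumptions, not measured data.

```python
# Minimal illustration of the "Adoption x Effectiveness = Realized Value" framing.
# The 0-1 scales and example figures are assumptions, not measured data.

def realized_value(adoption: float, effectiveness: float) -> float:
    """Both inputs on a 0-1 scale; realized value is their product."""
    return adoption * effectiveness

# 100% adoption with low effectiveness still yields low realized value...
print(realized_value(adoption=1.00, effectiveness=0.30))  # 0.30
# ...while moderate adoption with high effectiveness can do better.
print(realized_value(adoption=0.70, effectiveness=0.80))  # 0.56
```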

Question About Organizational Friction

You mentioned identifying friction through weekly surveys. Can you share more about how you measured friction specifically?

What were the questions? How did you quantify friction? How did you connect friction points to specific interventions?

This is the piece I struggle with most. I can see friction anecdotally in retros and 1-on-1s, but I have not figured out how to measure it systematically.

The Broader Point

I think frameworks like GAINS are important not because they give us perfect measurement but because they force us to think systemically.

The question is not "Are developers using AI tools?" The question is "Is our organization capable of extracting value from AI tools?"

That encompasses tool selection, training and enablement, process adaptation, cultural readiness, and measurement and iteration.

Next Step I Am Considering

Based on your post, I am thinking about running a similar pilot with one team.

Start with a lightweight 30-day version: baseline measurement, a weekly friction survey (3 questions), process interventions based on the friction findings, and a re-measure after 30 days.

See if we can quickly identify and remove friction points blocking AI value.
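
To make the weekly survey piece concrete, here is a rough sketch of how I might score three 1-to-5 friction questions and compare against the day-0 baseline. The question wording, thresholds, and numbers are placeholders, not a validated instrument.

```python
# One possible scoring for the lightweight 30-day version: three 1-5 friction
# questions, averaged per week, compared against the day-0 baseline.
# Question wording and figures are placeholders, not a prescribed survey.

QUESTIONS = [
    "I know which tasks are a good fit for AI (1=disagree, 5=agree)",
    "AI output fits our coding standards without rework (1-5)",
    "AI-assisted changes move through review quickly (1-5)",
]

def weekly_friction(scores: list[list[int]]) -> float:
    """scores: one [q1, q2, q3] rating per respondent; higher = less friction."""
    flat = [s for respondent in scores for s in respondent]
    return round(sum(flat) / len(flat), 2)

baseline = weekly_friction([[2, 3, 2], [3, 2, 2], [2, 2, 3]])
week_4 = weekly_friction([[4, 3, 4], [3, 4, 3], [4, 4, 3]])
print(f"baseline={baseline}, week 4={week_4}, delta={round(week_4 - baseline, 2)}")
```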

Thanks for sharing this, Luis. It has given me a concrete framework to experiment with.

Luis, this is exactly the kind of structured thinking we need in the AI ROI conversation.

Why This Matters at the Executive Level

From a CTO perspective, what you are describing solves a critical problem: giving engineering leaders a standardized language for board-level AI discussions.

Right now every engineering leader is making up their own metrics. Some track adoption, some track time saved, some track business outcomes. There is no common framework.

This creates problems: it is hard to compare AI maturity across teams or companies, hard to know whether we are behind or ahead of the market, and hard to set realistic targets for AI investment returns.

GAINS (or frameworks like it) could become the DORA metrics for AI maturity. That standardization would be hugely valuable.

Scaling Question

My immediate question: Does this scale?

You ran this with a 40-person team. We have 120-plus engineers across four product teams. Running a 90-day structured pilot with weekly surveys across all of them would be significant overhead.

Have you thought about how to scale GAINS measurement? Is there a lite version for larger organizations? Can it be partially automated?

The Process Bottleneck Insight

Your finding about code review being the bottleneck is exactly what I have been preaching internally.

Technology enables potential. Process enables realization.

I see this pattern everywhere: AI generates code faster, but manual testing slows deployment. AI catches bugs earlier, but the incident response process does not leverage AI insights. AI documents code, but no documentation review process exists.

We have been solving these one by one, but there is no systematic approach. GAINS seems to provide that systematic lens.

Measurement Burden vs Insight Value

My concern (echoing Maya's point from the other thread): what is the overhead cost of running GAINS?

You mentioned daily adoption tracking, weekly friction surveys, biweekly outcome measurement, and a 90-day time commitment.

For a 40-person team, what were the total person-hours invested in measurement? How much time did engineers spend on surveys and data collection?

I need to understand if the juice is worth the squeeze before rolling this out broadly.

The Executive Dashboard

If you were to build an executive dashboard for board-level AI reporting using GAINS, what would you include?

I am thinking: GAINS score over time (are we improving?), adoption versus effectiveness gap (where are we blocked?), top friction points (what interventions are needed?), and business impact metrics (what value are we creating?).

Is that roughly the structure you would recommend?
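
To ground that, here is a rough sketch of the data shape I imagine behind such a dashboard. The field names and figures are illustrative assumptions on my part, not a reporting standard.

```python
# Rough sketch of a per-quarter dashboard snapshot. Field names and figures
# are illustrative assumptions, not a defined GAINS reporting schema.
from dataclasses import dataclass, field

@dataclass
class AIDashboardSnapshot:
    quarter: str
    gains_score: int                      # GAINS score over time (are we improving?)
    adoption_pct: float                   # adoption vs. effectiveness gap
    effectiveness_pct: float
    top_friction_points: list[str] = field(default_factory=list)   # interventions needed
    business_impact: dict[str, str] = field(default_factory=dict)  # value created

snapshot = AIDashboardSnapshot(
    quarter="Q3",
    gains_score=64,
    adoption_pct=78.0,
    effectiveness_pct=55.0,
    top_friction_points=["review capacity", "unclear AI usage guidance"],
    business_impact={"cycle_time": "-22%", "onboarding": "-40% ramp time"},
)
print(snapshot)
```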

The Standardization Opportunity

Here is what excites me most: if GAINS (or something like it) becomes a standard framework, CTOs can suddenly benchmark AI maturity against the industry, set targets based on best-in-class examples, measure progress objectively, and justify investments with standard metrics.

That is the DORA metrics playbook. It worked for DevOps transformation. It could work for AI transformation.

Question on Instrumentation

How much of GAINS measurement can be automated versus requires manual surveys?

Ideally: adoption metrics automated (tool usage logs), outcome metrics automated (cycle time, defect rates), friction identification partly automated but still requiring some surveys, and process intervention staying manual.

Is that roughly accurate? The more we can automate, the less overhead burden we put on teams.
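
For the automated adoption slice, something like this is what I have in mind: derive weekly active usage from tool logs. The log format here is hypothetical; real vendor exports will differ.

```python
# Sketch of the "adoption metrics automated" slice: weekly active usage from
# tool logs. The log format is hypothetical; real exports will differ.
from datetime import date, timedelta

def weekly_adoption(usage_log: list[dict], team: set[str], week_start: date) -> float:
    """usage_log entries: {'user': str, 'date': date}; returns % of team active that week."""
    week_end = week_start + timedelta(days=7)
    active = {
        entry["user"]
        for entry in usage_log
        if week_start <= entry["date"] < week_end and entry["user"] in team
    }
    return round(100 * len(active) / len(team), 1)

log = [
    {"user": "ana", "date": date(2024, 6, 3)},
    {"user": "raj", "date": date(2024, 6, 5)},
    {"user": "ana", "date": date(2024, 6, 6)},
]
print(weekly_adoption(log, team={"ana", "raj", "sam", "lee"}, week_start=date(2024, 6, 3)))  # 50.0
```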

The Bottom Line

This could be the standardized measurement framework CTOs need for AI ROI conversations.

If you are willing to share your implementation playbook (what you measured, how you measured it, what interventions worked), I would be very interested. This could be valuable for the broader community.

Luis, have you thought about writing this up more formally? A detailed case study of a GAINS implementation with before-and-after metrics would be incredibly valuable. I would share it with my CTO network in a heartbeat.

Luis, this is great. I am particularly interested in whether GAINS could extend beyond engineering to other functions.

The Cross-Functional Opportunity

Right now AI measurement is fragmented across our organization: Engineering tracks code generation, Product tracks experiment velocity, Marketing tracks content generation, Sales tracks email effectiveness, and Support tracks chatbot resolution rates.

Every function measures differently. We have no common framework.

What if we applied GAINS principles company-wide? Measure AI adoption, organizational friction, and business impact across all functions.

This would give us a company-wide AI maturity score that our board could track quarter over quarter.

Portfolio Management Approach

From a product perspective, standardized AI measurement enables portfolio management.

Right now we are making AI investment decisions function by function without visibility into relative ROI.

Should we buy engineering AI tools or marketing AI tools? Should we invest in adoption or in new capabilities? Where do we have high adoption but low impact (friction to fix)? Where do we have low adoption but high impact potential (training needed)?

Common measurement framework equals better capital allocation.

The Challenge of Non-Engineering Functions

Engineering has relatively clear metrics: cycle time, defect rates, deployment frequency.

Marketing and sales metrics are fuzzier. How do you measure AI-assisted creativity in marketing? How do you attribute revenue impact to AI-assisted sales emails versus other factors?

Have you thought about how GAINS adapts to functions with less quantifiable work?

The Business Case for Standardization

If we could show our CFO a single dashboard with a company-wide AI maturity score of 67/100, broken down by function: Engineering 72/100, Product 64/100, Marketing 58/100, Sales 51/100.

And then show friction points by function, ROI by function, and adoption gaps.

That would be a strategic-level view of AI that does not exist today.
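
As a sketch of the roll-up, I imagine something like a headcount-weighted average of function scores. The weighting choice, headcounts, and resulting number are my own assumptions, not part of GAINS.

```python
# Sketch of rolling function-level scores up into one company-wide number,
# weighted by headcount. Weighting choice and figures are assumptions.

def company_score(functions: dict[str, tuple[int, int]]) -> float:
    """functions: name -> (gains_score, headcount); returns headcount-weighted score."""
    total_people = sum(headcount for _, headcount in functions.values())
    weighted = sum(score * headcount for score, headcount in functions.values())
    return round(weighted / total_people, 1)

print(company_score({
    "Engineering": (72, 120),
    "Product":     (64, 30),
    "Marketing":   (58, 25),
    "Sales":       (51, 40),
}))
```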

Question on Adaptation

Luis, have you thought about adapting GAINS for product or other non-engineering functions?

What would need to change? Would the core framework (adoption plus friction plus outcomes) still apply?

I could see this being really powerful if we could get a consistent measurement approach across the organization.

The Strategic Insight

The best frameworks are the ones that enable better decision-making, not just better reporting.

GAINS seems like it could help us answer strategic questions: Where should we invest next? What friction should we prioritize fixing? Which teams are ready for advanced AI capabilities versus need foundational support?

That is portfolio management thinking applied to AI transformation.

Really appreciate you sharing this, Luis. I am going to explore whether we can pilot something similar in product management.