Code Ownership Decay: What Happens to Team Knowledge When AI Writes Most Commits
When a bug surfaces in production, the first ritual is the same: open git blame, find who wrote the line, ask them why. That ritual assumes the author had a reason — a constraint they knew, an edge case they handled deliberately, a business rule they'd internalized from three quarters of postmortems. For most of software history, git blame answered a question about intent.
Now, for a growing share of commits, git blame points to a human who merged the code and an AI that generated it. The human may have spent 90 seconds reading the diff. The AI had no context beyond the prompt. The "why" — the institutional knowledge that made git blame useful — was never written down anywhere.
This is code ownership decay. It doesn't announce itself. No single commit breaks the system. Instead, understanding slowly hollows out until the team reaches a decision point — a refactor, an incident, a new hire ramping up — and discovers that nobody can explain the system from the inside anymore.
What Git Blame Actually Measured
Git blame was always a proxy. The real question wasn't "who wrote this?" — it was "who knows why this exists?" Those two questions had a reliable correlation when the person who wrote a line was also the person who reasoned about it, debated alternatives, and remembered the context.
That correlation is breaking. Refactored lines of code dropped from 24% of all changes in 2020 to under 10% by 2024. Code churn (lines revised within two weeks of being written) nearly doubled over the same period. The pattern suggests developers are iterating with AI in rapid cycles of generating, discarding, and generating again, without building the understanding that would allow deliberate improvement later.
When you run git blame on AI-heavy code, you see the merge author. What you don't see is that the merge author may have reviewed 800 lines of generated output in 15 minutes, approved it because the tests passed, and retained essentially no model of what the code actually does. The attribution is accurate; the implied knowledge transfer never happened.
The tools trying to fix this — systems that track which model, which agent, which session produced a given line — address the attribution gap but not the understanding gap. Knowing that Claude 3.7 generated line 847 during a 45-minute agentic session tells you nothing about the invariant the code was meant to preserve or the alternative approaches that were considered and rejected.
PR Review as a Social Fiction
The pull request review process depends on a social contract: the reviewer reads the code, understands it well enough to evaluate it, and either approves or requests changes. That contract breaks in a specific way when AI generates the code.
Developers know when code came from an AI. Reviewers know it too. And in that shared knowledge, a new dynamic emerges: the reviewer doesn't ask clarifying questions, not because the code is obviously correct, but because asking would admit that they can't follow it — which implies the code is too complex, which implies the developer generated something they didn't understand, which nobody wants to surface. Both parties have an incentive to treat review as a formality.
The defect data makes this visible. AI-coauthored pull requests contain roughly 1.7 times more issues than human-written code. Critical issues in AI-involved PRs rose 40% compared to baselines. Review time increased 91% — not because reviewers were reading more carefully, but because unfamiliar patterns take longer to scan even when you're not deeply evaluating them. Reviewers spend more time and understand less.
This creates a specific failure mode: bugs that would have been caught by someone who understood the code reach production and cause an incident, and when you run the postmortem you discover that nobody in the review chain can explain what the code was supposed to do. The review happened. The understanding didn't.
Institutional Knowledge Doesn't Survive the Prompt
Every team carries knowledge that isn't in the codebase. The concurrency bug that happened two years ago and the fix that looks wrong but solves a race condition only reproducible at 3 AM. The database index that seems redundant but exists because a reporting query was killing prod every Sunday. The API client that retries three times instead of five because the upstream vendor rate-limits aggressively and once throttled them into a 45-minute outage.
This knowledge lives in the heads of engineers who were there. It surfaces in code reviews as inexplicable opinions ("we don't do it that way here"), in incident postmortems, in architecture discussions where someone says "we tried this before." It's what separates a team that maintains a system from a team that has code.
AI assistants don't carry this knowledge. They carry training data. When an engineer prompts for a solution, the model proposes code that is syntactically correct, probably handles the common case, and is entirely ignorant of the team's specific failure history. The engineer who has been there five years would have spotted the problem immediately. But if that engineer is reviewing 200 AI-generated PRs a sprint instead of writing code themselves, they're skimming, not reading — and the institutional knowledge never gets applied.
Anthropic's own research found that participants using AI assistance completed tasks in the same time as a control group but scored 17% lower on comprehension assessments afterward. The largest declines were in debugging — which is exactly the skill you need when something goes wrong at 2 AM and you're tracing through code nobody fully understood when it was written.
Senior engineers who define standards, catch architectural drift, and teach through code review become the sole distribution mechanism for institutional knowledge in AI-heavy teams. They were already a bottleneck. Now they're the only bottleneck that matters.
The Accumulation Problem
Individual AI-generated commits aren't the issue. The issue is accumulation over 12 months across a team of eight.
As noted above, refactoring fell from roughly a quarter of all code changes to under 10% between 2020 and 2024. Code duplication quadrupled over the same window. Lines of code increase; comprehension per engineer decreases. What you get is a codebase that grows faster than anyone can understand it, with increasing concentrations of code that nobody has deeply read.
This creates a specific risk for new hire onboarding. Traditionally, reading the git history of a system is one of the best ways to understand how it evolved — what problems were solved, what tradeoffs were made, how the team thinks. An AI-heavy git history doesn't carry that signal. It shows a stream of generation events punctuated by test passes. The history exists; the reasoning embedded in it doesn't.
The same risk surfaces in incidents. Effective root cause analysis depends on someone understanding the code well enough to reason about what it was supposed to do and where it deviated. When "nobody wrote this in the sense of reasoning through it" is the honest answer to "who understands this component," incident resolution becomes exploratory rather than diagnostic.
What Actually Works
Three practices are emerging in teams that are managing this problem rather than discovering it after an outage.
Architecture Decision Records as AI governance context. ADRs were already useful before AI coding assistants. They're becoming essential now. When your ADRs live in the repository, they're not just documentation — they're governance context that can be fed directly into AI coding tools. This creates a feedback loop: engineers write down the "why" of architectural decisions, AI tools use those decisions to generate code that respects architectural intent, and the gap between generated code and team standards narrows. The teams that have invested in ADRs before AI adoption are seeing their value compound; the teams that skipped them are watching AI generate technically correct code that violates constraints nobody wrote down.
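As a minimal sketch of what "feeding ADRs into the tools" can look like in practice: a small script that gathers the repository's accepted ADRs into a single context file an AI coding assistant can be pointed at. The docs/adr/ location, the ai-context.md output name, and the "Status:" line convention are assumptions for illustration, not a standard.

```python
#!/usr/bin/env python3
"""Collect accepted ADRs into one context file for an AI coding tool.

Assumes ADRs live in docs/adr/*.md and declare their status on a line
beginning "Status:". Paths and filenames are illustrative conventions.
"""
from pathlib import Path

ADR_DIR = Path("docs/adr")
OUTPUT = Path("ai-context.md")


def is_accepted(text: str) -> bool:
    # Treat an ADR as binding context only if its status line says "Accepted".
    for line in text.splitlines():
        if line.strip().lower().startswith("status:"):
            return "accepted" in line.lower()
    return False


def build_context() -> str:
    sections = ["# Architecture decisions this codebase must respect\n"]
    for adr in sorted(ADR_DIR.glob("*.md")):
        text = adr.read_text(encoding="utf-8")
        if is_accepted(text):
            sections.append(f"\n---\n<!-- {adr.name} -->\n{text}")
    return "\n".join(sections)


if __name__ == "__main__":
    OUTPUT.write_text(build_context(), encoding="utf-8")
    print(f"Wrote {OUTPUT} for use as AI tool context")
```

Running something like this in CI or a pre-commit step keeps the context file current as decisions are added or superseded, so generated code is checked against the "why" the team has already written down.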
Deliberate review rituals separate from approval rituals. Some teams are separating the two functions that code review had combined: verification (does this meet the bar to merge?) and understanding (do I know what this does?). AI-generated code can pass verification quickly; understanding takes time that review timelines don't protect. Teams that run explicit weekly sessions where engineers read AI-generated code with the goal of building understanding rather than making a merge decision are maintaining comprehension; teams that only do approval-gate review are not.
Attribution that records intent, not just authorship. The Co-Authored-By convention records that an AI contributed. What it doesn't record is the prompt, the architectural decision being implemented, or the alternatives considered. Teams are experimenting with expanded commit messages and PR descriptions that preserve this context, not as bureaucracy but as the institutional knowledge artifact that makes the commit useful to read six months later. This is low-friction per commit and high-leverage at the system level: a small investment per change that pays off when something breaks and you need to understand it.
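A minimal sketch of how a team might enforce such a convention, assuming trailer names of its own invention (Prompt-Summary and Decision-Ref are hypothetical; Co-Authored-By is the existing convention): a commit-msg hook that refuses AI-credited commits missing the intent trailers.

```python
#!/usr/bin/env python3
"""Git commit-msg hook: if a commit credits an AI co-author, require
intent trailers alongside it. Trailer names are a team convention sketch,
not a standard. Install as .git/hooks/commit-msg and mark it executable."""
import re
import sys

# Trailers expected on AI-assisted commits (hypothetical convention).
REQUIRED_TRAILERS = ["Prompt-Summary:", "Decision-Ref:"]


def main(msg_path: str) -> int:
    with open(msg_path, encoding="utf-8") as f:
        message = f.read()

    # Only enforce on commits that declare an AI co-author.
    if not re.search(r"^Co-Authored-By:.*(claude|copilot|gpt)", message,
                     re.IGNORECASE | re.MULTILINE):
        return 0

    missing = [t for t in REQUIRED_TRAILERS if t.lower() not in message.lower()]
    if missing:
        print("AI-assisted commit is missing intent trailers: " + ", ".join(missing))
        print("Add a one-line Prompt-Summary and a Decision-Ref (ADR or ticket).")
        return 1
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.argv[1]))
```

A team would tune the co-author pattern and trailer list to its own tooling; the point is that the check runs at the one moment the prompt and the decision are still in the author's head.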
Code Ownership Isn't Authorship Anymore
Code ownership was always about accountability, not authorship. The question was never "who typed these lines?" — it was "who is responsible for knowing what this does and keeping it sound?"
That accountability is being diffused without being explicitly reassigned. When AI generates code, the developer who prompted it is responsible. When it's merged after a review, the reviewer shares responsibility. When nobody in that chain deeply understood the code, the ownership is nominal — it exists on paper and not in any engineer's head.
Teams that are navigating this well have made the accountability reassignment explicit: the engineer who uses AI to generate code owns that code in full, the same as if they'd typed every character. That means reading it carefully enough to explain it, defend it in review, and debug it when it breaks. AI is a tool that produces a first draft, not a co-author who shares the cognitive burden.
That's a harder standard than it sounds when AI is generating thousands of lines per sprint. It means slowing down generation to match understanding — which partly defeats the throughput advantage AI provides. The teams working through this are finding a calibration: AI is fastest at generating code in well-understood domains with clear specifications and strong test coverage. In those domains, understanding can be fast-tracked. In novel, critical, or complex code, the review overhead has to be budgeted honestly, not assumed to be negligible.
The teams that don't make this calibration will build fast and understand slowly, accumulating a codebase where nobody can answer the most important question git blame was always supposed to answer: why.
- https://pullflow.com/blog/the-new-git-blame/
- https://pullflow.com/blog/ai-institutional-knowledge-code-collaboration
- https://addyo.substack.com/p/code-review-in-the-age-of-ai
- https://www.oreilly.com/radar/comprehension-debt-the-hidden-cost-of-ai-generated-code/
- https://www.anthropic.com/research/AI-assistance-coding-skills
- https://arxiv.org/html/2512.05239v1
- https://arxiv.org/html/2511.02475
- https://blog.mozilla.ai/owning-code-in-the-age-of-ai/
- https://www.oreilly.com/radar/ai-is-writing-our-code-faster-than-we-can-verify-it/
- https://www.equalexperts.com/blog/our-thinking/accelerating-architectural-decision-records-adrs-with-generative-ai/
- https://stackoverflow.blog/2026/01/23/ai-can-10x-developers-in-creating-tech-debt/
