
The Invisible Author Problem: Git Blame When AI Writes Most of Your Code

Tian Pan · Software Engineer · 8 min read

When something breaks in production, the first thing engineers reach for is git blame. The commit hash links to a PR. The PR links to an author. The author links to context — a Slack thread, a design doc, a brain that remembers why the code was written that way. This chain is how teams debug incidents, conduct security audits, and accumulate institutional knowledge. It assumes that every line of code has a human author who understood what they were doing.

AI has quietly broken that assumption. Roughly 46% of code is now AI-generated, with Java shops pushing that figure past 60%. Most of that code carries no meaningful provenance metadata. The git blame chain still runs — it just now terminates at a developer who accepted a suggestion they may not have fully understood, with no record of the prompt, the model version, or the alternatives the AI rejected.

This is the invisible author problem: the scaffolding for accountability in software engineering was designed for human authors, and it's failing at scale.

The Accountability Chain Is Now a Dead End

Traditional git blame works because the named author possesses a mental model of the change. They remember the trade-off. They can explain why this approach was chosen over the alternative. They can answer the question that matters in a production incident: "what did you think this code was doing?"

When AI writes the code, that mental model is absent. The developer who committed it reviewed it — possibly cursorily, under time pressure — but didn't write it. Surveys find that 66% of developers report spending more time fixing AI-generated code than they saved writing it. A large study of enterprise codebases found that AI-generated code carries 2.5× more high-severity vulnerabilities than human-written code, and leaks secrets at double the baseline rate.

When those vulnerabilities surface, accountability is murky: 53% of engineers blame security teams, 45% blame the developer who accepted the suggestion, 42% blame whoever approved the PR. Everyone points elsewhere because no one has clear ownership.

A concrete example of how governance breaks down: one major IDE vendor shipped a version that automatically added Co-authored-by: attribution to commits even when AI assistance was disabled. The rollback came in the next release, but not before the episode showed just how ad hoc the tooling around AI authorship annotation still is. The field is running on convention and goodwill rather than protocol.

The Knowledge Drain Nobody Is Talking About

Accountability is only part of the problem. The deeper issue is knowledge erosion.

Engineers traditionally acquired system expertise by reading code — navigating legacy layers, puzzling out why an abstraction exists, understanding the shape of a previous engineer's thinking. This slow accumulation is how teams build the institutional knowledge that makes incident response fast. When you've read the auth middleware three times, you know where to look when tokens start failing.

AI-generated code short-circuits that loop. The codebase grows. The test suite passes. The feature ships. But no human holds a mental model of the system being built. Forty-three percent of AI-generated code changes need debugging in production. When those incidents occur, the engineers responsible are increasingly diagnosing failures in code that no one designed — code that emerged from a sequence of prompts and acceptances, with no human who thought through the edge cases.

This is a form of technical debt that doesn't appear in your linter or your type checker. It only surfaces at 2am when a senior engineer who "would have known" isn't available, and the team is staring at a stack trace in a module no one remembers writing.

The Review Bottleneck Is Getting Worse, Not Better

The natural response is to invest more heavily in code review. But the data suggests this is not going well.

AI accelerates code writing by approximately 55%, but PR review time has increased by 91% in AI-heavy teams. Senior engineers spend an average of 4.3 minutes reviewing AI-generated suggestions versus 1.2 minutes for human-written code. Agentic AI PRs — where the AI made a series of changes autonomously — have pickup times 5.3× longer than unassisted PRs. Engineers actively procrastinate on reviewing AI-generated code.

The cognitive dynamic is worth understanding. Reviewers trained to spot bugs in human code are looking for familiar failure signatures: the missing null check, the off-by-one, the edge case in the loop. AI code looks syntactically clean. It passes automated checks. It fits expected patterns. This creates template blindness: reviewers skim over surfaces that look correct without interrogating the logic underneath. AI doesn't make rookie mistakes, so reviewers stop looking for rookie mistakes — and then miss the non-obvious ones.

The result is that speed gains in generation evaporate in review, incidents per pull request have increased 23.5%, and senior engineers' time is being consumed by reviews they find unrewarding. This is not a sustainable allocation.

What Provenance Infrastructure Should Look Like

The industry is beginning to build tooling, though consensus is far from settled.

Several competing approaches have emerged in the last eighteen months:

Commit-level annotation tools like git-ai use Git Notes to attach attribution metadata to commits without modifying commit history. They track which lines were generated versus human-written, which model produced them, and carry that information through PR review into production. The approach is lightweight — no new files, no repo configuration required — but depends on agents accurately self-reporting what they authored.
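For a sense of the mechanics, here is a minimal sketch of that kind of commit-level annotation using plain Git Notes; the notes ref name and JSON fields are illustrative assumptions, not git-ai's actual schema.

```python
import json
import subprocess

def annotate_head_with_attribution(model: str, ai_lines: list[int], human_lines: list[int]) -> None:
    """Attach a provenance note to the current HEAD commit.

    Git Notes live outside the commit objects, so no history is rewritten;
    the note travels with the commit when the notes ref is pushed and fetched.
    """
    payload = json.dumps({
        "model": model,                      # model/version that produced the suggestion
        "ai_authored_lines": ai_lines,       # as self-reported by the agent
        "human_authored_lines": human_lines,
    })
    # A dedicated --ref keeps provenance notes out of the default notes namespace.
    subprocess.run(
        ["git", "notes", "--ref=ai-provenance", "add", "-f", "-m", payload, "HEAD"],
        check=True,
    )

# Reading it back during review or an incident:
#   git notes --ref=ai-provenance show <commit>
# Sharing it with the team means pushing the notes ref explicitly:
#   git push origin refs/notes/ai-provenance
```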

Sidecar provenance files take a different approach. Instead of embedding metadata in Git infrastructure, they live as human-readable markdown files in a .provenance/ folder, one file per conceptual change. Each file contains the chat history, the rejected alternatives, the rationale for the approach taken. The goal is not just accountability but interpretability — future engineers can reconstruct the author's (or agent's) reasoning.
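A minimal sketch of producing one such record, assuming a hypothetical layout (the .provenance/ folder comes from the description above; the section headings and function name are invented for illustration):

```python
from datetime import date
from pathlib import Path

def write_provenance_record(change_slug: str, rationale: str,
                            rejected_alternatives: list[str], chat_excerpt: str) -> Path:
    """Write one human-readable provenance file per conceptual change."""
    body = "\n".join([
        f"# {change_slug}",
        f"Date: {date.today().isoformat()}",
        "",
        "## Rationale",
        rationale,
        "",
        "## Rejected alternatives",
        *[f"- {alt}" for alt in rejected_alternatives],
        "",
        "## Conversation excerpt",
        chat_excerpt,
        "",
    ])
    path = Path(".provenance") / f"{change_slug}.md"
    path.parent.mkdir(exist_ok=True)   # create .provenance/ on first use
    path.write_text(body)
    return path
```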

A more ambitious standard, backed by a consortium that includes Cursor, Cloudflare, Vercel, and Google, introduces JSON-based trace records that connect code ranges to their originating conversations, classifying contributions as human, AI, mixed, or unknown at line granularity.
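That spec's actual field names aren't reproduced here, but a trace record in that spirit might look roughly like this hypothetical sketch, tying a line range to its originating conversation:

```python
import json

# Hypothetical trace record; field names are illustrative, not the standard's schema.
trace_record = {
    "file": "src/auth/token_refresh.py",
    "lines": {"start": 42, "end": 71},
    "classification": "ai",              # one of "human", "ai", "mixed", "unknown"
    "model": "example-model-2025-01",    # placeholder model identifier
    "conversation_id": "conv-7f3a21",    # pointer back to the originating chat
    "accepted_by": "reviewer-username",
}

print(json.dumps(trace_record, indent=2))
```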

The critical problem: these approaches are not interoperable. Teams that adopt one cannot read the output of another without translation. Until there's a standard, provenance metadata is siloed per team, per tool, and per repo — and cross-repository queries (which matter for supply chain security) remain impossible.

Team Practices That Work Now

While the tooling layer matures, engineering teams have developed operational patterns worth adopting immediately:

Bot sponsorship: any PR authored by an AI agent requires a named human who takes explicit responsibility for it — not just "merged by" but an annotated acknowledgment that they have read and understood the change. This doesn't solve the knowledge problem, but it closes the accountability gap.

Dual-path review: separate the machine-checkable from the human-judgment layer. AI handles first-pass verification — secrets detection, linting, basic vulnerability scanning. Humans focus review cycles on logic, architecture, and security design. This reduces the cognitive load on reviewers without reducing the quality of human scrutiny on decisions that matter.

Prompt journals: engineers who use AI heavily keep a record of the prompts and conversation context behind significant changes. Some teams are building this into their PR templates — a required field for any PR where AI wrote more than a threshold percentage of the diff. The overhead is low; the benefit in incident response is significant.

Risk-based tiering: not all AI-generated code carries the same accountability burden. Lock file updates and scaffolding code require minimal provenance. Authentication changes and data pipeline logic require full human review with documented reasoning. Codifying this tiering explicitly — rather than leaving it to reviewers' discretion — reduces both review fatigue and accountability gaps where it matters.
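As one concrete way to codify that tiering, here is a minimal sketch of a policy expressed as data that a CI job could enforce; the tier names, path patterns, and requirements are illustrative assumptions, not a standard:

```python
from fnmatch import fnmatch

# Map path patterns to the provenance and review requirements they trigger.
# Keeping the policy in the repo means a CI job can enforce it mechanically,
# instead of leaving tier decisions to each reviewer's discretion.
RISK_TIERS = {
    "low": {
        "patterns": ["*.lock", "package-lock.json", "scaffolding/*"],
        "requires": [],   # automated checks only
    },
    "high": {
        "patterns": ["auth/*", "pipelines/*", "*/payments/*"],
        "requires": ["human_review", "documented_reasoning", "provenance_record"],
    },
}

def tier_for(path: str) -> str:
    """Return the strictest tier whose patterns match the changed file."""
    for tier in ("high", "low"):
        if any(fnmatch(path, pattern) for pattern in RISK_TIERS[tier]["patterns"]):
            return tier
    return "default"   # anything unclassified falls back to normal review

# Example: a CI step walks the diff and fails the build if a "high" tier file
# was touched but the PR lacks the required provenance annotations.
print(tier_for("auth/token_refresh.py"))   # -> "high"
print(tier_for("package-lock.json"))       # -> "low"
```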

The Deeper Reckoning

The invisible author problem is not primarily a tooling problem. Tooling is part of the answer, but the core issue is that the software engineering profession is absorbing a structural change faster than its accountability norms can adapt.

For most of software history, the author of a piece of code also understood it. That's no longer a safe assumption, which means the practices built on top of it — blame-driven debugging, knowledge transfer through code review, ownership models for security incidents — need to be explicitly revisited rather than quietly eroded.

Teams that treat AI as a faster typist will eventually face an incident where the speed gains were borrowed from the future. The institutional knowledge that git blame represents isn't just metadata — it's the accumulated understanding of why the system works. When AI writes most of the commits, that understanding has to be reconstructed deliberately, through provenance records and sponsorship norms and prompt journals, or it doesn't exist at all.

The question isn't whether to use AI to write code. At this point, that ship has sailed. The question is whether you're building the accountability infrastructure alongside the velocity, or just running up a tab you'll settle in production.
