
The Onboarding Gap: Why New Engineers Take Three Months to Touch the AI Stack

9 min read
Tian Pan
Software Engineer

A backend engineer with eight years of experience joins your team. By week three on a normal codebase, they would be shipping features. On the AI surface, they are still asking questions in DMs, and you can predict which two senior engineers they are asking. Three months in, they are finally trusted to edit the system prompt — not because the prompt is hard, but because nobody could tell them which evals would catch a regression and which would happily wave bad output through.

This is not a hiring problem or a documentation problem in the usual sense. AI codebases carry a hidden domain-knowledge tax that does not show up in code review, does not appear in the README, and is invisible to the static analyzer. The tax is paid in onboarding time, in repeated questions to the same two people, and eventually in a team that quietly bifurcates into "the people who can touch it" and "everyone else."

The cost is not theoretical. Senior engineers in this position end up spending several hours a week answering the same set of questions: why is retrieval filtered this way, what does this trace dashboard mean, which eval is load-bearing for which behavior. Their queue of actual work backs up. New hires lose momentum. And when one of those two senior engineers takes a vacation, the prompt does not get edited that week.

The Hidden Curriculum of an AI Codebase

A traditional service codebase teaches itself if you read it carefully. Function names, type signatures, and tests describe what the code does and how to extend it. New engineers can rely on the code being the spec, and on the test suite being the contract. They open a PR, the CI pipeline flags what they broke, they fix it, they ship.

An AI codebase does not work this way. The system prompt is a paragraph of natural language whose semantics are not obvious from reading it. The retrieval layer has filters whose justification lives in someone's head. The eval suite contains assertions that look reasonable but are calibrated against a model version from six months ago. The trace dashboard shows fields whose names explain nothing, with thresholds nobody documented.
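To make that eval point concrete, here is a minimal sketch of the kind of assertion I mean. Every name in it is invented, and the grading function is a toy stand-in for a real LLM-as-judge call:

```python
def grade_groundedness(answer: str, sources: list[str]) -> float:
    """Toy stand-in for a judge model: fraction of source snippets echoed."""
    hits = sum(1 for snippet in sources if snippet.lower() in answer.lower())
    return hits / max(len(sources), 1)


def test_answer_groundedness():
    answer = "Refunds are issued within 30 days of purchase."
    sources = ["refunds are issued", "within 30 days"]
    score = grade_groundedness(answer, sources)
    # Why 0.82 and not 0.9? git blame points at a commit titled "tune evals",
    # written against a model version that has since been replaced.
    assert score >= 0.82
```

Nothing in that test is obviously wrong, which is exactly the problem: the calibration behind the threshold no longer exists anywhere except in the number itself.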

Worse, much of the code is the way it is only because of decisions that were made and then forgotten. The retrieval filter exists because a prior model leaked PII when it had access to a certain document type. The eval threshold is loose because tightening it broke a customer flow that was never written down. The system prompt has a strange-looking instruction because removing it caused a regression late last quarter.
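A hedged sketch of that first example, with invented names, shows what the new engineer actually sees: the filter, but not the incident that justified it:

```python
# Hypothetical retrieval-layer config. The filter below is load-bearing
# (a prior model leaked PII when retrieval surfaced HR documents), but
# nothing in the code records that. A new engineer sees only an
# arbitrary-looking set and a comment that forbids touching it.

EXCLUDED_DOC_TYPES = {"hr_profile", "legal_hold"}  # do not remove


def is_retrievable(doc: dict) -> bool:
    """Gate applied to every candidate document before ranking."""
    return doc.get("type") not in EXCLUDED_DOC_TYPES
```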

A new engineer staring at this code cannot infer any of this. They see a paragraph of English and a config file. The reasoning that gives those artifacts meaning is invisible. So they ask. And the only people who can answer are the ones who were there when each decision was made.

Why Documentation Doesn't Solve It

The instinctive response is "we should write more docs." This is correct in spirit and almost always wrong in execution. The docs that get written tend to describe the architecture: a diagram of the agent loop, a list of tools, an explanation of how retrieval feeds into the prompt. These docs are useful, but they are not what the new engineer is missing. The new engineer can read the code and reconstruct the architecture.

What the new engineer is missing is the why: why this prompt edit was rejected last quarter, why this eval case is non-negotiable, why this tool was removed from the catalog and never put back. This is decision history, not architecture. Most teams do not write decision history because it feels like overhead at the moment a decision is made — the decision is obvious to the people in the room, and writing it down feels like explaining yourself to nobody.

Six months later, "nobody" turns out to be the new engineer, and the decision is no longer obvious. The artifact is still there in the code. The reasoning has evaporated.
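One lightweight counter-move, sketched here with invented dates and incidents, is to keep the decision history in the same file as the artifacts it explains:

```python
# Decision log kept next to the retrieval config and eval thresholds it
# explains. The format, dates, and details below are all hypothetical.

DECISION_LOG = """
2024-07  EXCLUDED_DOC_TYPES gained "hr_profile".
         Why: a prior model leaked PII when retrieval surfaced HR profiles.
         Revisit if: the retrieval layer gains field-level redaction.

2024-09  Groundedness threshold lowered from 0.9 to 0.82.
         Why: 0.9 failed a customer flow that paraphrases its sources.
         Revisit when: the judge model is upgraded.
"""
```

The format matters far less than the proximity: the record is found by the person about to edit the line, not by someone browsing a wiki.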

The other failure mode is the wiki page that tries to explain the entire system in one document. These pages start out comprehensive and become misleading as soon as the code drifts. Within a quarter, half the wiki is wrong, the new engineer cannot tell which half, and the senior engineers stop trusting the wiki because they have been burned by following its instructions. The wiki becomes a graveyard, and the implicit knowledge stays in the same two heads.

The Artifacts That Actually Shorten the Ramp
