The AI Engineering Perf Packet: Making Stochastic Work Legible at Promotion Review
A senior engineer walks into the promotion calibration meeting. They shipped a fine-tuned reranker that lifted retrieval quality eight points. They built the eval harness that turned a two-week QA cycle into a one-hour CI gate. They authored the prompt change that drove a two-point conversion lift. By any reasonable measure, they had a defining year.
They don't get promoted. The packet, as written, reads like "I tuned some numbers." The colleague next to them, who shipped a CRUD feature behind a launch banner with QPS, latency, and a Friday demo, gets the nod instead. The committee is not malicious. It is applying the only vocabulary it has to a packet that never translated the work into that vocabulary.
This failure mode is now common enough to be a pattern. AI engineering work doesn't decompose cleanly into the artifacts that calibration committees were trained to evaluate. The packet template was written for deterministic systems shipped in deterministic ways, and the engineers who do the highest-leverage work in the AI stack are paying the price for that mismatch.
