5 posts tagged with "engineering-management"

The Three Tastes of an AI Engineer: Why Prompts, Evals, and Guardrails Don't Live in the Same Head

11 min read
Tian Pan
Software Engineer

The three best AI engineers I have hired this year would all fail each other's interviews. The one who writes prompts that survive a model upgrade has never written a useful eval case in her life. The one who designs eval sets that catch the failures that matter writes prompts that other engineers refuse to extend. The one who designs guardrails that fail closed without choking the happy path has opinions about the other two that I cannot print here.

The job ladder calls all three of them "AI engineer." The calibration committee compares their promo packets as if they had been doing the same job. They have not.

The AI Interview Collapse: Engineering Hiring Has Lost Its Signal

11 min read
Tian Pan
Software Engineer

The signal is gone. In a recent audit of 19,368 technical interviews, 38.5% of candidates were flagged for AI-assisted cheating, with technical roles hitting 48% and junior candidates cheating at nearly double the rate of senior ones. More damning: 61% of detected cheaters scored above the passing threshold. Without the detection layer, they would have advanced. The interview, as an instrument, is no longer measuring what it was designed to measure.

This is not a moral panic about kids these days. It is a mechanical failure of the instrument. The technical interview was calibrated for a world in which a candidate, under time pressure, in an unfamiliar environment, had to produce correct code from memory and first principles. That constraint — the thing that made the signal legible — has been dissolved by a free-tier chat window running on a second device. Every company that still runs a LeetCode-style screen is now paying to sort candidates on a test the test-taker can trivially outsource.

The AI Engineering Career Ladder: Why Your SWE Leveling Framework Is Lying to You

10 min read
Tian Pan
Software Engineer

A senior engineer at a mid-sized startup recently got a mediocre performance review. Their velocity was inconsistent — some weeks they shipped a great deal of code, others almost nothing. Their manager, trained on traditional SWE frameworks, marked them down for output variability. Six weeks later, that engineer left for a competing team. What the manager didn't understand: the engineer's "slow" weeks were spent building evaluation infrastructure that prevented three categories of silent failures. Without it, the product would have been subtly broken in ways nobody would have noticed for months.

This pattern is playing out across engineering orgs right now. Teams that built their career ladders for deterministic software systems are applying those same frameworks to AI engineers — and systematically misidentifying their best people.

The Metrics Translation Problem: Why Technically Successful AI Projects Lose Funding

10 min read
Tian Pan
Software Engineer

Your model achieved 91% accuracy on the held-out test set. Latency is under 200ms at p95. You've cut the error rate by 40% compared to the previous rule-based system. By every technical measure, the project is a success. Six months later, leadership cancels it.

This is not a hypothetical. Eighty percent of AI projects fail to deliver intended business value, and the majority of those failures are not caused by model performance. They are caused by the gap between what engineers measure and what decision-makers understand. The technical team speaks a language that executives cannot evaluate — and in the absence of comprehensible signal, leadership defaults to skepticism.

The metrics translation problem is not a communication soft skill. It is an engineering discipline that most teams treat as optional until the funding review.

The AI Skills Inversion: When Junior Engineers Outperform Seniors on the Wrong Metrics

8 min read
Tian Pan
Software Engineer

A junior engineer on your team just shipped three features in a week. Your senior engineer shipped half of one. The dashboards say the junior is 6x more productive. The dashboards are lying.

This is the AI skills inversion — a measurement illusion in which AI coding assistants make junior engineers look dramatically more productive on surface metrics while masking a deeper problem. The features ship faster, but the architecture degrades. The PRs multiply, but system coherence erodes. And organizations that trust their dashboards over their judgment end up promoting the wrong behaviors and losing the wrong people.