Deleting an Eval Case Is a Decision, Not Cleanup

10 min read
Tian Pan
Software Engineer

Every eval suite eventually gets pruned. Someone notices the suite takes nine minutes to run, costs $40 a pass, and is full of cases nobody remembers writing. They open a PR titled "clean up stale eval cases," delete forty entries that "don't seem relevant anymore," and the CI run drops to four minutes. The PR gets a thumbs-up. Nobody objects, because deleting tests looks like maintenance.

It is not maintenance. Every eval case is a guarantee the team made to itself: this failure mode will not recur silently. Deleting the case retires the guarantee. The pass rate does not change, the dashboard stays green, and the only thing that disappears is the team's memory that the guarantee ever existed. Six months later a model migration reintroduces exactly the regression a deleted case was guarding against, the postmortem rediscovers a lesson the team already paid for once, and someone writes "we should add a test for this": the very test that was deleted in the cleanup PR.