Skip to main content

2 posts tagged with "system-prompt"

View all tags

Negative Prompts Are Code Smells: Why Every 'Don't' in Your System Prompt Is Technical Debt

· 10 min read
Tian Pan
Software Engineer

Open the system prompt of any production AI feature that has been live for more than three months. Count the negative clauses — the "do not," "never say," "avoid," "under no circumstances," "you must not." If the count is in the double digits, you are not looking at a system prompt. You are looking at a graveyard. Each tombstone marks a specific user complaint, a specific incident report, a specific Slack message from a stakeholder who saw the model do something embarrassing. The team patched the surface and moved on, and now the prompt reads like a legal disclaimer with a personality grafted onto the front.

Negative prompts are code smells. Not in the metaphorical sense — in the literal one. They are the prompt-engineering equivalent of a try/except block that swallows an exception, a config flag with no documentation, a // TODO: refactor this from 2022. They work, kind of, until they don't. And the failure mode they hide is almost always more interesting than the failure they were added to suppress.

Prompt Credit Assignment: Finding the Dead Weight in Your System Prompt

· 11 min read
Tian Pan
Software Engineer

Most teams discover their system prompt has a weight problem the same way — a cost review, a latency spike, or an engineer who finally reads the thing end to end. What they find is typically a 2,000-token document that grew organically over six months, with three versions of "be concise" scattered across different sections, instructions that reference a product workflow that was deprecated in February, and a dozen rules that the model visibly ignores on every run. The prompt is large. Most of it isn't doing anything.

This is the prompt credit assignment problem: figuring out which instructions in a multi-thousand-token system prompt actually drive model behavior, and which are just dead weight that burns tokens and dilutes attention. The bad news is that most teams skip this entirely — they add instructions when behavior breaks and never subtract. The good news is there is a repeatable engineering discipline for it.