The Power User Who Learned Your Prompt By Trial

· 10 min read
Tian Pan
Software Engineer

There is a user in your product right now who is having a much better experience than the median. Not because they pay more, not because they have a different tier, not because they were rolled into a different cohort. They have figured out, through patient probing, that the AI feature responds beautifully if you ask in a certain way. They know which verbs trigger the structured output. They know that a one-word follow-up gives them the terse version and a complete sentence gives them the expansive one. They know that the assistant gets defensive about certain topics unless you frame the question as a hypothetical. None of this is written down anywhere on your site. They reverse-engineered it.

The interesting thing is not that this user exists. It is that this user is now your documentation. Your AI feature has a contract with its users — an undocumented one, encoded entirely in the system prompt — and the only way anyone learns the contract is by trial. A small fraction of users have the patience to run those trials. Everyone else gets a worse product.

This is not a niche concern. Recent usage data from large consumer AI products shows that power users — roughly 10 to 20 percent of accounts — generate 50 to 70 percent of total prompts, and they send dramatically longer, more structured messages than the median user. Only a small fraction of conversations involve giving the AI a persona or attaching context. The bimodal distribution is real, and the fork in user experience happens at the prompt layer, not the feature layer.
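To see what those shares imply per account, here is a minimal sketch in Python. The function name is hypothetical, and pairing the endpoints of the quoted ranges is an assumption made for illustration, not something taken from the underlying data.

```python
def per_account_ratio(power_share_of_accounts, power_share_of_prompts):
    """Prompts sent by a power account for each prompt sent by any other account."""
    # Average prompts per account inside each group, in arbitrary units.
    power_rate = power_share_of_prompts / power_share_of_accounts
    other_rate = (1 - power_share_of_prompts) / (1 - power_share_of_accounts)
    return power_rate / other_rate

# Assumed pairing: widest power segment with the smallest prompt share, and vice versa.
low = per_account_ratio(0.20, 0.50)
high = per_account_ratio(0.10, 0.70)
print(round(low, 1), round(high, 1))
```

Under those assumptions, a power account sends somewhere between roughly 4 and 21 times as many prompts as everyone else, which is the kind of gap a single average cannot describe.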

The System Prompt Is A Hidden API

A system prompt is a specification. It defines what the model attends to, what it ignores, what tone it adopts, what tools it considers, what edge cases it refuses, and what shape its output takes. In a traditional product, that specification would live in a help center, an onboarding flow, a tooltip, a video. In an AI product, it lives in a string the user will never see.

This is structurally similar to shipping an HTTP API with no documentation and hoping developers reverse-engineer the endpoints from error messages. We would never accept that for a developer product. But we accept it for end users because the natural language interface creates the illusion that no documentation is needed — after all, the user can just type whatever they want. The model will figure it out.

Except the model will not figure it out, not reliably. The system prompt has opinions. It expects a certain framing. It has soft heuristics about what counts as a complete request, what triggers clarifying questions, what gets the long-form response and what gets the one-liner. These are real product decisions encoded in prose, and they have observable downstream effects on the user's experience. Calling them "implicit" does not make them less real. It only means users have to discover them the hard way.

A user typing a vague request and getting a mediocre answer is not failing the product. The product is failing the user by hiding the contract that would have unlocked a good answer. The patient user finds the contract. The impatient user concludes the product is mid and switches to something else.

Aggregate Satisfaction Hides Two Populations

This is where it gets dangerous for the team building the feature. The aggregate satisfaction score is a single number averaged across two populations with radically different experiences. The power users are delighted. They are leaving five-star reviews and telling their friends. The median users are confused. They are not leaving angry reviews — they just churn quietly, or they keep using the product at a low intensity, never discovering what it can do.

When you ship a change to the system prompt, the power users notice immediately because they have a mental model precise enough to detect drift. The median users do not notice anything because their mental model was too vague to register the change. Your A/B test reads as a small movement in the average. The reality is that you just rewrote the contract for a population that had memorized the previous version.

Survey data backs this up at the macro level. In recent workplace AI studies, around 86 percent of users say AI saves them time but only 65 percent say they are satisfied with how it shows up in their work. That gap — utility is real but satisfaction is moderate — is the signature of an interaction problem. The capability is there. The path to the capability is the part that is broken.

The aggregate is a lie of composition. It tells you the product is fine. It is not fine. It is excellent for one segment and frustrating for another, and the segment that is frustrated is the one you most need to learn from because they are not yet locked in.
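A small sketch of why the blended number misleads, assuming you can tag each survey response with a segment. The segment labels and the scores below are invented for illustration; only the arithmetic is the point.

```python
from statistics import mean

# Hypothetical survey rows: (segment, satisfaction on a 1-5 scale).
responses = [
    ("power", 5), ("power", 5),
    ("median", 3), ("median", 2), ("median", 3),
    ("median", 3), ("median", 2), ("median", 3),
]

def satisfaction_by_segment(rows):
    """Return the mean score per segment plus the blended mean under the key 'all'."""
    by_segment = {}
    for segment, score in rows:
        by_segment.setdefault(segment, []).append(score)
    report = {segment: mean(scores) for segment, scores in by_segment.items()}
    report["all"] = mean(score for _, score in rows)
    return report

report = satisfaction_by_segment(responses)
for name, value in report.items():
    print(f"{name}: {value:.2f}")
```

With these made-up numbers the blended score is 3.25, which reads as acceptable, while the median segment sits near 2.67 and the power segment at 5. Reporting the per-segment rows next to the average is the cheapest way to stop the composition from hiding the split.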

The Ethical Cost Of A Hidden Manual

There is also a fairness story here that is easy to skip past if you are only looking at numbers. The users who become power users are not randomly distributed. They are the ones with time to experiment, with confidence to keep poking when the first attempt fails, with a mental model of LLMs from their day job or their reading. The users who give up after a mediocre first answer are disproportionately the ones for whom this is just another tool they are trying to make work between other tasks.

A product whose best path is hidden behind iterative probing is a product whose value is gated by leisure and prior expertise. That is not the diverse, accessible user base most AI products claim to be building for. The hidden prompt is a quiet bias in the funnel.

It is also a fragile arrangement for the company. The reverse-engineered manual lives in forums, Discord servers, and screenshots of clever prompts. The community writes the documentation you would not. That works until the system prompt changes. Then the community documentation goes stale and the very users who had figured out the product are now the most disoriented — because they had memorized a contract you silently broke.

Surface The Contract, Do Not Hide It

The fix is not to publish the literal system prompt. That is not the point and there are good reasons not to. The fix is to surface the parts of the contract that materially change the user's experience, in the interface, where the user is making decisions.

The patterns are not exotic. They are the same affordances that mature software products use for any complex feature:

  • Input recipes: when the user opens a fresh prompt box, show two or three concrete starting prompts that have been tuned to work well. Not generic ones like "summarize this" — specific ones aligned with the feature's actual sweet spots, the way a good cookbook gives you whole recipes rather than just a list of ingredients.
  • Guidance UI mid-prompt: when the user has typed something thin, surface gentle suggestions ("add a goal," "specify the audience," "paste the document") as enrichments rather than corrections. Recognition is easier than recall, and this lets a hesitant user pick from a menu rather than invent from scratch.
  • Structured input chips: for the few axes that really matter — tone, length, format, audience — offer chips or toggles. This pulls the most impactful prompt variables out of the freeform text and into a place the user can see and change without re-typing. It also gives the team a stable interface to evolve the prompt behind without breaking user expectations.
  • Result-side controls: after the model responds, show controls to nudge the output — shorter, more formal, with examples — so the user is not paying the cost of re-prompting from scratch.
  • A visible "how to ask" page: not the full system prompt, but a plain-language description of what the feature does well, what it does poorly, what kinds of inputs work, and what the user can expect. This is the documentation page the power users would have written if you had let them.

Each of these is a way of pulling part of the hidden contract out into the open. The team retains control over the prompt. The user gets a shorter path from intent to outcome.
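As one illustration of the structured-chips pattern, here is a minimal Python sketch in which chip selections are compiled into explicit directives placed ahead of the user's text. Every name in it (`ChipSelection`, `build_request`, the directive wording, the chosen axes) is hypothetical; a real product would pick its own axes and decide where in the prompt the directives belong.

```python
from dataclasses import dataclass

# Hypothetical chip options mapped to the directive each one stands for.
LENGTH_DIRECTIVES = {
    "short": "Answer in at most three sentences.",
    "medium": "Answer in one or two short paragraphs.",
    "long": "Answer in depth, with section headings.",
}
TONE_DIRECTIVES = {
    "neutral": "Use a neutral, plain tone.",
    "formal": "Use a formal tone.",
    "friendly": "Use a warm, conversational tone.",
}

@dataclass(frozen=True)
class ChipSelection:
    length: str = "medium"
    tone: str = "neutral"
    audience: str = ""  # free text; empty means the user did not specify one

def build_request(user_text: str, chips: ChipSelection) -> str:
    """Prepend the directives implied by the chips to the user's freeform text."""
    directives = [LENGTH_DIRECTIVES[chips.length], TONE_DIRECTIVES[chips.tone]]
    if chips.audience.strip():
        directives.append(f"Write for this audience: {chips.audience.strip()}.")
    return "\n".join(directives) + "\n\n" + user_text.strip()

request = build_request(
    " Summarize the attached report ",
    ChipSelection(length="short", audience="new hires"),
)
print(request)
```

The design choice worth noticing is that the chip names are the stable interface. The directive text behind each chip can change as the prompt evolves, and the user's mental model of "short" and "formal" does not have to change with it.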

Prompts Expressive Enough To Matter Are Expressive Enough To Document

The deeper principle is this: any system prompt rich enough to differentiate your product is rich enough to deserve user-facing scaffolding. If the prompt is so trivial that "just type whatever you want" really is sufficient guidance, then the prompt is not doing much for you and a competitor will out-prompt you next quarter. If the prompt is doing real work — disambiguating intent, enforcing tone, calibrating verbosity, routing to tools — then it is a feature, and features deserve UI.

The mistake teams make is treating the prompt as backstage infrastructure rather than as part of the product surface. The prompt determines what the user gets. By definition, that makes it part of what you are shipping to the user, even though no pixel of it is visible. Treating it as invisible just means the visible part of your product is reduced to a text box, which is the most generic possible interface in the industry right now. Every competitor has the same text box.

The differentiation is in the contract. If you make the contract discoverable to the median user, you compress the gap between the power user experience and the median user experience. You shorten the time-to-value for new users. You stop relying on a self-selected fraction of users to do the work of teaching the product to itself.

A Practical Audit For Your Team

Spend an hour with the actual sessions of three users whose engagement is in the 50th percentile of your feature. Not the 95th, not the 5th. The middle. Read what they typed. Read what they got back. Note where the response would have been clearly better with a small change to the input — a missing piece of context, an unspecified format, an under-constrained scope.

For each one, ask: is that gap discoverable from the current UI? Or did the user need to already know the prompt's preferences to get there? Every gap that requires prior knowledge is a place where the prompt has become a hidden API the median user cannot see.

Then count them. That count is the documentation debt your product is currently outsourcing to its most patient users. Pay it down with one chip, one recipe, one inline suggestion at a time. The 200-message-per-day power users will keep being power users. The median user will quietly start having a much better day.
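If it helps to keep the audit honest, the tally can be as simple as the sketch below. The `Gap` record, its fields, and the sample notes are assumptions for illustration; the labels come from the kinds of gaps described above.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Gap:
    session_id: str
    kind: str           # e.g. "missing context", "unspecified format"
    discoverable: bool  # could the user have found the fix from the current UI?

def documentation_debt(gaps):
    """Count the gaps that required prior knowledge, grouped by kind."""
    return Counter(g.kind for g in gaps if not g.discoverable)

# Hypothetical notes from reading three median-user sessions.
notes = [
    Gap("s1", "missing context", False),
    Gap("s1", "unspecified format", True),
    Gap("s2", "unspecified format", False),
    Gap("s3", "missing context", False),
]

debt = documentation_debt(notes)
print(sum(debt.values()), debt.most_common())
```

Grouping by kind suggests where to start: the most frequent undiscoverable gap is the first chip, recipe, or inline suggestion to build.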