Skip to main content

One post tagged with "training-data"

View all tags

Feedback Surfaces That Actually Train Your Model

· 10 min read
Tian Pan
Software Engineer

Most AI products ship with a thumbs-up/thumbs-down widget and call it feedback infrastructure. It isn't. What it is, in practice, is a survey that only dissatisfied or unusually conscientious users bother completing — and a survey that tells you nothing about what the correct output would have looked like.

The result is a dataset shaped not by what your users want, but by which users felt like clicking a button. That selection bias propagates into fine-tuning runs, reward models, and DPO pipelines, quietly steering your model toward the preferences of a tiny and unrepresentative minority. Implicit signals — edit rate, retry rate, session abandonment — cover every user who touches the product. They don't require a click. They're generated by the act of using the software.

Here's how to design feedback surfaces that produce high-fidelity training signal as a natural side effect of product use, and how to route those signals into your training pipeline.