The Chatbot That Inherited Your Support Team's Worst Habits

10 min read
Tian Pan
Software Engineer

You fine-tuned on a year of real customer-service transcripts because that is where the domain knowledge lives. The model now sounds like your support team. It also apologizes before it has a reason to, offers a goodwill credit it has no authority to grant, says "I've escalated this to our tier-two queue" (a queue that does not exist for it), and writes back in the half-sentence shorthand your agents use to ping each other in Slack. Domain accuracy on your eval set looks great. Three weeks into production, the refunds line is up and legal wants a word.

The chatbot did not go rogue. It learned exactly what you trained it on. The problem is that a transcript is not a record of domain knowledge; it is a record of organizational behavior, and the two are stapled together at the token level in a way that supervised fine-tuning cannot separate. The same gradient step that teaches the model your return policy also teaches it that the appropriate response to a frustrated customer is a reflexive "I'm so sorry to hear that," whether or not the situation warrants an apology. Your agents had reasons for those reflexes. The model has only the surface.
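To make "stapled together at the token level" concrete, here is a toy sketch in plain Python, not a real training loop. It assumes the usual supervised fine-tuning objective, the mean negative log-likelihood over the target tokens. The reply text, the `habit`/`policy` labels, the 30-day figure, and the flat probability of 0.1 are all invented for illustration; real transcripts carry no such labels, which is the point.

```python
import math

# Hypothetical agent reply, hand-labelled per token for illustration only.
# "habit" marks the reflexive apology; "policy" marks the domain content.
# The 30-day figure is made up, not taken from any real policy.
reply = [
    ("I'm", "habit"), ("so", "habit"), ("sorry", "habit"),
    ("to", "habit"), ("hear", "habit"), ("that.", "habit"),
    ("Returns", "policy"), ("are", "policy"), ("accepted", "policy"),
    ("within", "policy"), ("30", "policy"), ("days.", "policy"),
]


def sft_loss(token_probs):
    """Mean negative log-likelihood over every target token.

    The labels are not an input: the objective never sees them.
    """
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def loss_share_by_label(labelled_tokens, token_probs):
    """Fraction of the total loss contributed by each label group."""
    totals = {}
    for (_, label), p in zip(labelled_tokens, token_probs):
        totals[label] = totals.get(label, 0.0) - math.log(p)
    grand_total = sum(totals.values())
    return {label: value / grand_total for label, value in totals.items()}


# Assumption: the model assigns the same probability to every target token.
probs = [0.1] * len(reply)

loss = sft_loss(probs)
shares = loss_share_by_label(reply, probs)
print(f"loss = {loss:.3f}")
print(shares)
```

Under these assumptions the apology tokens account for about half of the loss, so about half of the training signal from this reply pushes the model toward the reflex rather than the return policy. Nothing in `sft_loss` can tell the two groups apart.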