3 posts tagged with "llm-gateway"

The Inter-Team Token Budget War: When Your AI Platform Team Becomes a Treasury Department

May 10, 2026 · 10 min read

Software Engineer

The team that built your internal LLM gateway scoped it for "rate limiting and audit." Eighteen months later, the same team is running a quarterly allocation meeting, mediating a quota dispute between two product groups, and discovering that the architecture they shipped to solve a capacity problem now functions as the company's internal AI treasury. Nobody chartered them for this role. Nobody removed it from their plate either.

This is the trajectory every AI platform team is on, and most of them get to the political economy phase before they have a policy, a sponsor, or even the telemetry to defend a decision. The technical work — request routing, key management, retries — is the easy half. The hard half is that finite provider quota plus three product teams with launch deadlines is a budget allocation system, and the team running the gateway is the one being asked to allocate.

AI Shadow IT: When Product Teams Build Their Own LLM Proxy

April 28, 2026 · 11 min read

Tian Pan

Software Engineer

The shadow IT incident your platform team is going to investigate in Q3 already happened in January. It looks like this: a senior engineer on a product team has a launch this month. The platform team's "official" LLM gateway is on the roadmap for "next quarter." So the engineer creates a corporate credit card OpenAI account, drops the API key into a .env file, ships the feature, and hits the public deadline. The launch is a success. Six months later, the FinOps team finds three vendor accounts nobody can attribute, the security team finds prompts containing customer data routed to a region not covered by the data processing agreement, and the platform team discovers the gateway it spent two quarters building has 14% adoption because every team that needed AI shipped without it.

This is not a security failure or a discipline failure. It is a platform-product velocity mismatch, and treating it as anything else guarantees the next gateway you ship will have the same adoption problem.

The Internal LLM Gateway Is the New Service Mesh

April 27, 2026 · 10 min read

Tian Pan

Software Engineer

Walk into any company with fifty engineers writing LLM code in production and you will find seven gateway-shaped artifacts. The recommendations team built one to route between OpenAI and Anthropic. The support-bot team wrote one to attach their prompt registry. The platform team has a half-finished proxy that handles auth but not rate limiting. The growth team has a Lambda that does PII redaction on its way out. The data-science team is calling the vendor SDK directly and nobody has told them to stop. There is no shared gateway. There are seven shared problems, each solved poorly in isolation, and a CFO who is about to ask why the AI bill grew 40% quarter over quarter with no clear owner for any of it.

This is the same architectural beat the industry hit with microservices in 2016 and 2017. A thousand external dependencies, the same shared concerns at every team — auth, retries, observability, policy — and a choice between solving them once or rediscovering them everywhere. The answer then was the service mesh. The answer now is the internal LLM gateway, and most companies are still in the rediscovering-everywhere phase.

About Tian Pan