GPU Capacity Is a Roadmap Constraint: The 18-Month Contract That Decided Q3
Somewhere in your company, fourteen months ago, a finance director and a platform lead signed a multi-year accelerator commitment. They built a peak-load model from the prior quarter's telemetry, negotiated a discount of 40 to 70 percent off on-demand pricing, and locked in the cluster shape that your product roadmap now has to fit inside. Nobody on the product team was in the room. Nobody on the application engineering team saw the spreadsheet. The contract is binding, the discount only applies if the commitment is honored, and the capacity envelope it bought is now the silent ceiling on every Q3 feature your PMs are scoping.
The gap most teams don't notice until the second year: capacity contracts are roadmap decisions, but they're being made by people who don't see the roadmap, using inputs that don't include the roadmap. The product trio thinks it's choosing features from a clean priority backlog. Finance thinks it's optimizing a fixed envelope. Both are right inside their own frame, and the collision shows up in a planning meeting where an architect proposes a 70B-parameter model for the new assistant feature and the platform lead says, quietly, that the cluster is already at 85 percent and that model doesn't fit without crowding out something else.
