Gigawatt Data Centers Arrive: Power Is Now the Strategic Moat

The data center industry just crossed a threshold that fundamentally changes the relationship between technology and energy. The first gigawatt-scale data centers are coming online in 2026-2027, and the implications extend far beyond the companies building them.

The Gigawatt Facilities

Let’s start with what’s actually being built:

xAI’s Colossus 2 (Memphis, Tennessee): Originally planned at 150 MW, Elon Musk expanded the facility to target 1 GW+ of power capacity. The first phase brought 100,000 Nvidia H100 GPUs online for Grok training. The expansion adds liquid-cooled racks designed for next-generation GPU architectures. Memphis was chosen specifically for its access to TVA power — one of the few utilities in the US with consistent surplus generation capacity.

Meta’s Prometheus Complex (Richland Parish, Louisiana): A $10B+ investment spanning 4 million square feet, targeting 1 GW of operational power. Meta’s largest single infrastructure investment ever. Louisiana offered tax incentives, but the real draw was proximity to natural gas generation and available transmission capacity on the MISO grid.

OpenAI/Microsoft Stargate (Abilene, Texas): The most ambitious of all — a $500B joint venture with initial builds targeting 1.2 GW. Texas was selected for its deregulated energy market, available land, and ERCOT grid interconnection flexibility. The Abilene site offers something rare: the ability to build dedicated transmission infrastructure without navigating multi-state utility bureaucracies.

Google’s Quantum Valley Campus (The Dalles, Oregon expansion): Google’s expansion of its existing Columbia River facility targets 800 MW-1 GW, leveraging the region’s abundant hydroelectric power. Google has been building in The Dalles since 2006, but the scale of the current expansion dwarfs everything that came before.

These are not theoretical plans. Construction is underway. Steel is going up. Utility interconnection agreements are signed. We are watching the physical infrastructure of AI being built in real time.

The Power Constraint Bottleneck

Here’s the number that keeps me up at night: North America has approximately 27 GW of confirmed AI data center capacity in various stages of planning and construction. But the grid constraints are brutal:

  • $162 billion in planned data center projects have been delayed or blocked due to insufficient grid capacity, according to recent utility filings and industry reports
  • Average time from utility interconnection request to energization: 4-7 years in most US markets
  • PJM Interconnection (the grid operator for 13 eastern states) has a queue of 260 GW of generation and load requests — the vast majority will never be built because the transmission infrastructure doesn’t exist
  • Dominion Energy in Virginia (the heart of Data Center Alley) has publicly stated it cannot guarantee power for new large-scale data center projects before 2030

The constraint isn’t generation — it’s transmission and distribution. The US has plenty of power generation capacity in aggregate, but it’s not where the data centers need it. Building new high-voltage transmission lines takes 7-12 years due to permitting, environmental review, and the legal challenges of crossing multiple jurisdictions.

Microsoft has taken the most creative approach: exploring superconducting power cables that can transmit power with near-zero loss over long distances, potentially bypassing the transmission bottleneck entirely. The technology is real but unproven at data center scale. If it works, it would decouple data center location from proximity to generation — a revolutionary change.

Power Is the New Talent

For twenty years, tech companies chose their locations based on talent availability. Silicon Valley, Seattle, Austin, New York — the geographic strategy was driven by where engineers wanted to live. That era is ending.

The gigawatt data center buildout has inverted the equation. Location decisions are now driven primarily by:

  1. Available power capacity: Can the local grid deliver 500 MW-1 GW without multi-year upgrades?
  2. Power cost: Industrial electricity rates vary from $0.03/kWh (parts of the Southeast) to $0.15/kWh (California, Northeast). At gigawatt scale, every cent per kWh represents about $87M/year (1 GW × 8,760 hours at full load).
  3. Water availability: Cooling a 1 GW data center requires enormous water resources — roughly 5-7 million gallons per day for evaporative cooling. Water scarcity is already blocking projects in the Southwest.
  4. Regulatory environment: Permitting speed, tax incentives, and utility cooperation vary enormously by state.
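
The per-cent figure in item 2 is simple arithmetic, and it's worth being able to rerun it for your own load. A minimal sketch, assuming a constant draw all year at a flat rate (a simplification; the function name is made up for illustration):

```python
HOURS_PER_YEAR = 8760  # non-leap year

def annual_energy_cost_usd(load_mw: float, price_usd_per_kwh: float,
                           load_factor: float = 1.0) -> float:
    """Annual electricity cost for a steady load billed at a flat rate."""
    kwh_per_year = load_mw * 1_000 * HOURS_PER_YEAR * load_factor
    return kwh_per_year * price_usd_per_kwh

# One cent per kWh at a constant 1 GW (1,000 MW):
per_cent = annual_energy_cost_usd(1_000, 0.01)
print(f"${per_cent / 1e6:.1f}M per year")  # $87.6M per year

# Gap between the low and high industrial rates quoted in item 2:
spread = annual_energy_cost_usd(1_000, 0.15) - annual_energy_cost_usd(1_000, 0.03)
print(f"${spread / 1e9:.2f}B per year")  # $1.05B per year
```

Real facilities run below 100% load and pay demand charges on top of energy, so treat this as an order-of-magnitude check rather than a bill estimate.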

This is why Memphis, Abilene, and rural Louisiana are the new centers of AI infrastructure, not San Francisco or Seattle. The talent can work remotely. The power cannot be delivered to wherever the talent lives, at least not at this scale (yet).

The IEA Projection

The International Energy Agency’s latest projections are staggering: data center electricity consumption is projected to reach 945 TWh by 2030 — roughly equivalent to Japan’s entire national electricity consumption. For context:

  • Global data center electricity consumption in 2022: ~460 TWh
  • Projected 2030: ~945 TWh
  • That’s a doubling in 8 years, driven almost entirely by AI training and inference workloads

To put 945 TWh in perspective: that’s nearly twice the electricity the entire country of Germany uses, and roughly 3% of projected global electricity generation in 2030. These projections may also be conservative, since they were made before several gigawatt-scale facilities were announced.

What This Means for the Industry

The implications cascade through the entire technology ecosystem:

For hyperscalers: Power procurement is now the core strategic function, not a facilities management concern. Microsoft, Google, Meta, and Amazon all have dedicated energy teams that rival mid-size utility companies in sophistication. These companies are signing 15-20 year power purchase agreements (PPAs) worth billions, investing in nuclear (Microsoft’s deal with Constellation Energy for Three Mile Island restart, Amazon’s investments in SMRs, Google’s partnership with Kairos Power), and exploring entirely new generation technologies.

For cloud customers: Available compute capacity is increasingly constrained by power, not silicon. AWS, Google Cloud, and Azure are already rationing GPU instances in power-constrained regions. If your workload requires guaranteed GPU capacity, you need to think about where that capacity is physically located and whether the power exists to sustain it.

For the energy industry: Tech companies are becoming the largest single customers for utilities and energy developers. The power dynamics (pun intended) between tech and energy are shifting. Utilities that can deliver fast interconnection and reliable power are gaining leverage they haven’t had in decades.

For policymakers: The concentration of 27 GW of new electrical load in a handful of locations creates grid reliability risks that we haven’t seen before. A single gigawatt data center represents a load equivalent to a mid-size city. When it comes online, the grid must be ready. When it goes offline (for maintenance or failure), the grid must absorb the swing. This is a new category of grid management challenge.

The age of “build a data center wherever you want” is over. Power is the strategic moat, and the companies that secured it early — through long-term PPAs, utility relationships, and strategic site selection — will have a structural advantage for the next decade. Everyone else is fighting for what’s left.

What are you seeing in your organizations? How is power availability affecting your infrastructure decisions?

David, excellent overview of the power infrastructure dynamics. I want to add the climate dimension, because the environmental impact of gigawatt-scale data centers is staggering — and the industry’s sustainability claims deserve scrutiny.

The Carbon Math

Let’s be direct about the numbers:

A 1 GW data center operating at a 90% capacity factor consumes approximately 7.9 TWh of electricity per year (1 GW × 8,760 hours × 0.9). The carbon impact depends entirely on the power source:

  • Coal-powered: ~7.9 million tons CO2/year at roughly 1 kg CO2/kWh (equivalent to the annual emissions of about 1.7 million cars)
  • Natural gas combined cycle: ~3.2 million tons CO2/year at roughly 0.4 kg CO2/kWh
  • Grid average (US): ~2.9 million tons CO2/year (the US grid is still ~60% fossil fuels)
  • 100% renewable: ~0 direct emissions (though lifecycle emissions from manufacturing solar panels and wind turbines add some)

When you multiply this across 27 GW of planned capacity, the potential annual emissions range from zero (if truly powered by renewables) to roughly 213 million tons CO2 (if coal-powered). For context, that upper bound would represent about 3% of total US greenhouse gas emissions from a single industry sector that barely existed at this scale five years ago.
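
For anyone who wants to rerun the math with their own assumptions, here is a minimal sketch. The 0.37 kg CO2/kWh intensity is my assumed approximation of the US grid average, not a figure from a specific report, and the function names are illustrative:

```python
HOURS_PER_YEAR = 8760

def annual_energy_twh(capacity_gw: float, capacity_factor: float) -> float:
    """Energy drawn per year by a facility of the given size and utilization."""
    return capacity_gw * capacity_factor * HOURS_PER_YEAR / 1_000

def annual_emissions_mt(energy_twh: float, intensity_kg_per_kwh: float) -> float:
    """Million metric tons of CO2 per year.

    1 TWh is 1e9 kWh and 1 million metric tons is 1e9 kg, so the
    conversion factors cancel and the result is a plain product.
    """
    return energy_twh * intensity_kg_per_kwh

energy = annual_energy_twh(1.0, 0.90)
print(f"{energy:.2f} TWh/year")                                 # 7.88 TWh/year
print(f"{annual_emissions_mt(energy, 0.37):.1f} Mt CO2/year")   # 2.9 Mt CO2/year
```

Swapping in a different intensity (a specific utility's mix, or an hourly series averaged over the year) is the whole point: the result is only as good as that one input.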

The Additionality Problem

Here’s where the corporate sustainability claims get complicated. Every major hyperscaler claims to be “100% renewable” or “carbon neutral” or “carbon negative.” Let’s look at what that actually means:

What they’re doing: Signing power purchase agreements (PPAs) for renewable energy projects — solar farms, wind farms, sometimes battery storage. These PPAs are real financial commitments that do fund new renewable development.

The additionality question: Is the renewable energy they’re buying additional to what would have been built anyway? Or are they simply purchasing Renewable Energy Certificates (RECs) from projects that already exist or were already economically viable?

The distinction matters enormously:

  • Additional: Tech company signs a PPA for a new solar farm that wouldn’t have been built without the guaranteed revenue. The solar farm produces clean energy that displaces fossil fuel generation. Net positive for the climate.
  • Non-additional: Tech company buys RECs from an existing wind farm that was already operating. The wind farm keeps running (it would have anyway), the tech company gets to claim “renewable” energy, but the grid’s actual generation mix doesn’t change. The data center is still drawing power from whatever the grid provides — which in many locations is predominantly fossil fuels.

The Data on Actual vs. Claimed Renewable Usage

I’ve been tracking this closely, and the gap between claims and reality is significant:

Google has been the most transparent. In their 2025 environmental report, they disclosed that their 24/7 carbon-free energy (CFE) matching rate — meaning the percentage of their actual hourly electricity consumption matched by carbon-free generation in the same grid region — was approximately 64% globally. That’s up from 61% in 2023, and it’s genuinely impressive, but it’s a long way from “100% renewable.” Google deserves credit for using the CFE metric rather than the misleading annual REC matching that most companies use.

Microsoft reported “100% renewable” for their data centers in 2025, but this is based on annual REC matching, not hourly matching. When you look at the actual grids where their data centers operate — Virginia (heavy natural gas), Iowa (decent wind, but still gas backup), Arizona (solar during the day, gas at night) — the real-time carbon intensity of their power supply is significantly higher than the REC-adjusted figure suggests. Microsoft has committed to 100% CFE matching by 2030, which is more honest, but they’re not there yet.

Meta is in a similar position to Microsoft — annual REC matching gives them the “100% renewable” claim, but the Richland Parish facility in Louisiana will draw heavily from the MISO grid, which is approximately 50% natural gas, 20% coal, and 20% wind. Until dedicated renewable generation comes online and is directly interconnected, that 1 GW facility is running on a fossil-fuel-heavy grid regardless of what the REC certificates say.

xAI hasn’t made significant renewable energy commitments for Colossus 2 that I’m aware of. TVA’s generation mix is approximately 40% nuclear, 25% natural gas, 15% coal, and 20% hydro/renewables. So the Memphis facility is relatively low-carbon thanks to nuclear, but it’s not zero-carbon.
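
The gap between annual matching and hourly matching is easiest to see on a toy day. The sketch below is a simplification under my own assumptions: made-up numbers, a solar-only contract, and a matching rule that ignores any carbon-free share already on the grid (which real 24/7 CFE accounting does include).

```python
def annual_match(load_mwh, clean_mwh):
    """Volume-based matching: total clean purchases vs. total load, capped at 100%."""
    return min(1.0, sum(clean_mwh) / sum(load_mwh))

def hourly_cfe(load_mwh, clean_mwh):
    """Hourly matching: clean energy only counts in the hour it is generated."""
    matched = sum(min(load, clean) for load, clean in zip(load_mwh, clean_mwh))
    return matched / sum(load_mwh)

# Toy 24-hour profile: flat 100 MWh of load every hour, and a solar
# contract that delivers 240 MWh in each of 10 daylight hours.
load = [100.0] * 24
solar = [0.0] * 7 + [240.0] * 10 + [0.0] * 7

print(f"volume-based match: {annual_match(load, solar):.0%}")  # 100%
print(f"hourly CFE score:   {hourly_cfe(load, solar):.0%}")    # 42%
```

Same contract, same facility: one metric says 100%, the other says 42%, because the surplus midday solar can't cover the 14 hours when the panels produce nothing.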

The Uncomfortable Scale Question

Even if every gigawatt data center achieved true 100% renewable energy (hourly matched, additional, zero fossil backup), there’s a broader systems question: is deploying 27 GW of new renewable capacity for AI data centers the best use of that clean energy?

That 27 GW of renewable generation could instead displace fossil fuel generation serving existing residential and industrial loads, which cuts emissions outright instead of merely keeping a brand-new load clean. Every solar panel powering an AI training cluster is a solar panel not displacing coal for someone’s home electricity.

I’m not arguing against AI development — I use AI tools in my own climate research daily. But we should be honest about the tradeoffs. The tech industry’s narrative of “we’re green because we buy PPAs” obscures a real opportunity cost: clean energy is a finite resource in the near term, and choosing to allocate it to AI training rather than fossil fuel displacement is a climate decision, not just a business decision.

What Would Genuine Climate Leadership Look Like?

  1. Adopt 24/7 CFE matching (like Google) instead of annual REC matching. This is the honest metric.
  2. Fund overbuilding: For every GW of data center load, fund 1.5-2 GW of new renewable generation — enough to power the facility AND displace fossil generation elsewhere.
  3. Invest in long-duration storage: Solar and wind are intermittent. Without 8-12 hour storage, nighttime and windless hours still require fossil backup.
  4. Water transparency: Publish water consumption data. Evaporative cooling at gigawatt scale has real implications for water-stressed communities.
  5. Community benefit sharing: These facilities are being built in rural communities. Ensure the economic benefits extend beyond tax revenue to include affordable clean energy access for local residents.

The gigawatt data center era is here. The question is whether the industry will actually power it cleanly, or just buy enough certificates to claim they did.

David’s macro analysis is excellent, and Elena’s climate scrutiny is necessary. I want to zoom into what this means at the software and infrastructure engineering level, because building systems for gigawatt-scale facilities fundamentally changes how we think about infrastructure design.

Power-Aware Scheduling Is Now a First-Class Concern

At my previous role at Google Cloud AI, power management was a facilities concern — something the data center operations team handled. Software engineers didn’t think about it. That separation is collapsing at gigawatt scale.

When your training cluster draws 200-500 MW and your facility’s power allocation varies based on grid conditions, renewable availability, and demand response commitments, workload scheduling must become power-aware. This isn’t theoretical — it’s happening right now at every major AI lab.

Here’s what power-aware scheduling looks like in practice:

Dynamic power budgeting: The facility has a total power allocation from the grid (say, 800 MW out of 1 GW capacity). Within that allocation, different workloads get power budgets that adjust in real-time:

Training jobs (high priority): 500 MW base, flex 400-600 MW
Inference serving (latency-sensitive): 150 MW base, flex 120-180 MW
Batch processing (deferrable): 100 MW base, flex 0-200 MW
Infrastructure overhead (cooling, networking, storage): 50 MW fixed

When grid conditions tighten (peak demand hours, renewable dip, grid operator curtailment request), the scheduler reduces power to deferrable workloads first, then throttles training jobs by reducing GPU clock speeds or pausing them at their next checkpoint. Inference serving is protected because latency SLAs can’t flex.
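
To make the curtailment order concrete, here is a minimal sketch using the budgets listed above. It is not how any production scheduler works: the Workload class, the shed_order field, and the curtail function are names I made up, and it only models shedding down to each workload's floor, not flexing back up.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    base_mw: float
    min_mw: float
    shed_order: int  # lower number = curtailed first

# Budgets from the example above; shed_order encodes the stated policy.
WORKLOADS = [
    Workload("batch", 100, 0, shed_order=0),
    Workload("training", 500, 400, shed_order=1),
    Workload("inference", 150, 120, shed_order=2),
    Workload("overhead", 50, 50, shed_order=3),  # fixed, cannot flex
]

def curtail(workloads, limit_mw):
    """Reduce allocations in shed order until the total fits under limit_mw."""
    alloc = {w.name: w.base_mw for w in workloads}
    excess = sum(alloc.values()) - limit_mw
    for w in sorted(workloads, key=lambda w: w.shed_order):
        if excess <= 0:
            break
        cut = min(excess, w.base_mw - w.min_mw)
        alloc[w.name] -= cut
        excess -= cut
    if excess > 0:
        raise ValueError(f"cannot fit under {limit_mw} MW; short by {excess} MW")
    return alloc

# Grid operator asks the facility to drop from 800 MW to 650 MW:
print(curtail(WORKLOADS, 650))
# {'batch': 0, 'training': 450, 'inference': 150, 'overhead': 50}
```

Batch absorbs the first 100 MW, training gives up the remaining 50 MW, and inference is untouched. Below 570 MW (the sum of the floors) the function raises, which is the point where a real system would have to start violating SLAs.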

Carbon-intensity-aware scheduling: Building on Elena’s point about actual vs. claimed renewable usage, Google’s internal carbon-intelligent computing platform shifts deferrable workloads to time windows and grid regions where carbon intensity is lowest. At gigawatt scale, this optimization becomes massively impactful. Moving a 100 MW batch job from a 400 g CO2/kWh time window to a 100 g CO2/kWh window avoids 300 g on each of 100,000 kWh, which is 30 tons of CO2 per hour of runtime. Over a year, that’s material.
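
A rough sketch of the window-selection step, assuming you already have an hourly carbon-intensity forecast for one grid region. The forecast values, job size, and function names are all hypothetical, and this says nothing about how Google's platform is actually implemented:

```python
def best_window(forecast_g_per_kwh, duration_h):
    """Return (start_hour, mean_intensity) of the lowest-carbon contiguous window."""
    n = len(forecast_g_per_kwh)
    if not 0 < duration_h <= n:
        raise ValueError("duration must fit inside the forecast")
    window_sum = sum(forecast_g_per_kwh[:duration_h])
    best_start, best_sum = 0, window_sum
    for start in range(1, n - duration_h + 1):
        # Slide the window one hour: add the newest hour, drop the oldest.
        window_sum += forecast_g_per_kwh[start + duration_h - 1]
        window_sum -= forecast_g_per_kwh[start - 1]
        if window_sum < best_sum:
            best_start, best_sum = start, window_sum
    return best_start, best_sum / duration_h

def emissions_tonnes(load_mw, hours, intensity_g_per_kwh):
    """Metric tons of CO2 for a steady load at a given average intensity."""
    kwh = load_mw * 1_000 * hours
    return kwh * intensity_g_per_kwh / 1e6  # grams to metric tons

# Hypothetical 12-hour forecast in g CO2/kWh.
forecast = [420, 410, 380, 300, 180, 120, 110, 150, 260, 350, 400, 430]
start, mean_g = best_window(forecast, duration_h=4)

# A 50 MW job that needs 4 hours: run immediately vs. wait for the best window.
run_now = emissions_tonnes(50, 4, sum(forecast[:4]) / 4)
deferred = emissions_tonnes(50, 4, mean_g)
print(start, mean_g, round(run_now - deferred, 1))  # 4 140.0 47.5
```

Extending this to multiple regions is a matter of running the same search per region and taking the minimum, subject to whatever data-residency and capacity constraints apply.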

Thermal-aware placement: Cooling is 30-40% of total data center power at scale. Workload placement that considers thermal density (spreading hot workloads across cooling zones rather than concentrating them) can reduce cooling power by 15-20%. At 1 GW, cooling draws 300-400 MW, so that’s 45-80 MW saved, which is the power consumption of a small data center by itself.

Workload Migration Between Power Availability Zones

This is the next frontier. As companies operate multiple gigawatt-scale facilities in different grid regions, workload migration based on power availability becomes a core infrastructure capability.

The architecture looks like this:

  1. Power signal ingestion: Real-time feeds from grid operators (ERCOT, PJM, MISO, CAISO), internal facility power management systems, and renewable generation forecasts
  2. Workload mobility layer: Training checkpoints stored in distributed storage (S3/GCS) that’s accessible from multiple facilities. When power signals indicate a favorable shift, the scheduler can resume training from the latest checkpoint at a different facility
  3. Network fabric: High-bandwidth interconnects between facilities (100-400 Gbps dedicated links) that enable checkpoint transfer in minutes rather than hours
  4. State management: For inference workloads, model weights cached at multiple facilities with request routing that considers power availability alongside latency

The technical challenge is checkpoint size. A frontier model training run might have checkpoints of 5-20 TB. Transferring that over even a 400 Gbps link takes 2-7 minutes. For training jobs, that’s acceptable — you lose a few minutes of training time. For inference, you need the model pre-cached at all potential serving locations.
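
The transfer-time estimate is easy to sanity-check. A minimal sketch, assuming decimal terabytes and a link that sustains its full line rate (real transfers lose some throughput to protocol overhead, which the hypothetical efficiency parameter lets you model):

```python
def transfer_seconds(size_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Seconds to move size_tb (decimal terabytes) over a link of link_gbps."""
    bits = size_tb * 1e12 * 8
    return bits / (link_gbps * 1e9 * efficiency)

for size_tb in (5, 20):
    for gbps in (100, 400):
        minutes = transfer_seconds(size_tb, gbps) / 60
        print(f"{size_tb:>2} TB over {gbps} Gbps: {minutes:.1f} min")
#  5 TB over 100 Gbps: 6.7 min
#  5 TB over 400 Gbps: 1.7 min
# 20 TB over 100 Gbps: 26.7 min
# 20 TB over 400 Gbps: 6.7 min
```

At the low end of the 100-400 Gbps range a 20 TB checkpoint takes closer to half an hour, which matters if the power signal you're reacting to only lasts an hour or two.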

Infrastructure Engineering Is Becoming Energy Engineering

I’ve been saying this for the past year, and it keeps getting more true: the skillset required for AI infrastructure engineering is converging with energy systems engineering.

My team’s hiring profile has changed dramatically. Two years ago, we hired distributed systems engineers and ML infrastructure specialists. Now we’re also hiring:

  • Power systems engineers who understand grid interconnection, demand response, and power quality management
  • Energy market analysts who can model electricity pricing, PPA structures, and carbon credit markets
  • Thermal engineers who can optimize liquid cooling systems, waste heat recovery, and climate-responsive facility design
  • Control systems engineers who can build real-time power management systems that respond to grid signals in milliseconds

The job title might still say “infrastructure engineer,” but the actual work increasingly involves energy optimization, thermal dynamics, and power systems management. If you’re an infrastructure engineer who hasn’t started learning about power systems, I’d strongly recommend it — this is where the field is heading.

The Open Questions

A few things I’m still trying to figure out:

  1. Stranded compute risk: What happens when a facility’s power contract expires or grid conditions change? A gigawatt facility is a $10B+ investment with a 20-year lifespan. If the power economics shift unfavorably in year 8, you’ve got a massive stranded asset. How do you hedge this?

  2. Multi-facility training: Can we efficiently train a single model across multiple geographically distributed facilities, each with its own power constraints? The communication overhead of distributed training already dominates at cluster scale — adding inter-facility latency makes this even harder. But the power availability benefits could be enormous.

  3. Edge inference and power: As inference moves to the edge (on-device, on-premise), does the gigawatt centralization trend for training create a corresponding decentralization trend for serving? The power dynamics for inference are fundamentally different from training.

The era of treating power as an infinite, invisible resource is over. Infrastructure engineers who understand this transition will define the next decade of system design.

This thread is laying out the macro picture beautifully — David on infrastructure buildout, Elena on climate accountability, Alex on engineering implications. Let me bring this down to the level where most of us actually operate: companies that consume cloud compute but don’t build our own data centers. Because for us, the gigawatt data center era means something very specific — cloud capacity scarcity is here, and it’s getting worse.

The Capacity Crunch Is Already Real

My company runs a mid-stage SaaS platform with significant ML workloads — recommendation systems, NLP pipelines, fraud detection models. We’re not training frontier models, but we consume meaningful GPU compute for inference and fine-tuning. Here’s what we’ve experienced in the past 12 months:

  • AWS us-east-1 (Virginia): P5 instances (H100 GPUs) are frequently unavailable on-demand. Reserved instance pricing has increased 18% year-over-year. AWS capacity advisors have told us directly that they’re power-constrained in Northern Virginia and can’t guarantee scaling beyond our current reservation.

  • GCP us-central1 (Iowa): A100 and H100 availability has been inconsistent. We’ve had workloads queued for 4-6 hours waiting for GPU instances during peak periods. Google’s capacity team confirmed this is partly a power allocation issue at their Iowa facilities.

  • Azure East US: We evaluated Azure as a secondary provider. The GPU instance waitlist for enterprise customers was 8-12 weeks. For a company trying to ship ML features on a quarterly roadmap, that’s a non-starter.

This isn’t a temporary shortage. David’s analysis of grid constraints makes it clear: the power bottleneck means cloud providers physically cannot build capacity fast enough to meet demand in their most popular regions.

Power Availability Is Now a Cloud Selection Criterion

A year ago, our multi-cloud strategy was driven by three factors: pricing, service capabilities, and geographic proximity to our users. Power availability wasn’t even on the list.

Today, it’s our fourth criterion, and it’s moving up. Here’s how we think about it:

Tier 1 regions (power-abundant, new capacity coming online):

  • AWS us-south-1 (Texas) — ERCOT grid, new data center builds
  • GCP us-south1 (Dallas) — similar ERCOT advantages
  • AWS/GCP facilities near new gigawatt buildouts where surplus capacity may be available

Tier 2 regions (adequate power, but constraints emerging):

  • AWS us-west-2 (Oregon) — hydroelectric, but increasingly contested
  • GCP us-central1 (Iowa) — wind power, but demand outpacing supply

Tier 3 regions (power-constrained, avoid for GPU workloads):

  • AWS us-east-1 (Virginia) — Dominion Energy capacity maxed
  • Azure East US — similar constraints

We’ve started actively migrating GPU workloads away from Tier 3 regions. The latency penalty of serving from Texas instead of Virginia is 15-25ms for our East Coast users — not ideal, but acceptable for most ML inference workloads. The alternative is not having compute at all during peak demand.

Practical Advice for CTOs Navigating This

If you’re running a company that depends on cloud GPU compute, here’s what I’d recommend based on what we’ve learned:

1. Lock in Reserved Capacity Now

The GPU reserved instance market is tightening. If you know your baseline GPU needs for the next 12-24 months, commit now. We locked in 3-year reservations for our core inference fleet, and six months later that pricing is 30% better than what’s available today. This is a supply-constrained market: early commitment gets better terms.

2. Build Region Flexibility Into Your Architecture

If your ML serving infrastructure is hardcoded to a single region, you’re exposed. We spent three months building a multi-region inference platform that routes requests to whichever region has available capacity. The cost was significant (~$400K in engineering time), but it has already paid for itself in avoided capacity shortfalls.
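
The core routing decision is simpler than it sounds. Here is a stripped-down sketch of the idea, not our actual platform: the Region fields, the region labels, and the numbers are all illustrative, and a real router would pull free capacity and latency from live telemetry.

```python
from dataclasses import dataclass

@dataclass
class Region:
    name: str          # hypothetical labels, not real cloud region IDs
    free_gpus: int     # capacity currently obtainable in that region
    latency_ms: float  # round-trip latency from the user population

def pick_region(regions, gpus_needed, latency_budget_ms):
    """Lowest-latency region that has capacity and meets the latency budget."""
    eligible = [r for r in regions
                if r.free_gpus >= gpus_needed and r.latency_ms <= latency_budget_ms]
    if not eligible:
        return None  # caller decides: queue, shed load, or relax the budget
    return min(eligible, key=lambda r: r.latency_ms)

regions = [
    Region("east", free_gpus=0, latency_ms=8.0),     # closest, but no capacity
    Region("south", free_gpus=64, latency_ms=30.0),
    Region("west", free_gpus=128, latency_ms=70.0),
]
choice = pick_region(regions, gpus_needed=16, latency_budget_ms=50.0)
print(choice.name if choice else "no capacity within budget")  # south
```

The hard part in practice is everything around this function: keeping model weights warm in every candidate region and getting trustworthy capacity signals from the provider.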

3. Evaluate Specialized GPU Cloud Providers

CoreWeave, Lambda, Together AI, and other GPU-focused cloud providers have been more aggressive about securing power in new locations. Their pricing isn’t always better than hyperscalers, but their GPU availability is often better because they’re building in power-abundant regions that AWS and GCP haven’t prioritized yet. We now run 20% of our fine-tuning workloads on CoreWeave and the availability has been markedly better.

4. Optimize Before You Scale

Before fighting for more GPU capacity, make sure you’re using what you have efficiently. We found that:

  • 35% of our GPU inference fleet was over-provisioned (models running on H100s that could run on A10Gs)
  • Batching optimization reduced our inference GPU needs by 22%
  • Model distillation for our most common inference paths reduced GPU requirements by 40% with minimal quality impact

Total GPU compute reduction after optimization: ~45%. That’s 45% fewer GPU instances we need to fight for in a constrained market.

5. Plan for Power-Driven Pricing

Cloud GPU pricing is increasingly correlated with energy costs. ERCOT energy prices in Texas can vary 10x between off-peak and peak hours. Cloud providers are starting to reflect this in spot pricing. If your workloads can tolerate scheduling flexibility (training, batch inference, fine-tuning), you can save 40-60% by running during off-peak energy windows. We’ve built this into our batch processing pipeline and the savings are substantial.
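
For batch work that can pause and resume, the scheduling logic can be as simple as picking the cheapest hours in a price forecast. A minimal sketch with made-up prices; it assumes hourly prices are known in advance and that pausing is free, neither of which is fully true, and actual spot GPU prices won't track energy prices this directly:

```python
def cheapest_hours(prices, hours_needed):
    """Indices of the cheapest hours, in chronological order."""
    if hours_needed > len(prices):
        raise ValueError("not enough hours in the price forecast")
    ranked = sorted(range(len(prices)), key=lambda h: prices[h])
    return sorted(ranked[:hours_needed])

def savings_fraction(prices, hours_needed, baseline_start=0):
    """Cost saved vs. running hours_needed consecutive hours from baseline_start."""
    baseline = sum(prices[baseline_start:baseline_start + hours_needed])
    flexible = sum(prices[h] for h in cheapest_hours(prices, hours_needed))
    return 1 - flexible / baseline

# Hypothetical hourly prices over an 8-hour horizon.
prices = [90, 120, 150, 60, 30, 25, 40, 110]
print(cheapest_hours(prices, 3))               # [4, 5, 6]
print(f"{savings_fraction(prices, 3):.0%}")    # 74%
```

The toy numbers exaggerate the effect. Checkpoint overhead and deadline constraints eat into the theoretical savings, so your own figure will depend on how often the job has to stop and restart.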

The Bigger Picture

Alex is right that infrastructure engineering is becoming energy engineering. For those of us on the cloud consumer side, cloud strategy is becoming energy strategy. The CTOs who understand power constraints, grid geography, and energy economics will make better infrastructure decisions than those who just compare cloud service feature matrices.

The gigawatt data centers will eventually ease the supply constraint — but “eventually” is 3-5 years. In the meantime, cloud capacity is a strategic asset that requires active management, not a utility you can take for granted.