Our Ingress NGINX Migration Plan: Gateway API vs Alternative Controllers

With less than 2 months until Ingress NGINX loses security support, I wanted to share our migration planning process. This is a significant undertaking for any organization with substantial Kubernetes infrastructure.

Our Starting Point

Current State:
- 15 Kubernetes clusters (6 production, 9 dev/staging)
- 340+ Ingress resources
- 8 teams affected
- Heavy use of NGINX-specific annotations
- Custom snippets for rate limiting and auth

This isn’t a weekend project.

Evaluation Criteria

We evaluated alternatives against these criteria:

Criterion              Weight   Why It Matters
Long-term viability    30%      Don’t want another migration in 2 years
Feature parity         25%      Need to support current use cases
Migration complexity   20%      Engineering time cost
Community/Support      15%      Help when things break
Performance            10%      Traffic-critical workloads

The Candidates

1. Gateway API (with Envoy-based implementations)

Pros:

  • Kubernetes-native, official successor to Ingress
  • Expressive routing model (HTTPRoute, GRPCRoute)
  • Multiple implementations available (Envoy Gateway, Istio, Cilium)
  • Active development, growing ecosystem

Cons:

  • Newer, less battle-tested at scale
  • Different mental model from Ingress
  • Some NGINX-specific features need workarounds

Our Assessment: Best long-term choice, but steeper learning curve.

2. Kong Ingress Controller

Pros:

  • Feature-rich, enterprise-ready
  • Plugin ecosystem for auth, rate limiting, etc.
  • Good migration tooling from NGINX
  • Commercial support available

Cons:

  • Vendor lock-in concerns
  • Cost for enterprise features
  • Heavier resource footprint

Our Assessment: Good option if you need enterprise support, but introduces new vendor dependency.

3. Traefik

Pros:

  • Auto-discovery of services
  • Built-in Let’s Encrypt integration
  • Good documentation
  • Active open source community

Cons:

  • Different configuration model
  • Some performance concerns at high traffic
  • Less enterprise-focused

Our Assessment: Great for smaller deployments, but open questions remain at our scale.

4. F5/NGINX Ingress Controller (Commercial)

Pros:

  • Familiar NGINX configuration model
  • Commercial support from F5
  • Easier migration from community NGINX

Cons:

  • Expensive licensing
  • Still NGINX (similar architectural concerns)
  • Vendor lock-in

Our Assessment: Path of least resistance, but doesn’t solve underlying issues.

Our Decision: Gateway API

We chose Gateway API for these reasons:

  1. Future-proof: It’s the Kubernetes-endorsed direction
  2. Implementation choice: Multiple backends (we chose Envoy Gateway)
  3. No vendor lock-in: Standard API means we can switch implementations
  4. Better model: Role-based configuration aligns with our team structure

Migration Approach

Phase 1: Parallel Deployment (Weeks 1-2)

# Deploy Gateway API alongside existing Ingress
# Both route to same backends
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: api-gateway
  namespace: gateway-system
spec:
  gatewayClassName: envoy-gateway
  listeners:
    - name: http
      port: 80
      protocol: HTTP
    - name: https
      port: 443
      protocol: HTTPS
      tls:
        mode: Terminate
        certificateRefs:
          - name: wildcard-cert

Phase 2: Traffic Splitting (Weeks 3-4)

  • Route 10% of traffic through Gateway API (see the sketch after this list)
  • Monitor for errors, latency differences
  • Gradually increase to 50%, then 90%
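
The 10% split between the old Ingress stack and the new Gateway typically happens a layer above the cluster (weighted DNS records or the external load balancer), which YAML can’t capture. What Gateway API does give you natively is weighted backendRefs, which supports the same gradual-rollout pattern once traffic reaches the Gateway. A minimal sketch (the canary Service name is hypothetical):

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: canary-split
spec:
  parentRefs:
  - name: api-gateway
    namespace: gateway-system
  hostnames:
  - api.example.com
  rules:
  - backendRefs:
    - name: api-v1          # stable backend receives ~90% of requests
      port: 80
      weight: 90
    - name: api-v1-canary   # hypothetical canary Service receives ~10%
      port: 80
      weight: 10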

Phase 3: Full Cutover (Weeks 5-6)

  • Route 100% through Gateway API
  • Keep Ingress NGINX as fallback (1 week)
  • Remove Ingress NGINX completely

Configuration Translation Examples

Rate Limiting

Before (NGINX annotation):

annotations:
  nginx.ingress.kubernetes.io/limit-rps: "10"
  nginx.ingress.kubernetes.io/limit-connections: "5"

After (Envoy Gateway BackendTrafficPolicy - rate limiting is implementation-specific):

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: rate-limit-policy
spec:
  # Newer Envoy Gateway releases use a plural targetRefs list instead
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: my-route
  rateLimit:
    type: Local
    local:
      rules:
      - limit:
          requests: 10   # equivalent of limit-rps: "10"
          unit: Second
  # Note: limit-connections has no direct equivalent in a local rate
  # limit; connection caps are configured separately

Path-Based Routing

Before (Ingress):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-routes
spec:
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /v1
        pathType: Prefix
        backend:
          service:
            name: api-v1
            port:
              number: 80

After (HTTPRoute):

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: api-routes
spec:
  parentRefs:
  - name: api-gateway
    namespace: gateway-system   # the Gateway lives in gateway-system
  hostnames:
  - api.example.com
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /v1
    backendRefs:
    - name: api-v1
      port: 80

Lessons Learned So Far

  1. Audit annotations first: We found 47 unique NGINX annotations in use. Many were obsolete.
  2. Start with simple services: Don’t migrate your most complex routing first.
  3. Monitoring is critical: We added extensive logging during parallel run.
  4. Team training matters: Gateway API concepts are different enough to require training.

Questions for the Community

  1. Anyone else migrating to Gateway API? What implementation did you choose?
  2. How are you handling custom NGINX configurations (snippets, Lua)?
  3. What’s your rollback strategy if issues are found post-migration?

Would love to hear from others going through this process.

Excellent breakdown, Luis. Let me add some technical depth on the alternatives.

Technical Comparison Matrix

Controller      Gateway API   Auto-TLS   Rate Limiting   WebSocket   Complexity
Kong            Full          Yes        Built-in        Yes         Medium
Traefik         Full          Yes        Plugin          Yes         Low
HAProxy         Partial       Manual     Built-in        Yes         High
Envoy Gateway   Native        Yes        Built-in        Yes         Medium
Cilium          Native        Yes        eBPF-based      Yes         High

Migration Gotchas We Hit

1. Annotation Translation
Ingress NGINX annotations don’t map 1:1 to Gateway API. We built a custom annotation converter, but some features required HTTPRoute filters instead.
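
As a concrete example of the filter-based replacements: the common nginx.ingress.kubernetes.io/rewrite-target annotation maps to a URLRewrite filter on the route rule. A minimal sketch (route and service names are illustrative):

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: rewrite-example
spec:
  parentRefs:
  - name: api-gateway
    namespace: gateway-system
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /v1
    filters:
    - type: URLRewrite
      urlRewrite:
        path:
          type: ReplacePrefixMatch
          replacePrefixMatch: /   # strips the /v1 prefix, like rewrite-target: /
    backendRefs:
    - name: api-v1
      port: 80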

2. TLS Configuration
Gateway API separates Gateway (infrastructure) from HTTPRoute (application). This is cleaner but requires coordination between platform and app teams.
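
A minimal sketch of that separation, assuming the platform team owns gateway-system and grants attachment via a namespace label (the label name is our convention, not part of the spec):

# Platform-owned: the Gateway decides which namespaces may attach routes
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: shared-gateway
  namespace: gateway-system
spec:
  gatewayClassName: envoy-gateway
  listeners:
  - name: https
    port: 443
    protocol: HTTPS
    tls:
      mode: Terminate
      certificateRefs:
      - name: wildcard-cert
    allowedRoutes:
      namespaces:
        from: Selector
        selector:
          matchLabels:
            gateway-access: "true"   # hypothetical label granted by platform team
---
# App-owned: the HTTPRoute attaches from the team's own namespace
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: app-route
  namespace: team-a   # hypothetical app namespace labeled gateway-access=true
spec:
  parentRefs:
  - name: shared-gateway
    namespace: gateway-system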

3. Performance Differences
We benchmarked at production load:

  • Kong: 2% latency increase, but better under connection storms
  • Traefik: Nearly identical, slightly higher memory
  • Envoy Gateway: Best raw performance, steeper learning curve

4. The Snippets Problem
If you’re using custom NGINX snippets (many teams are), you’ll need to find equivalent configurations. Some simply don’t exist - we had to rewrite 15% of our edge logic.
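
For the snippets that do have equivalents, header manipulation translates most cleanly. A sketch of a security-headers snippet (add_header directives) rebuilt as a ResponseHeaderModifier filter:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: security-headers
spec:
  parentRefs:
  - name: api-gateway
    namespace: gateway-system
  rules:
  - filters:
    - type: ResponseHeaderModifier
      responseHeaderModifier:
        set:
        - name: X-Frame-Options
          value: DENY
        - name: Strict-Transport-Security
          value: max-age=31536000
    backendRefs:
    - name: api-v1
      port: 80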

My Recommendation

For most teams: Start with Traefik or Kong. They’re production-proven with huge communities.

For platform teams with Envoy experience: Envoy Gateway is the most future-proof choice and aligns with Istio/service mesh investments.

For security-conscious orgs: Cilium with eBPF gives you network policy integration that others can’t match.

The key insight: Gateway API is the abstraction layer, not the implementation. Choose your controller based on your operational capabilities, not feature checklists.

Critical security considerations that every team needs to address during migration:

The CVE-2025-1974 Wake-Up Call

The “IngressNightmare” vulnerability (CVSS 9.8) should inform your migration strategy. Key lessons:

  1. Snippets were always a security anti-pattern - Custom NGINX configurations bypassed validation and created RCE vectors (see the policy sketch after this list)
  2. 41% exposure - This wasn’t a niche problem; it affected roughly two in five internet-facing Kubernetes clusters
  3. Annotation injection - Attackers could craft malicious annotations to execute arbitrary code
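
One way to enforce lesson 1 during the transition window is a ValidatingAdmissionPolicy that rejects snippet annotations outright. A sketch, assuming Kubernetes 1.30+ where the API is GA (the policy names are ours; the check relies on the NGINX snippet annotations all ending in -snippet):

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: block-nginx-snippets
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: ["networking.k8s.io"]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["ingresses"]
  validations:
  - expression: >-
      !has(object.metadata.annotations) ||
      !object.metadata.annotations.exists(k, k.endsWith('-snippet'))
    message: "NGINX snippet annotations are blocked (see CVE-2025-1974)"
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: block-nginx-snippets-binding
spec:
  policyName: block-nginx-snippets
  validationActions: ["Deny"]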

Migration Security Checklist

Before Migration

  • Audit existing snippet usage - document every custom NGINX config
  • Identify annotation-based configurations that bypass standard controls
  • Review NetworkPolicies - they may be relying on ingress-nginx pod labels (see the sketch after this list)
  • Check for rate limiting, WAF rules, and custom security headers
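
On the NetworkPolicy point above: policies that admit traffic by matching the old controller’s pod labels silently stop matching after cutover. A sketch of the selector that needs updating (the replacement labels depend on your new controller; verify with kubectl get pods --show-labels):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-gateway
  namespace: team-a   # hypothetical app namespace
spec:
  podSelector:
    matchLabels:
      app: api-v1
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: gateway-system
      podSelector:
        matchLabels:
          # Previously matched app.kubernetes.io/name: ingress-nginx;
          # this value is an assumption - use your controller's actual labels
          app.kubernetes.io/name: envoy
    ports:
    - protocol: TCP
      port: 8080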

During Migration

  • Never run dual controllers without traffic isolation
  • Implement Gateway API TLS termination with separate cert management
  • Validate RBAC for new controller - avoid cluster-admin shortcuts
  • Test mTLS configuration if using service mesh integration

After Migration

  • Remove old Ingress NGINX completely - don’t leave it dormant
  • Update vulnerability scanning to cover new controller
  • Verify security headers are preserved in Gateway API configuration
  • Document the new trust boundary architecture

The Security Advantage of Gateway API

Gateway API’s separation of concerns is actually a security win:

  • GatewayClass: Cluster admin controls infrastructure (no app dev access)
  • Gateway: Platform team manages TLS, ports, protocols
  • HTTPRoute: App teams only control routing, not infrastructure

This RBAC-friendly design was a direct response to the footguns in Ingress. The maintainer burnout didn’t just affect features - it affected the security review process. Fewer maintainers = slower CVE response = longer exposure windows.
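
A sketch of what that separation looks like in RBAC terms: app teams get a namespaced Role over routes only, while Gateways and GatewayClasses stay with the platform team (the role name and namespace are illustrative):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: route-editor
  namespace: team-a   # app team namespace
rules:
- apiGroups: ["gateway.networking.k8s.io"]
  resources: ["httproutes", "grpcroutes"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
# Deliberately no access to gateways or gatewayclasses - those remain
# with cluster admins and the platform team.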

Don’t rush this migration, but don’t delay it either. March 2026 is a hard deadline, and running unpatched Ingress NGINX after that is a compliance and security liability.

Important perspective from the product side: this migration has customer-facing implications that need to be planned for.

Customer Impact Assessment

Before we dive into the technical migration, product teams need to answer these questions:

1. What’s Our Exposure?

  • How many customer-facing services route through Ingress NGINX?
  • What’s our SLA obligation during the migration window?
  • Do we have customers with contractual uptime guarantees that need notification?

2. Communication Plan

Depending on your customer base:

  • Enterprise B2B: Proactive communication about planned maintenance windows
  • Consumer SaaS: Status page updates during cutover
  • API products: API version negotiation, deprecation notices

3. Feature Parity Concerns

Some things that might affect product behavior:

  • Rate limiting changes: Different controllers implement rate limiting differently - validate that your tier limits work the same way
  • Timeout behaviors: Default timeouts vary - long-running operations (file uploads, report generation) might break (see the sketch after this list)
  • WebSocket handling: Real-time features need explicit testing
  • Path rewriting: URL patterns customers depend on must work identically
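
On the timeout point: Gateway API exposes per-rule timeouts on HTTPRoute (standard channel since v1.1), which is where defaults that differ from NGINX’s proxy-read-timeout must be made explicit. A sketch for a hypothetical long-running endpoint:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: report-generation
spec:
  parentRefs:
  - name: api-gateway
    namespace: gateway-system
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /reports      # hypothetical slow endpoint
    timeouts:
      request: 300s          # end-to-end budget for the whole request
      backendRequest: 300s   # per-attempt timeout to the backend
    backendRefs:
    - name: reports-svc      # hypothetical service
      port: 80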

The Migration Window Trade-off

Luis’s phased approach is solid, but product needs to weigh in on timing:

Option A: Gradual migration over 4-6 months

  • Pro: Lower risk per change
  • Con: Longer period of dual-stack complexity
  • Best for: Products with high SLA sensitivity

Option B: Aggressive 6-week sprint

  • Pro: Clean break, faster debt resolution
  • Con: Higher per-change risk
  • Best for: Products with strong testing infrastructure

My Recommendation

Treat this as a product reliability initiative, not just infrastructure work. That means:

  1. Product managers should own the customer communication plan
  2. QA should sign off on feature parity testing
  3. Support teams need runbooks for new error patterns
  4. Success metrics should include customer-reported issues, not just internal SLIs

The worst outcome is a technically successful migration that creates customer confusion or silent feature degradation.