Web App Delivery Optimization

Two golden rules: minimize (1) latency and (2) payload.

To Minimize Latency…

  • Reduce DNS lookups

    • Use a fast DNS provider. Average resolution time: Cloudflare < DNS Made Easy < AWS Route 53 < GoDaddy < Namecheap. NOTE: results vary by region.
    • Cache DNS records. The TTL is a tradeoff between performance and up-to-dateness.
    • Reduce the number of third-party domains, or use services with fast DNS (this conflicts with the domain sharding optimization for HTTP 1.x).
    • ==DNS prefetching==: <link rel="dns-prefetch" href="//www.example.com/">
  • reuse TCP connections.

  • minimize number of HTTP redirects

  • use a CDN

    • E.g. Netflix develops its own hardware and cooperates with local ISPs to serve content from its CDN
  • eliminate unnecessary resources

  • cache resources on the client

    1. HTTP cache headers
      • cache-control for max-age
        • Note, for JS files: a simple way to ensure the browser picks up changed files is to use output.filename substitutions with hashes (see Webpack Caching).
      • expires
        • If both Expires and max-age are set, max-age takes precedence.
    2. Last-Modified and ETag headers, to validate whether the resource has been updated since we last loaded it
      • time-based Last-Modified response header (not often used, because of nginx and microservices)
      • content-based ETag (Entity Tag)
        • This tag is useful when the last modified date is difficult to determine.
        • done by hashing
    3. a common mistake is to set only one of the two above
  • compress assets during transfer

    • use JPEG or WebP instead of PNG
    • HTTP/2 compresses headers automatically (HPACK)
    • enable gzip in nginx
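To illustrate why the notes recommend setting both a freshness header and a validator, here is a minimal sketch of a handler that sends `Cache-Control: max-age` together with a content-hash ETag and answers a matching `If-None-Match` with 304. The function names (`make_etag`, `respond`) and the truncated SHA-256 tag are assumptions made for this example, and the comparison is simplified: real servers also handle weak validators and comma-separated tag lists.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Content-based ETag: a quoted hash of the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, request_headers: dict, max_age: int = 3600):
    """Return (status, headers, payload) for a GET of `body`.

    Both Cache-Control and ETag are set, so the client can reuse its
    cached copy while it is fresh and revalidate cheaply afterwards.
    """
    etag = make_etag(body)
    headers = {"Cache-Control": f"max-age={max_age}", "ETag": etag}
    # Simplified validation: exact match against a single tag only.
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, b""  # Not Modified: the body is not resent
    return 200, headers, body


first = respond(b"console.log('v1')", {})
again = respond(b"console.log('v1')", {"If-None-Match": first[1]["ETag"]})
changed = respond(b"console.log('v2')", {"If-None-Match": first[1]["ETag"]})
print(first[0], again[0], changed[0])  # 200 304 200
```

With only `max-age`, an expired resource would be downloaded in full even if unchanged; with only an ETag, every use would cost a round trip to revalidate.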

To minimize Payload…

  • eliminate unnecessary request bytes

    • especially for cookies
      • the HTTP standard does not specify a size limit on headers or cookies, but browsers and servers often enforce …
        • 4 KB limit on cookies
        • 8 KB to 16 KB limit on headers
      • cookies are attached to every request
  • parallelize request and response processing

    • while the browser is blocked on a resource, the preload scanner looks ahead and dispatches downloads in advance: ~20% improvement
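Because cookies ride along on every request, their cost multiplies with the number of requests per page. The sketch below estimates that overhead for uncompressed HTTP/1.x-style headers; the cookie names, values, and the figure of 80 requests per page are made up for illustration.

```python
def cookie_header_bytes(cookies: dict) -> int:
    """Size in bytes of the `Cookie:` request header line for these cookies."""
    value = "; ".join(f"{name}={val}" for name, val in cookies.items())
    return len(f"Cookie: {value}\r\n".encode("utf-8"))


def upload_overhead_bytes(cookies: dict, requests_per_page: int) -> int:
    """Total request bytes spent on cookies when every request carries them."""
    return cookie_header_bytes(cookies) * requests_per_page


# Hypothetical cookies: a 64-character session id and 200 characters of prefs.
cookies = {"session": "a" * 64, "prefs": "b" * 200}
per_request = cookie_header_bytes(cookies)
print(per_request, upload_overhead_bytes(cookies, 80))  # 290 23200
```

Under these assumptions a 290-byte header grows to roughly 23 KB of upstream traffic per page load, which is why static assets are often served from a cookie-free origin.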

Applying protocol-specific optimizations

  • HTTP 1.x

    • use HTTP keepalive and HTTP pipelining: dispatch multiple requests, in parallel, over the same connection, without waiting for each response in serial fashion.
    • browsers can only open a limited number of connections to a particular domain, so …
      • domain sharding = more origins * 6 connections per origin (the extra DNS lookups may add latency)
      • bundle resources to reduce HTTP requests
      • inline small resources
  • HTTP 2.x

    • With the binary framing layer introduced, we get one connection per origin with multiplexing, stream prioritization, flow control, and server push, so remove the 1.x optimizations…
      • remove unnecessary concatenation and image splitting
      • use server push: previously inlined resources can be pushed and cached.
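The "more origins * 6 connections per origin" arithmetic can be made concrete with a toy model. It assumes each HTTP 1.x connection handles one request at a time, resources are spread evenly across origins, and every request takes one round; it ignores bandwidth, DNS lookups, and connection setup. The function names are hypothetical.

```python
import math


def http1_rounds(resources: int, origins: int = 1, per_origin: int = 6) -> int:
    """Sequential request rounds when each connection serves one request
    at a time and there are origins * per_origin connections in total."""
    return math.ceil(resources / (origins * per_origin))


def http2_rounds(resources: int) -> int:
    """With multiplexing, all requests share one connection in one round
    (ignoring bandwidth, prioritization, and flow-control limits)."""
    return 1 if resources > 0 else 0


print(http1_rounds(60))             # 10
print(http1_rounds(60, origins=3))  # 4
print(http2_rounds(60))             # 1
```

In this model sharding across three origins cuts 10 rounds to 4, while multiplexing removes the queueing entirely, which is the reason sharding and bundling stop paying off on HTTP 2.x.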

Tools

References