
API Gateway Deep Dive

An API gateway is the single entry point for all client requests into your microservices architecture. Instead of every client knowing the addresses and protocols of every backend service, clients talk to one endpoint — the gateway — which handles authentication, rate limiting, routing, transformation, and observability before forwarding the request to the right service.

Think of it as the front desk of a hotel: guests do not wander the building looking for housekeeping, room service, and the concierge. They talk to the front desk, which routes their requests.


What an API Gateway Does

Core Responsibilities

| Responsibility | What It Does | Why at the Gateway |
| --- | --- | --- |
| Routing | Forward requests to the correct backend service | Clients need a stable entry point |
| Authentication | Validate JWT tokens, API keys, OAuth tokens | Centralized auth avoids duplicating logic in every service |
| Rate limiting | Throttle requests per client/IP/API key | Protect backends from abuse and overload |
| TLS termination | Handle HTTPS, present certificates | Services communicate via plain HTTP internally |
| Request transformation | Header injection, body transformation, protocol translation | Adapt client format to backend format |
| Response transformation | Strip internal fields, aggregate responses | Return clean API responses |
| Load balancing | Distribute across service instances | Built-in or delegated to service mesh |
| Circuit breaking | Stop forwarding to failing services | Prevent cascade failures |
| Caching | Cache GET responses | Reduce backend load for repeated queries |
| Observability | Access logs, metrics, distributed tracing | Single point for request-level visibility |
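
To make the order of these responsibilities concrete, here is a minimal Python sketch of a gateway's request path: authenticate, rate limit, then route by longest path prefix. The names `ROUTES`, `API_KEYS`, and `Gateway` are hypothetical, and the rate limit is a plain counter with no time window, chosen only to keep the example short.

```python
from dataclasses import dataclass, field

# Hypothetical routing table: path prefix -> backend base URL.
ROUTES = {
    "/api/v1/orders": "http://orders:8080",
    "/api/v1/users": "http://users:8080",
}

# Hypothetical credential store: API key -> client id.
API_KEYS = {"key-abc": "client-1"}


@dataclass
class Gateway:
    limit_per_client: int = 2
    counts: dict = field(default_factory=dict)  # client id -> requests seen

    def route(self, path: str):
        """Return the backend for the longest matching prefix, or None."""
        matches = [p for p in ROUTES if path == p or path.startswith(p + "/")]
        if not matches:
            return None
        return ROUTES[max(matches, key=len)]

    def handle(self, path: str, api_key: str):
        """Return (status, detail) after auth, rate limiting, and routing."""
        client = API_KEYS.get(api_key)
        if client is None:
            return 401, "unknown API key"
        seen = self.counts.get(client, 0)
        if seen >= self.limit_per_client:
            return 429, "rate limit exceeded"
        self.counts[client] = seen + 1
        backend = self.route(path)
        if backend is None:
            return 404, "no route"
        return 200, backend + path
```

Rejecting unauthenticated and over-limit requests before routing means the cheap checks run first and backends never see traffic the gateway would have refused anyway.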

Architecture Patterns

Single Gateway (Monolithic)

The simplest approach: one gateway handles all traffic.

Pros: Simple, centralized policy, easy to reason about

Cons: Single point of failure, one team owns everything, all traffic through one hop

Backend for Frontend (BFF) Pattern

Each client type gets its own gateway that tailors the API to its needs.

Use when:

  • Mobile needs different data shapes than web
  • Partner API has different auth requirements
  • Different rate limits per client type
  • Teams want autonomy over their API surface
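
As a rough illustration of the first bullet, the sketch below shapes one backend record differently for web and mobile clients. The `PRODUCT` record and both functions are hypothetical. A real BFF would fetch the data from backend services rather than use a constant.

```python
# Hypothetical record as returned by an internal product service.
PRODUCT = {
    "id": 7,
    "name": "Desk lamp",
    "description": "Adjustable LED desk lamp with USB charging port.",
    "price_cents": 3499,
    "images": ["lamp-small.jpg", "lamp-large.jpg"],
    "warehouse_code": "WH-12",  # internal detail, never exposed to clients
}


def web_bff(product: dict) -> dict:
    """Web clients get the full public view."""
    return {
        "id": product["id"],
        "name": product["name"],
        "description": product["description"],
        "price_cents": product["price_cents"],
        "images": product["images"],
    }


def mobile_bff(product: dict) -> dict:
    """Mobile clients get a compact payload: one thumbnail, no long text."""
    return {
        "id": product["id"],
        "name": product["name"],
        "price_cents": product["price_cents"],
        "thumbnail": product["images"][0],
    }
```

Each BFF owns its response shape, so the mobile team can trim the payload without coordinating with the web team.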

Two-Tier Gateway

An edge gateway handles cross-cutting concerns, while per-service gateways handle service-specific logic.


Gateway Features Deep Dive

Authentication and Authorization

yaml
# Kong declarative config: JWT authentication
services:
  - name: orders-service
    url: http://orders:8080
    routes:
      - name: orders-route
        paths:
          - /api/v1/orders
        plugins:
          - name: jwt
            config:
              claims_to_verify:
                - exp
              header_names:
                - Authorization
          - name: acl
            config:
              allow:
                - orders-read
                - orders-write
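
For intuition about what a JWT check with `exp` verification involves, here is a minimal standard-library sketch for HS256 tokens. It is not Kong's implementation: the function names are hypothetical, only the shared-secret HS256 algorithm is handled, and a production gateway should rely on a maintained JWT library instead.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Build a compact HS256 JWT (used here only to produce test tokens)."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode("ascii")
    signature = hmac.new(secret, signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url_encode(signature)}"


def verify_hs256(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Return the claims if the signature and exp are valid, else raise ValueError."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(b64url_decode(header_b64))
        claims = json.loads(b64url_decode(payload_b64))
        signature = b64url_decode(signature_b64)
        if not isinstance(header, dict) or not isinstance(claims, dict):
            raise ValueError("header and payload must be JSON objects")
    except Exception as exc:
        raise ValueError("malformed token") from exc
    # Pin the algorithm instead of trusting whatever the token header claims.
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")
    now = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        raise ValueError("token expired or missing exp")
    return claims
```

The signature is checked before any claim is trusted, and the comparison uses `hmac.compare_digest` to avoid leaking timing information.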

Rate Limiting Strategies

| Strategy | How It Works | Trade-offs |
| --- | --- | --- |
| Fixed window | Count requests per time window (e.g., 100/minute) | Simple, but allows bursts at window boundaries |
| Sliding window log | Track timestamp of each request | Accurate, but memory-intensive |
| Sliding window counter | Weighted combination of current + previous window | Good balance of accuracy and efficiency |
| Token bucket | Tokens refill at fixed rate, each request costs a token | Allows controlled bursts |
| Leaky bucket | Requests queue and drain at fixed rate | Smooth output rate |

yaml
# Kong rate limiting plugin
plugins:
  - name: rate-limiting
    config:
      minute: 100              # 100 requests per minute
      hour: 5000               # 5000 requests per hour
      policy: redis             # Distributed counter in Redis
      redis_host: redis
      redis_port: 6379
      limit_by: credential     # Per API key
      hide_client_headers: false
      # Response headers:
      # X-RateLimit-Limit-Minute: 100
      # X-RateLimit-Remaining-Minute: 73
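
Two of the strategies from the table, token bucket and sliding window counter, can be sketched in a few lines of Python. This is an illustrative single-process version, not how Kong's plugin is implemented. The class names are hypothetical, and time is passed in explicitly so the behaviour is easy to follow.

```python
import time
from typing import Optional


class TokenBucket:
    """Token bucket: refill at `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate: float, capacity: float, now: Optional[float] = None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full, so an initial burst is allowed
        self.updated = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None, cost: float = 1.0) -> bool:
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = max(self.updated, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class SlidingWindowCounter:
    """Estimate the rolling count from the current and previous fixed windows."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self.index = 0      # index of the current fixed window
        self.current = 0    # requests admitted in the current window
        self.previous = 0   # requests admitted in the window before it

    def allow(self, now: float) -> bool:
        index = int(now // self.window)
        if index != self.index:
            # Roll forward; a gap of more than one window leaves nothing to carry over.
            self.previous = self.current if index == self.index + 1 else 0
            self.index, self.current = index, 0
        # Weight the previous window by how much of it still overlaps the rolling window.
        overlap = 1.0 - (now % self.window) / self.window
        estimate = self.previous * overlap + self.current
        if estimate < self.limit:
            self.current += 1
            return True
        return False
```

The token bucket admits a burst up to `capacity` and then settles at `rate`. The sliding window counter keeps only two integers per client, which is why it is cheaper than a full timestamp log.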

TIP

For distributed rate limiting (multiple gateway instances), use Redis or a shared counter. Local in-memory counters are fast but inaccurate when you have multiple gateway pods. See Rate Limiter design for the full system design.
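
The inaccuracy of local counters is easy to demonstrate. In the sketch below, a plain Python dict stands in for the counter store (a hypothetical stand-in for Redis), and three limiter instances model three gateway pods.

```python
class FixedWindowLimiter:
    """Fixed-window counter backed by a dict-like store."""

    def __init__(self, store: dict, limit: int):
        self.store = store
        self.limit = limit

    def allow(self, client: str, window: int) -> bool:
        key = (client, window)
        count = self.store.get(key, 0)
        if count >= self.limit:
            return False
        self.store[key] = count + 1
        return True


def admitted(limiters: list, requests: int) -> int:
    """Send `requests` round-robin across gateway instances and count admissions."""
    return sum(
        limiters[i % len(limiters)].allow("client-1", window=0)
        for i in range(requests)
    )


LIMIT = 100

# One private store per pod: each pod enforces the limit on its own.
local = [FixedWindowLimiter({}, LIMIT) for _ in range(3)]

# One store shared by all pods: the limit is enforced globally.
shared_store = {}
shared = [FixedWindowLimiter(shared_store, LIMIT) for _ in range(3)]

print(admitted(local, 1000))   # 300: three pods each admit 100
print(admitted(shared, 1000))  # 100
```

The read-then-write in `allow` is not atomic, which is acceptable in this single-threaded sketch. Against a real shared store the increment must be atomic, for example with Redis `INCR`.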

Request Transformation

Transform requests and responses at the gateway to decouple client and backend formats:

yaml
# Kong request transformer (entries use the "name:value" format)
plugins:
  - name: request-transformer
    config:
      add:
        headers:
          - "X-Request-Source:api-gateway"
          - "X-Forwarded-Proto:https"
        querystring:
          - "version:v2"
      remove:
        headers:
          - "X-Internal-Debug"
      rename:
        headers:
          - "X-Custom-Auth:Authorization"
        body:
          - "user_name:username"  # Rename field (replace.body would overwrite its value instead)

  - name: response-transformer
    config:
      remove:
        headers:
          - "X-Powered-By"
          - "Server"
        json:
          - "internal_id"         # Strip internal field from response
          - "debug_info"
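
The effect of these transformations can be sketched in plain Python. This is an illustration of the idea rather than Kong's behaviour: the function names are hypothetical, and header names are matched by exact case for brevity, whereas real HTTP headers are case-insensitive.

```python
def transform_request(headers: dict, body: dict) -> tuple:
    """Apply add, remove, and rename steps before forwarding to the backend."""
    headers = dict(headers)  # copy so the caller's dicts are left untouched
    body = dict(body)
    headers["X-Request-Source"] = "api-gateway"      # add a header
    headers.pop("X-Internal-Debug", None)            # remove a header
    if "X-Custom-Auth" in headers:                   # rename a header
        headers["Authorization"] = headers.pop("X-Custom-Auth")
    if "user_name" in body:                          # rename a body field
        body["username"] = body.pop("user_name")
    return headers, body


def transform_response(headers: dict, body: dict) -> tuple:
    """Strip implementation details before the response reaches the client."""
    hidden_headers = {"X-Powered-By", "Server"}
    hidden_fields = {"internal_id", "debug_info"}
    headers = {k: v for k, v in headers.items() if k not in hidden_headers}
    body = {k: v for k, v in body.items() if k not in hidden_fields}
    return headers, body
```

Keeping both directions as pure functions over headers and body makes the rules easy to test in isolation from the proxying code.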

Protocol Translation

The gateway can translate between protocols:

  • REST to gRPC — clients send JSON, gateway translates to protobuf
  • GraphQL to REST — gateway resolves GraphQL queries by calling multiple REST services
  • HTTP/1.1 to HTTP/2 — clients on old HTTP, backends on modern HTTP/2
  • WebSocket upgrade — gateway handles the upgrade handshake

Gateway Comparison

Feature Matrix

| Feature | Kong | AWS API GW | Tyk | Traefik | Envoy/Istio |
| --- | --- | --- | --- | --- | --- |
| Type | Full API gateway | Managed service | Full API gateway | Reverse proxy + API | Service proxy |
| Deployment | Self-hosted / Cloud | AWS only | Self-hosted / Cloud | Self-hosted | Self-hosted (mesh) |
| Language | Lua + Nginx (OpenResty) | Managed | Go | Go | C++ |
| Plugin system | Lua, Go, Python, JS | Lambda authorizers | Go, Python, JS, gRPC | Middleware (Go) | Wasm, Lua, C++ |
| Admin API | REST API + declarative YAML | Console + CloudFormation | REST API + Dashboard | File / Kubernetes CRD | xDS API |
| Rate limiting | Built-in (Redis) | Built-in (per stage) | Built-in (Redis) | Via middleware | Built-in |
| Auth (JWT) | Built-in | Cognito authorizer | Built-in | Via middleware | Built-in |
| gRPC support | Yes | Yes (HTTP/2) | Yes | Yes | Native |
| WebSocket | Yes | Yes | Yes | Yes | Yes |
| Kubernetes native | Kong Ingress Controller | N/A | Tyk Operator | Ingress/CRD native | Istio/Envoy sidecar |
| Cost | Free (OSS) / Enterprise | Pay-per-request | Free (OSS) / Enterprise | Free (OSS) / Enterprise | Free (OSS) |

When to Choose Each

Kong

yaml
# kong.yml — declarative configuration
_format_version: "3.0"

services:
  - name: user-service
    url: http://users.internal:8080
    connect_timeout: 5000
    read_timeout: 30000
    routes:
      - name: users-api
        paths:
          - /api/v1/users
        methods:
          - GET
          - POST
          - PUT
        strip_path: false
    plugins:
      - name: jwt
      - name: rate-limiting
        config:
          minute: 60
          policy: redis
          redis_host: redis
      - name: prometheus
      - name: cors
        config:
          origins:
            - "https://app.example.com"
          methods:
            - GET
            - POST
          headers:
            - Authorization
            - Content-Type

Traefik (Kubernetes)

yaml
# Traefik IngressRoute (CRD)
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: api-routes
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`api.example.com`) && PathPrefix(`/users`)
      kind: Rule
      services:
        - name: user-service
          port: 8080
      middlewares:
        - name: rate-limit
        - name: jwt-auth
        - name: strip-prefix

    - match: Host(`api.example.com`) && PathPrefix(`/orders`)
      kind: Rule
      services:
        - name: order-service
          port: 8080
      middlewares:
        - name: rate-limit
        - name: jwt-auth

---
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: rate-limit
spec:
  rateLimit:
    average: 100
    burst: 200
    period: 1m

---
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: jwt-auth
spec:
  forwardAuth:
    address: http://auth-service:8080/verify
    trustForwardHeader: true
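    authResponseHeaders:
      - X-User-Id
      - X-User-Role

---
# Definition for the strip-prefix middleware referenced by the /users route
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: strip-prefix
spec:
  stripPrefix:
    prefixes:
      - /users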

API Gateway vs Service Mesh

Choosing between an API gateway and a service mesh is a common source of confusion. The two solve overlapping but different problems:

| Aspect | API Gateway | Service Mesh |
| --- | --- | --- |
| Traffic direction | North-south (external → internal) | East-west (service → service) |
| Primary user | External clients, partners | Internal microservices |
| Authentication | API keys, JWT, OAuth | mTLS (mutual TLS) |
| Rate limiting | Per-client, per-API key | Per-service, per-endpoint |
| Routing | URL path → service | Service name → instances |
| Deployment | Centralized proxy | Distributed sidecar per pod |
| Examples | Kong, Tyk, AWS API GW | Istio, Linkerd, Consul Connect |
| Overhead | Single hop | Per-hop sidecar (~1ms per hop) |

TIP

You often need both. Use an API gateway for external traffic (authentication, rate limiting, API management) and a service mesh for internal traffic (mTLS, observability, traffic shaping between services). They are complementary, not competing.


Common Anti-Patterns

1. Business Logic in the Gateway

WRONG: Gateway computes order totals, validates business rules
RIGHT: Gateway authenticates, rate limits, routes — services own business logic

The gateway should be a thin infrastructure layer. Putting business logic there creates a coupling bottleneck and makes the gateway a deployment dependency for every team.

2. Gateway as the Only Load Balancer

The gateway should route to service endpoints, not individual instances. Let Kubernetes services, internal load balancers, or the service mesh handle instance-level load balancing.

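3. Unbounded Request Buffering

Without a cap on body size, the gateway buffers whatever a client sends, so a few oversized uploads can exhaust its memory or disk. Set an explicit limit: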

yaml
# Set reasonable limits
services:
  - name: upload-service
    plugins:
      - name: request-size-limiting
        config:
          allowed_payload_size: 10  # MB
          require_content_length: true

4. No Timeout Configuration

DANGER

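Many gateways default to timeouts of about 60 seconds (Kong's connect, read, and write timeouts each default to 60000 ms). If a backend hangs, the gateway holds the connection open for that whole time, consuming connections and memory. Always configure tighter timeouts: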

  • Connect timeout: 3-5 seconds
  • Read timeout: 10-30 seconds (adjust per endpoint)
  • Write timeout: 10-30 seconds
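
The effect of a read timeout can be sketched with `asyncio`. The backend here is simulated with a sleep, and `call_backend` and `proxy` are hypothetical names. A real gateway would apply the same bound around its upstream HTTP call.

```python
import asyncio


async def call_backend(delay: float) -> str:
    """Hypothetical backend call that takes `delay` seconds to answer."""
    await asyncio.sleep(delay)
    return "ok"


async def proxy(delay: float, read_timeout: float) -> tuple:
    """Forward a request, returning 504 if the backend exceeds the read timeout."""
    try:
        body = await asyncio.wait_for(call_backend(delay), timeout=read_timeout)
        return 200, body
    except asyncio.TimeoutError:
        # The pending call is cancelled, so the gateway stops holding resources for it.
        return 504, "upstream timed out"


if __name__ == "__main__":
    print(asyncio.run(proxy(delay=0.01, read_timeout=0.5)))
    print(asyncio.run(proxy(delay=0.5, read_timeout=0.05)))
```

Failing fast with a 504 frees the connection for other requests, instead of letting a hung backend tie up gateway capacity for a full minute.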

5. Single Gateway for All Environments

Development, staging, and production should not share a gateway. Configuration changes in staging could affect production routing.


Operational Best Practices

Health Checks

yaml
# Kong upstream health checks
upstreams:
  - name: user-service
    targets:
      - target: users-1:8080
        weight: 100
      - target: users-2:8080
        weight: 100
    healthchecks:
      active:
        http_path: /health
        healthy:
          interval: 5
          successes: 2
        unhealthy:
          interval: 5
          http_failures: 3
          timeouts: 3
      passive:
        healthy:
          successes: 5
        unhealthy:
          http_failures: 5
          timeouts: 3
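
The threshold logic behind settings such as `successes` and `http_failures` can be modelled as a small state machine. The sketch below is a simplified model rather than Kong's implementation. It assumes consecutive outcomes are counted and that any opposite outcome resets the streak. `TargetHealth` is a hypothetical name, and the outcomes could come from active probes or from proxied traffic.

```python
class TargetHealth:
    """Track one target's health from a stream of success/failure outcomes."""

    def __init__(self, unhealthy_failures: int = 3, healthy_successes: int = 2):
        self.unhealthy_failures = unhealthy_failures
        self.healthy_successes = healthy_successes
        self.healthy = True
        self.failures = 0   # consecutive failures while healthy
        self.successes = 0  # consecutive successes while unhealthy

    def record(self, ok: bool) -> bool:
        """Record one outcome and return the resulting health state."""
        if ok:
            self.failures = 0
            if not self.healthy:
                self.successes += 1
                if self.successes >= self.healthy_successes:
                    self.healthy, self.successes = True, 0
        else:
            self.successes = 0
            if self.healthy:
                self.failures += 1
                if self.failures >= self.unhealthy_failures:
                    self.healthy, self.failures = False, 0
        return self.healthy
```

Requiring several consecutive results in each direction keeps a single slow response or one lucky probe from flapping the target in and out of rotation.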

Key Metrics to Monitor

| Metric | Target | Alert On |
| --- | --- | --- |
| Request rate | Baseline | > 2x baseline (DDoS or traffic spike) |
| Error rate (5xx) | < 0.1% | > 1% |
| Latency P99 | < 200 ms (gateway overhead) | > 500 ms |
| Backend health | All healthy | Any backend unhealthy |
| Rate limit hits | < 5% of total | > 20% (legitimate clients being throttled) |
| Auth failures | < 1% | > 10% (credential leak, brute force) |
| Connection pool utilization | < 70% | > 90% |
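
As a small worked example, the sketch below evaluates two of these alert thresholds, the 5xx error rate and P99 latency, over a batch of request samples. The function names are hypothetical and the percentile uses the nearest-rank definition. A real deployment would normally compute these in its metrics system rather than in application code.

```python
def p99(latencies_ms: list) -> float:
    """Nearest-rank 99th percentile of a non-empty list of latencies."""
    if not latencies_ms:
        raise ValueError("no samples")
    ordered = sorted(latencies_ms)
    rank = -(-99 * len(ordered) // 100)  # ceil(0.99 * n) using integer arithmetic
    return ordered[rank - 1]


def evaluate(statuses: list, latencies_ms: list) -> list:
    """Return alert messages for the 5xx error rate and P99 latency thresholds."""
    alerts = []
    error_rate = sum(1 for status in statuses if status >= 500) / len(statuses)
    if error_rate > 0.01:
        alerts.append(f"5xx error rate {error_rate:.1%} exceeds 1%")
    latency = p99(latencies_ms)
    if latency > 500:
        alerts.append(f"P99 latency {latency} ms exceeds 500 ms")
    return alerts
```

P99 is used instead of the mean because a handful of very slow requests barely move an average but are exactly what users notice.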

Key Takeaways

  1. Start with a single gateway — add BFF pattern or two-tier only when the single gateway becomes a bottleneck for team velocity
  2. Keep the gateway thin — authentication, rate limiting, routing, and observability belong here; business logic does not
  3. Gateway vs service mesh — gateways handle north-south (external) traffic; service meshes handle east-west (internal) traffic; you often need both
  4. Kong for full API management — largest plugin ecosystem, mature, flexible
  5. Traefik for Kubernetes-native — auto-discovers services, CRD-based configuration, zero-config TLS with Let's Encrypt
  6. AWS API Gateway for serverless — no infrastructure to manage, pay per request, integrates with Lambda
  7. Always configure timeouts — default 60-second timeouts cause cascading failures under load
  8. Distributed rate limiting needs Redis — in-memory counters are inaccurate across multiple gateway instances
