
CNI Plugins & Networking

The Container Network Interface (CNI) is the contract between Kubernetes and the networking layer. When the kubelet creates a pod, the container runtime calls the CNI plugin to set up networking: assigning an IP address, configuring routes, and connecting the pod to the cluster network. The choice of CNI plugin determines your cluster's networking performance, security capabilities, observability, and operational complexity.

This is not a neutral choice. Calico and Cilium dominate production deployments, but they make fundamentally different architectural decisions: Calico grew from traditional networking (BGP, iptables) and added eBPF as an option; Cilium was built eBPF-first from the ground up. Understanding these differences matters because migrating CNI plugins on a running cluster is one of the most disruptive operations in Kubernetes.


The CNI Specification

How CNI Works

The CNI spec defines a simple interface: the container runtime calls a binary with environment variables describing the operation (ADD, DEL, CHECK) and passes a JSON config via stdin. The plugin returns the network configuration (IP address, routes, DNS) via stdout.
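
To make the calling convention concrete, here is a minimal sketch of a plugin's core logic in Python. It is not a working plugin: the function name `handle_cni`, the hard-coded address, and the example environment values are all hypothetical, and a real plugin would create interfaces inside the network namespace and delegate address allocation to IPAM. The environment variable names (`CNI_COMMAND`, `CNI_NETNS`, `CNI_IFNAME`, `CNI_CONTAINERID`) and the result fields follow the CNI specification.

```python
import json


def handle_cni(env, stdin_text):
    """Handle one CNI invocation.

    Returns the result dict to print on stdout, or None when the
    operation reports success with no output (DEL and CHECK).
    """
    command = env["CNI_COMMAND"]
    conf = json.loads(stdin_text)  # network config passed on stdin

    if command == "ADD":
        # Hypothetical values: a real plugin would create the interface
        # in env["CNI_NETNS"] and obtain the address from an IPAM plugin.
        return {
            "cniVersion": conf["cniVersion"],
            "interfaces": [
                {"name": env["CNI_IFNAME"], "sandbox": env["CNI_NETNS"]}
            ],
            "ips": [
                {"address": "10.244.1.5/24", "gateway": "10.244.1.1", "interface": 0}
            ],
            "routes": [{"dst": "0.0.0.0/0"}],
            "dns": {},
        }
    if command in ("DEL", "CHECK"):
        return None
    raise ValueError(f"unsupported CNI_COMMAND: {command}")


# Example invocation with made-up values standing in for the runtime.
example_env = {
    "CNI_COMMAND": "ADD",
    "CNI_CONTAINERID": "abc123",
    "CNI_NETNS": "/var/run/netns/example",
    "CNI_IFNAME": "eth0",
}
example_conf = json.dumps(
    {"cniVersion": "1.0.0", "name": "k8s-cluster-network", "type": "sketch"}
)
print(json.dumps(handle_cni(example_env, example_conf), indent=2))
```

Keeping the logic in a function that takes the environment and stdin as arguments makes it easy to exercise without a container runtime; a real binary would wrap it with `os.environ` and `sys.stdin`.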

CNI Plugin Chain

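CNI supports plugin chaining: multiple plugins execute in sequence, each receiving the previous plugin's result. A typical configuration involves three kinds of plugin:

  1. Main plugin (Calico, Cilium, Flannel): creates the pod interface and sets up routes
  2. IPAM plugin (host-local, calico-ipam, whereabouts): allocates the IP address; the main plugin delegates to it through its "ipam" section rather than running it as a separate chain entry
  3. Meta plugins (bandwidth, portmap, tuning): chained after the main plugin for bandwidth limiting, port mapping, and sysctl tuning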
json
{
  "cniVersion": "1.0.0",
  "name": "k8s-cluster-network",
  "plugins": [
    {
      "type": "calico",
      "datastore_type": "kubernetes",
      "ipam": {
        "type": "calico-ipam",
        "assign_ipv4": "true",
        "ipv4_pools": ["10.244.0.0/16"]
      }
    },
    {
      "type": "bandwidth",
      "capabilities": { "bandwidth": true }
    },
    {
      "type": "portmap",
      "capabilities": { "portMappings": true }
    }
  ]
}
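
The way a runtime walks such a chain can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the real libcni code: plugins are modelled as Python callables rather than binaries, the names `run_chain`, `fake_calico`, and `passthrough` are hypothetical, and the cached result that a runtime supplies on DEL is omitted. What it does show is the ordering from the spec: ADD runs the list front to back, threading each result into the next plugin as `prevResult`, while DEL runs it in reverse.

```python
import copy


def run_chain(conflist, plugins, command="ADD"):
    """Run every plugin in a conflist; `plugins` maps type -> callable."""
    entries = conflist["plugins"]
    if command == "DEL":
        entries = list(reversed(entries))  # teardown runs in reverse order

    prev_result = None
    for entry in entries:
        conf = copy.deepcopy(entry)
        # The runtime injects the list-level fields into each plugin config.
        conf["cniVersion"] = conflist["cniVersion"]
        conf["name"] = conflist["name"]
        if prev_result is not None:
            conf["prevResult"] = prev_result
        result = plugins[conf["type"]](command, conf)
        if command == "ADD":
            prev_result = result
    return prev_result


calls = []  # records invocation order for the demo


def fake_calico(command, conf):
    calls.append("calico")
    # Hypothetical address; a real main plugin would ask its IPAM plugin.
    return {"cniVersion": conf["cniVersion"], "ips": [{"address": "10.244.1.5/32"}]}


def passthrough(name):
    def plugin(command, conf):
        calls.append(name)
        return conf.get("prevResult")  # meta plugins hand the result onward
    return plugin


CONFLIST = {
    "cniVersion": "1.0.0",
    "name": "k8s-cluster-network",
    "plugins": [{"type": "calico"}, {"type": "bandwidth"}, {"type": "portmap"}],
}
PLUGINS = {
    "calico": fake_calico,
    "bandwidth": passthrough("bandwidth"),
    "portmap": passthrough("portmap"),
}

print(run_chain(CONFLIST, PLUGINS), calls)
```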

Calico

Calico is one of the most widely deployed CNI plugins. It started as a pure L3 networking solution using BGP to distribute routes between nodes, avoiding overlay network overhead. It has since added VXLAN overlay support and an eBPF dataplane.

Architecture

Components:

| Component | Role |
|---|---|
| Felix | Per-node agent. Reads policy from the datastore, programs iptables/eBPF rules, manages routes |
| BIRD | BGP daemon. Distributes pod routes to other nodes and external routers |
| Typha | Fan-out proxy. Reduces API server load by multiplexing watch connections (essential at 100+ nodes) |
| confd | Generates BIRD configuration from the Calico datastore |
| calico-kube-controllers | Syncs Kubernetes NetworkPolicy to Calico's datastore |

BGP Mode (No Overlay)

In BGP mode, Calico distributes pod CIDR routes directly between nodes using BGP. Each node advertises its pod CIDR to peers. Traffic flows at L3 without encapsulation — maximum performance, but requires the underlying network to support it.

yaml
# Calico BGP configuration
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  logSeverityScreen: Info
  nodeToNodeMeshEnabled: true   # Full mesh for <100 nodes
  asNumber: 64512
  listenPort: 179

---
# For 100+ nodes, disable mesh and use route reflectors
apiVersion: projectcalico.org/v3
kind: BGPConfiguration
metadata:
  name: default
spec:
  nodeToNodeMeshEnabled: false  # Disable full mesh

---
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: route-reflector
spec:
  peerIP: 10.0.0.100
  asNumber: 64512
  nodeSelector: all()
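
The reason for switching away from the full mesh is simple arithmetic, sketched below. The model is an assumption for illustration: reflectors are taken to be cluster nodes that mesh with each other, and every remaining node peers with every reflector. The function names are hypothetical.

```python
def full_mesh_sessions(nodes):
    """Every node peers with every other node: n * (n - 1) / 2 sessions."""
    return nodes * (nodes - 1) // 2


def route_reflector_sessions(nodes, reflectors):
    """Clients peer with each reflector; reflectors mesh among themselves."""
    clients = nodes - reflectors
    return clients * reflectors + full_mesh_sessions(reflectors)


for n in (10, 100, 500):
    print(n, full_mesh_sessions(n), route_reflector_sessions(n, 3))
```

The full-mesh count grows quadratically with node count, while the reflector topology grows linearly, which is why the mesh is usually reserved for small clusters.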

TIP

BGP mode delivers the best performance (no encapsulation overhead) but requires the network fabric either to peer over BGP or to route pod IPs between nodes. Most cloud VPCs do not do this out of the box: with default AWS and Azure setups, unencapsulated pod traffic is generally not delivered across the fabric. For cloud, use VXLAN overlay unless you have confirmed with your network team that pod routes will be carried.

eBPF Dataplane

Calico's eBPF dataplane replaces iptables and kube-proxy with eBPF programs. Benefits: faster packet processing, no kube-proxy needed, direct server return (DSR) for load balancing.

yaml
# Enable eBPF dataplane on Calico
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    linuxDataplane: BPF
    bgp: Enabled
    ipPools:
      - blockSize: 26
        cidr: 10.244.0.0/16
        encapsulation: None
        natOutgoing: Enabled
bash
# Disable kube-proxy (eBPF replaces it)
kubectl patch ds -n kube-system kube-proxy -p \
  '{"spec": {"template": {"spec": {"nodeSelector": {"non-calico": "true"}}}}}'

Cilium

Cilium was designed from the ground up around eBPF. It aims to avoid iptables entirely: routing, load balancing, network policy, and encryption are implemented as eBPF programs attached to network interfaces, although some configurations (for example, running without kube-proxy replacement) still install iptables rules.

Architecture

Key differentiators:

| Feature | How Cilium Implements It |
|---|---|
| Pod networking | eBPF programs on veth pairs, no bridge/iptables |
| Service load balancing | eBPF replaces kube-proxy, supports Maglev hashing |
| Network policy | eBPF map lookups (O(1)), includes L7 policies |
| Encryption | WireGuard or IPsec, transparently in eBPF |
| Observability | Hubble: eBPF-based flow visibility without packet capture |
| Service mesh | Sidecar-free mesh using eBPF (no Envoy sidecar per pod) |

Cilium Installation

yaml
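# Flux HelmRelease carrying Cilium Helm values for production
apiVersion: helm.toolkit.fluxcd.io/v2beta1   # newer Flux releases serve this API as v2
kind: HelmRelease
metadata:
  name: cilium
  namespace: kube-system
spec:
  interval: 10m                   # required by Flux; the value here is an example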
  chart:
    spec:
      chart: cilium
      version: "1.16.x"
      sourceRef:
        kind: HelmRepository
        name: cilium
  values:
    kubeProxyReplacement: true    # Replace kube-proxy entirely
    k8sServiceHost: "api.cluster.local"
    k8sServicePort: 6443

    ipam:
      mode: kubernetes            # Use Kubernetes IPAM

    bpf:
      masquerade: true            # eBPF-based masquerading
      hostLegacyRouting: false

    hubble:
      enabled: true
      relay:
        enabled: true
      ui:
        enabled: true

    encryption:
      enabled: true
      type: wireguard             # Transparent pod-to-pod encryption

    loadBalancer:
      algorithm: maglev           # Consistent hashing for better distribution

    bandwidthManager:
      enabled: true               # eBPF bandwidth management
      bbr: true                   # BBR congestion control
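
For readers unfamiliar with the `maglev` setting, the following is a small Python sketch of Maglev-style table population, based on the published algorithm rather than on Cilium's eBPF implementation. The hash construction, the table size of 251, the backend addresses, and the function names are all assumptions chosen for illustration. Each backend gets its own permutation of table slots, and backends take turns claiming their next free slot until the table is full.

```python
import hashlib
from typing import List, Optional


def _hash(value: str, salt: str) -> int:
    digest = hashlib.sha256((salt + ":" + value).encode()).digest()
    return int.from_bytes(digest[:8], "big")


def maglev_table(backends: List[str], size: int = 251) -> List[str]:
    """Build a lookup table; `size` must be prime and exceed len(backends)."""
    if not backends or size <= len(backends):
        raise ValueError("need at least one backend and a larger table")
    if any(size % d == 0 for d in range(2, int(size ** 0.5) + 1)):
        raise ValueError("table size must be prime")

    offsets = [_hash(b, "offset") % size for b in backends]
    skips = [_hash(b, "skip") % (size - 1) + 1 for b in backends]
    cursor = [0] * len(backends)  # position in each backend's permutation
    table: List[Optional[str]] = [None] * size
    filled = 0

    while True:
        for i, backend in enumerate(backends):
            # Advance along backend i's permutation to its next free slot.
            while True:
                slot = (offsets[i] + cursor[i] * skips[i]) % size
                cursor[i] += 1
                if table[slot] is None:
                    break
            table[slot] = backend
            filled += 1
            if filled == size:
                return table  # every slot is assigned


def pick_backend(table: List[str], flow_key: str) -> str:
    """Map a flow (for example its 5-tuple rendered as text) to a backend."""
    return table[_hash(flow_key, "flow") % len(table)]


backends = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]
table = maglev_table(backends)
print({b: table.count(b) for b in backends})
print(pick_backend(table, "10.244.3.7:51512->10.96.0.20:443/TCP"))
```

Because backends claim slots in turns, their shares of the table differ by at most one slot, and because each backend's permutation depends only on its own name, removing one backend leaves most of the remaining assignments in place.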

Cilium Network Policy (L7)

Cilium extends standard NetworkPolicy with L7 (application layer) enforcement. Few CNI plugins offer this natively: you can write policies based on HTTP methods, paths, headers, gRPC services, and Kafka topics.

yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: api-l7-policy
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: api-server
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: GET
                path: "/api/v1/products.*"
              - method: POST
                path: "/api/v1/orders"
                headers:
                  - 'Content-Type: application/json'
              - method: GET
                path: "/healthz"
    - fromEndpoints:
        - matchLabels:
            app: grpc-client
      toPorts:
        - ports:
            - port: "9090"
              protocol: TCP
          rules:
            http:                 # gRPC uses HTTP/2
              - method: POST
                path: "/com.example.OrderService/.*"
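
The matching logic behind the first ingress rule can be modelled in Python as follows. This is a sketch of the semantics, not Cilium's proxy code: it assumes that method and path are regular expressions that must match the whole value, and that a listed header must be present with exactly that value. The function name `http_allowed` and the rule representation are hypothetical.

```python
import re

# The three HTTP rules from the frontend -> api-server ingress block.
RULES = [
    {"method": "GET", "path": "/api/v1/products.*"},
    {
        "method": "POST",
        "path": "/api/v1/orders",
        "headers": {"Content-Type": "application/json"},
    },
    {"method": "GET", "path": "/healthz"},
]


def http_allowed(method, path, headers=None, rules=RULES):
    """Return True if any rule matches the request (allow-list semantics)."""
    # Header names are case-insensitive, so normalise them before comparing.
    seen = {k.lower(): v for k, v in (headers or {}).items()}
    for rule in rules:
        if not re.fullmatch(rule["method"], method):
            continue
        if not re.fullmatch(rule["path"], path):
            continue
        required = rule.get("headers", {})
        if all(seen.get(k.lower()) == v for k, v in required.items()):
            return True
    return False  # nothing matched: the request is rejected


print(http_allowed("GET", "/api/v1/products/42"))
print(http_allowed("DELETE", "/api/v1/products/42"))
print(http_allowed("POST", "/api/v1/orders", {"content-type": "application/json"}))
```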

Hubble Observability

Hubble provides network-level observability without packet capture tools. It uses eBPF to trace every packet decision — allows, drops, forwarding — and exposes them as structured events.

bash
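# Enable Hubble in the cluster (uses the cilium CLI; the hubble CLI is a separate install)
cilium hubble enable

# Forward the Hubble Relay port so the hubble CLI can connect
cilium hubble port-forward &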

# Watch all traffic in a namespace
hubble observe --namespace production

# Watch only dropped traffic (policy violations)
hubble observe --namespace production --verdict DROPPED

# Filter by source and destination
hubble observe --from-pod production/frontend --to-pod production/api-server

# Export as JSON for log aggregation
hubble observe --namespace production --output json | \
  jq '{src: .flow.source.pod_name, dst: .flow.destination.pod_name, verdict: .flow.verdict}'
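
Once flows are exported as JSON lines, a short script can aggregate them. The sketch below counts dropped flows per source and destination pod, using the same fields the jq filter reads (`.flow.source.pod_name`, `.flow.destination.pod_name`, `.flow.verdict`). The sample records are made up and trimmed to those fields; real Hubble events carry many more. The function name `count_drops` is hypothetical.

```python
import json
from collections import Counter


def count_drops(lines):
    """Count DROPPED flows per (source pod, destination pod) pair."""
    drops = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        flow = json.loads(line).get("flow", {})
        if flow.get("verdict") != "DROPPED":
            continue
        src = flow.get("source", {}).get("pod_name", "unknown")
        dst = flow.get("destination", {}).get("pod_name", "unknown")
        drops[(src, dst)] += 1
    return drops


# Made-up sample records in the shape the jq filter expects.
sample = [
    '{"flow": {"verdict": "DROPPED", "source": {"pod_name": "frontend-1"}, "destination": {"pod_name": "db-0"}}}',
    '{"flow": {"verdict": "FORWARDED", "source": {"pod_name": "frontend-1"}, "destination": {"pod_name": "api-server-0"}}}',
    '{"flow": {"verdict": "DROPPED", "source": {"pod_name": "frontend-1"}, "destination": {"pod_name": "db-0"}}}',
]

for (src, dst), n in count_drops(sample).most_common():
    print(f"{src} -> {dst}: {n} dropped")
```

In practice the input would be the output of `hubble observe --output json` piped to the script on stdin.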

Flannel and Weave

Flannel

Flannel is the simplest CNI plugin. It creates an overlay network using VXLAN (or host-gw for same-subnet nodes) and assigns each node a /24 subnet from a larger /16 CIDR.

Flannel does NOT support NetworkPolicy. If you need policy enforcement with Flannel, deploy Calico alongside it (known as "Canal").

yaml
# Flannel ConfigMap
net-conf.json: |
  {
    "Network": "10.244.0.0/16",
    "Backend": {
      "Type": "vxlan",
      "VNI": 1,
      "DirectRouting": true
    }
  }
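
The per-node subnet scheme puts a hard ceiling on cluster size, which a few lines of Python make visible. This sketch uses only the standard `ipaddress` module; the function name `node_subnets` is hypothetical, and the /24 per node is the split described above.

```python
import ipaddress


def node_subnets(network="10.244.0.0/16", node_prefix=24):
    """Split the cluster network into one subnet per node."""
    return list(ipaddress.ip_network(network).subnets(new_prefix=node_prefix))


subnets = node_subnets()
usable_per_node = subnets[0].num_addresses - 2  # minus network and broadcast
print(f"{len(subnets)} node subnets, first {subnets[0]}, last {subnets[-1]}")
print(f"{usable_per_node} usable pod addresses per node")
```

With a /16 network and /24 node subnets, the address plan runs out at 256 nodes, so larger clusters need a wider network or smaller per-node subnets.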

Weave

Weave creates a mesh network between nodes using a custom encapsulation protocol. It supports NetworkPolicy, automatic encryption, and multicast.

bash
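# Install Weave (legacy hosted endpoint; it may no longer be reachable, in which
# case use the manifest attached to a Weave Net GitHub release instead)
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"

Weave is simpler to operate than Calico or Cilium but has lower performance ceilings and fewer features. Weaveworks shut down in 2024 and Weave Net is no longer actively maintained, so treat it as a legacy option rather than a choice for new clusters.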


Network Policy Implementation Differences

The same NetworkPolicy YAML produces different behavior depending on the CNI plugin, because enforcement happens at different layers.

| Feature | Calico (iptables) | Calico (eBPF) | Cilium | Flannel | Weave |
|---|---|---|---|---|---|
| Standard NetworkPolicy | Full | Full | Full | None | Full |
| L7 policies (HTTP/gRPC) | No (use CRDs) | No | Yes (native) | No | No |
| Global policies | Yes (GlobalNetworkPolicy) | Yes | Yes (CiliumClusterwideNetworkPolicy) | No | No |
| DNS-based policies | Yes (Calico CRD) | Yes | Yes (CiliumNetworkPolicy) | No | No |
| Policy enforcement | iptables FILTER chain | eBPF tc hooks | eBPF tc hooks | N/A | iptables |
| Policy lookup complexity | O(n) chain walk | O(1) map lookup | O(1) map lookup | N/A | O(n) chain walk |
| FQDN-based egress | Yes (NetworkSet) | Yes | Yes (toFQDNs) | No | No |

DNS-Based Egress Policy (Cilium)

Standard NetworkPolicy selects peers only by pod labels, namespace labels, or IP blocks; it cannot match DNS names. Cilium adds FQDN-based egress, which is essential for allowing traffic to external services like api.stripe.com without hardcoding IP addresses:

yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-external-apis
  namespace: production
spec:
  endpointSelector:
    matchLabels:
      app: payment-service
  egress:
    - toFQDNs:
        - matchName: "api.stripe.com"
        - matchName: "api.paypal.com"
        - matchPattern: "*.amazonaws.com"
      toPorts:
        - ports:
            - port: "443"
              protocol: TCP
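    - toEndpoints:
        - matchLabels:
            io.kubernetes.pod.namespace: kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: ANY         # DNS falls back to TCP for large responses
          rules:
            dns:
              - matchPattern: "*"   # lets Cilium's DNS proxy observe lookups so toFQDNs can map names to IPs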
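
The difference between `matchName` and `matchPattern` is easy to get wrong, so here is a simplified Python model of the matching. It reflects one reading of the documented semantics, namely that `*` stands for zero or more valid DNS characters but does not cross a dot; treat that as an assumption and verify it against the Cilium version you run. The function names are hypothetical.

```python
import re

DNS_CHARS = "[-a-z0-9_]*"  # assumed expansion of "*": no dots


def pattern_to_regex(pattern):
    """Translate a matchPattern into a compiled regular expression."""
    parts = [re.escape(p) for p in pattern.lower().split("*")]
    return re.compile(DNS_CHARS.join(parts))


def fqdn_allowed(name, match_names=(), match_patterns=()):
    """Return True if the DNS name matches any exact name or pattern."""
    name = name.lower().rstrip(".")  # ignore case and a trailing root dot
    if name in (n.lower() for n in match_names):
        return True
    return any(pattern_to_regex(p).fullmatch(name) for p in match_patterns)


NAMES = ["api.stripe.com", "api.paypal.com"]
PATTERNS = ["*.amazonaws.com"]

for candidate in (
    "api.stripe.com",
    "s3.amazonaws.com",
    "s3.us-east-1.amazonaws.com",
    "amazonaws.com",
):
    print(candidate, fqdn_allowed(candidate, NAMES, PATTERNS))
```

Under this reading, `*.amazonaws.com` covers only one label in front of `amazonaws.com`, so regional endpoints with deeper names would need their own patterns.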

Performance Comparison

Throughput Benchmarks

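| CNI Plugin | Mode | TCP Throughput (Gbps) | Latency p50 (µs) | Latency p99 (µs) | CPU Overhead |
|---|---|---|---|---|---|
| Calico | BGP (no overlay) | 9.2 | 42 | 98 | Low |
| Calico | VXLAN | 7.8 | 68 | 180 | Medium |
| Calico | eBPF | 9.4 | 38 | 82 | Low |
| Cilium | eBPF (native routing) | 9.5 | 35 | 75 | Low |
| Cilium | VXLAN | 8.1 | 62 | 160 | Medium |
| Flannel | VXLAN | 7.2 | 75 | 210 | Medium |
| Weave | Weave overlay | 5.8 | 120 | 350 | High |
| Host networking | No CNI | 9.8 | 28 | 55 | None |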

Benchmarks were run with iperf3 on a 2-node cluster with 25 Gbps NICs, using MTU 9000 where supported.

Scaling Characteristics

| Scale | iptables (rules) | iptables (latency) | eBPF (maps) | eBPF (latency) |
|---|---|---|---|---|
| 100 pods | ~2,000 rules | +0.1ms | ~100 entries | +0.02ms |
| 500 pods | ~10,000 rules | +0.5ms | ~500 entries | +0.02ms |
| 2,000 pods | ~40,000 rules | +2ms | ~2,000 entries | +0.02ms |
| 10,000 pods | ~200,000 rules | +8ms | ~10,000 entries | +0.03ms |

WARNING

Once a cluster reaches 500+ pods with complex network policies, iptables-based CNIs show measurable latency degradation, as the table above illustrates. This is the primary technical reason production clusters at scale migrate to Cilium or Calico eBPF mode.
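
The gap between a chain walk and a map lookup can be shown with a toy model in Python. It is deliberately simplified: real iptables rulesets use jumps and ipsets, and real eBPF policy maps are keyed differently, so the rule layout and the function names here are assumptions. The point is only how the number of comparisons scales with rule count.

```python
def build_rules(n):
    """Generate n allow rules as (source IP, destination IP, port, verdict)."""
    return [
        (f"10.244.{i // 256}.{i % 256}", "10.244.255.1", 8080, "ALLOW")
        for i in range(n)
    ]


def chain_walk(rules, packet):
    """iptables-style evaluation: test rules in order until one matches."""
    for comparisons, (src, dst, port, verdict) in enumerate(rules, start=1):
        if (src, dst, port) == packet:
            return verdict, comparisons
    return "DROP", len(rules)


def map_lookup(policy_map, packet):
    """eBPF-style evaluation: one hash lookup regardless of map size."""
    return policy_map.get(packet, "DROP"), 1


for n in (2_000, 10_000, 40_000):
    rules = build_rules(n)
    policy_map = {(s, d, p): v for s, d, p, v in rules}
    worst_case = rules[-1][:3]  # packet matching the last rule in the chain
    print(n, chain_walk(rules, worst_case), map_lookup(policy_map, worst_case))
```

Both approaches return the same verdict; only the work per packet differs, and it is the per-packet work that shows up as added latency.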


Decision Framework

Q: What is your cluster scale?
├── < 200 pods → Any CNI works; Calico is the safe default
├── 200-2000 pods → Calico or Cilium; consider eBPF mode
└── 2000+ pods → Cilium or Calico eBPF strongly recommended

Q: Do you need L7 network policies (HTTP path/method filtering)?
├── Yes → Cilium (native L7) or service mesh
└── No → Any policy-supporting CNI

Q: Do you need NetworkPolicy enforcement?
├── Yes → Calico, Cilium, or Weave (NOT Flannel alone)
└── No → Flannel is the simplest option

Q: Does your network support BGP?
├── Yes → Calico BGP mode (best performance, no overlay)
└── No → VXLAN overlay (Calico, Cilium, or Flannel)

Q: Do you need pod-to-pod encryption?
├── Yes → Cilium (WireGuard), Calico (WireGuard), or Weave (built-in)
└── No → Any CNI

Q: Managed Kubernetes (EKS, GKE, AKS)?
├── EKS → VPC CNI (default) + Calico for policy, or replace with Cilium
├── GKE → Dataplane V2 (Cilium-based): default on Autopilot, opt-in on Standard clusters
└── AKS → Azure CNI (default) + Calico for policy, or Cilium
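
The first three questions can be combined into a small filter, sketched here in Python. It encodes only the scale, NetworkPolicy, and L7 branches exactly as listed above; the BGP, encryption, and managed-service questions are left out, and the function name and return shape are hypothetical.

```python
def recommend_cni(pods, needs_policy=True, needs_l7=False):
    """Narrow the candidate list using the scale, policy, and L7 questions."""
    candidates = ["Calico", "Cilium", "Flannel", "Weave"]
    if needs_policy:
        candidates.remove("Flannel")  # Flannel alone cannot enforce policy
    if needs_l7:
        candidates = [c for c in candidates if c == "Cilium"]
    if pods >= 200:
        candidates = [c for c in candidates if c in ("Calico", "Cilium")]
    return {"candidates": candidates, "prefer_ebpf": pods >= 2000}


print(recommend_cni(50, needs_policy=False))
print(recommend_cni(500))
print(recommend_cni(5000, needs_l7=True))
```

A filter like this is a starting point for discussion, not a substitute for benchmarking the shortlisted plugins on your own network.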
