Kubernetes Networking is Broken: A Deep Dive into CNI, eBPF, and Latency in 2025

Abstraction is a lie. Especially in Kubernetes networking. You define a Service, creating a virtual IP that doesn't exist on any interface, and expect magic. But when the magic fails—when packets drop between nodes or latency spikes unpredictably—you aren't dealing with YAML anymore. You are dealing with the Linux kernel, iptables, and the physical reality of cables.

I have spent the last three weeks debugging a cluster that was randomly terminating connections. The culprit wasn't the application code. It was a default CNI configuration clashing with the MTU of the underlying virtual switch. If you are running Kubernetes in production in 2025, you cannot afford to treat the network as a black box.

The CNI Battlefield: Why eBPF Won

Back in 2020, we argued about Flannel vs. Calico. That war is over. For high-performance clusters, especially those dealing with the strict data locality requirements we see here in Norway (thanks, Datatilsynet), Cilium with eBPF is the standard. Legacy iptables-based routing simply cannot handle the churn of ephemeral containers without eating up your CPU.

eBPF allows us to run sandboxed programs in the kernel context. Instead of a packet traversing the entire TCP/IP stack just to be forwarded, we can short-circuit it. This drops latency significantly.
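
If you want to see those hooks for yourself, bpftool will list what Cilium has attached. This assumes bpftool is installed on the node and you have root access:

# On the node: list eBPF programs attached to network interfaces (tc and XDP hooks)
sudo bpftool net show

# List loaded programs; Cilium's datapath shows up as sched_cls entries
sudo bpftool prog show | grep sched_cls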

Here is how you verify if your Cilium agent is actually utilizing eBPF routing or falling back to legacy modes:

kubectl -n kube-system exec -ti ds/cilium -- cilium status --verbose

You are looking for KubeProxyReplacement: True. If it says False, you are still burning CPU cycles on iptables rule management.

Configuring Cilium for Maximum Throughput

When deploying on CoolVDS, where we give you direct access to high-performance NVMe and unthrottled network interfaces, you want to ensure your CNI isn't the bottleneck. Use these Helm values to enable strict replacement of kube-proxy:

kubeProxyReplacement: "true"
k8sServiceHost: "API_SERVER_IP"
k8sServicePort: "6443"
l7Proxy: false # Only enable if you absolutely need L7 visibility; it adds latency
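
If you save that snippet as a values file, a standard Helm install picks it up. The file name below is just an example:

helm repo add cilium https://helm.cilium.io/
helm upgrade --install cilium cilium/cilium \
  --namespace kube-system \
  --values cilium-values.yaml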

The MTU Trap: A War Story

I recently audited a setup for a logistics company in Oslo. They were experiencing random timeouts connecting to their PostgreSQL database. The database was fine. The network was fine. The problem was fragmentation.

The underlying network interface had an MTU of 1500. Their overlay network (VXLAN) added 50 bytes of headers, which left only 1450 bytes for the inner payload. The pods, however, were still trying to push 1500-byte packets. The result? Fragmentation. Some firewalls silently drop fragments, and the connection hangs.

Pro Tip: Always set your CNI MTU 50-100 bytes lower than the host interface MTU unless you are using direct routing. On CoolVDS, we support Jumbo Frames (MTU 9000) on private networks. Use them. A 9000-byte MTU means roughly 6x fewer packets to process for large data transfers, which translates directly into lower CPU overhead.

Check your current pod MTU:

kubectl exec -it <pod-name> -- ip link show eth0
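
Then compare it against the host interface on the node itself. The interface name below is an example; substitute whatever your private network uses. Note that ip link changes do not survive a reboot, so persist the MTU in your network configuration as well:

# On the node: check the host interface MTU
ip link show eth1

# Enable jumbo frames on the private interface (runtime only, not persistent)
sudo ip link set dev eth1 mtu 9000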

Gateway API: The Ingress Killer

The Ingress resource was always ambiguous; half of its real behavior lived in controller-specific annotations. In 2025, the Gateway API has matured to GA (General Availability). It splits the role of the infrastructure provider (CoolVDS) from that of the application developer and gives us far more expressive routing rules.
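
In practice the split looks like this: the platform side publishes a Gateway, and application teams attach routes to it. Here is a minimal Gateway sketch; the gatewayClassName is a placeholder for whatever class your Gateway controller registers:

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: main-gateway
spec:
  gatewayClassName: example-gateway-class # placeholder: use the class your controller installs
  listeners:
  - name: http
    protocol: HTTP
    port: 80
    allowedRoutes:
      namespaces:
        from: Same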

Here is a robust HTTPRoute example that splits traffic between two versions of an app—essential for canary deployments without external tools:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: checkout-split
spec:
  parentRefs:
  - name: main-gateway
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /checkout
    backendRefs:
    - name: checkout-v1
      port: 8080
      weight: 90
    - name: checkout-v2
      port: 8080
      weight: 10
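
After applying the route, verify that the Gateway actually accepted it. The status conditions on the HTTPRoute tell you whether the parentRef resolved and the backends exist:

kubectl describe httproute checkout-split
kubectl get httproute checkout-split -o jsonpath='{.status.parents[*].conditions}'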

DNS: The Silent Performance Killer

Every time your code calls a database by hostname, it hits CoreDNS. In a cluster with 5,000 requests per second, a default CoreDNS config is a bottleneck. We often see DNS latency accounting for 30% of total request time in unoptimized clusters.

Optimize your Corefile. Force TCP for reliability if you see UDP drops, and use the autopath plugin to cut down on search-path query volume (note that autopath requires the kubernetes plugin to run with pods verified rather than pods insecure).

.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa {
       pods insecure
       fallthrough in-addr.arpa ip6.arpa
       ttl 30
    }
    prometheus :9153
    forward . /etc/resolv.conf {
       max_concurrent 1000
    }
    cache 30
    loop
    reload
    loadbalance
}
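
A quick way to sanity-check resolution from inside the cluster is a throwaway pod; the busybox image here is just an example of something small that ships nslookup:

kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 -- \
  nslookup kubernetes.default.svc.cluster.local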

The CoolVDS Advantage: Peering and Latency

You can tune Kubernetes all day, but if the underlying metal is garbage, your latency will be garbage. In Norway, physics matters. If your data center routes traffic through Frankfurt to get from Bergen to Oslo, you are adding 30ms of unnecessary lag.

CoolVDS infrastructure is peered directly at NIX (Norwegian Internet Exchange). When you run a Kubernetes node on our VDS, your packets take the shortest path. We don't oversubscribe our network uplinks. If you pay for 1Gbps, you get 1Gbps.

Benchmarking Network Throughput

Don't take my word for it. Run iperf3 between two pods on different nodes. On a properly configured CoolVDS instance using the config above, you should see near line-rate throughput with minimal jitter.

# Server Pod
iperf3 -s

# Client Pod
iperf3 -c <server-pod-ip> -t 30 -P 4
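
If your application images don't ship iperf3, throwaway pods work too. The image below is an example, not a recommendation, and you may need node affinity to force the two pods onto different nodes:

# Server pod
kubectl run iperf-server --image=networkstatic/iperf3 --restart=Never --command -- iperf3 -s

# Grab its pod IP
kubectl get pod iperf-server -o jsonpath='{.status.podIP}'

# Client pod, pointed at the server pod IP from the previous command
kubectl run iperf-client --rm -it --restart=Never --image=networkstatic/iperf3 --command -- \
  iperf3 -c <server-pod-ip> -t 30 -P 4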

Security and Compliance (GDPR)

With the strict interpretation of GDPR and Schrems II, ensuring traffic doesn't accidentally leave the EEA is critical. NetworkPolicies are your firewall. Deny everything by default.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress

Then, allow-list only what is necessary. This isn't just security; it's compliance documentation. When an auditor asks how you ensure service isolation, you show them the YAML.
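
Here is what the allow-list side can look like, using the PostgreSQL scenario from earlier. The labels are assumptions for illustration; use whatever your workloads actually carry. And remember: once you deny egress by default, you must also explicitly allow DNS traffic to kube-dns, or every hostname lookup in the namespace will fail.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-app-to-postgres
spec:
  podSelector:
    matchLabels:
      app: postgres # assumption: your database pods carry this label
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: checkout # assumption: the client workload's label
    ports:
    - protocol: TCP
      port: 5432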

Conclusion

Kubernetes networking in 2025 is powerful but unforgiving. You need to understand the stack from the CNI eBPF hooks down to the physical switch ports. Don't let default configurations kill your performance.

If you are tired of debugging network flakes on oversold hardware, it is time to move. Deploy a test instance on CoolVDS today. Experience what low-latency, NVMe-backed Kubernetes actually feels like.