Kubernetes Networking Autopsy: Debugging Latency & CNI Performance in 2025

If you have ever spent a Friday night staring at kubectl get pods watching a service flap because of a mysterious timeout, you know the truth: Kubernetes networking is where simple architectures go to die. We abstract away the hardware, but packets still have to travel down the wire. In 2025, we have Gateway API, eBPF everywhere, and service meshes that promise the moon, yet I still see engineering teams in Oslo struggling with basic pod-to-pod latency.

Most VPS providers won't tell you this, but your "managed" Kubernetes cluster is often choking on noisy neighbor I/O before the packet even hits the virtual interface. Today, we are cutting through the marketing fluff. We are going to look at what actually happens to a packet in a high-load cluster, how to tune your kernel for massive concurrency, and why running this on CoolVDS infrastructure in Norway makes a measurable difference to your TTI (Time to Interactive).

The Hidden Tax of Encapsulation: VXLAN vs. Direct Routing

By default, many CNI (Container Network Interface) plugins rely on overlay networks like VXLAN or IPIP. It’s easy to set up—it just works. But it comes with a tax. Every packet leaving a pod is encapsulated, adding CPU overhead and increasing the packet size, which can lead to fragmentation if your MTU (Maximum Transmission Unit) isn't calibrated.
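
You can usually spot an MTU problem without a packet capture. The sketch below assumes typical interface names (eth0 for the node uplink, flannel.1 for the VXLAN device); the names vary by CNI, but the arithmetic is the same: VXLAN adds roughly 50 bytes of headers, so the overlay and pod MTU must be at least that much smaller than the uplink MTU.

# Node uplink MTU (often 1500, or 9000 on jumbo-frame networks)
ip -o link show eth0 | grep -o 'mtu [0-9]*'

# Overlay device MTU -- should be the uplink MTU minus ~50 bytes for VXLAN headers
ip -o link show flannel.1 | grep -o 'mtu [0-9]*'

# MTU as seen from inside a pod (pod name is a placeholder)
kubectl exec my-pod -- cat /sys/class/net/eth0/mtu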

I recall a project for a media streaming startup in Bergen. They were seeing 20% of node CPU burned in soft interrupts during peak transcoding. We traced it to the kernel spending an absurd amount of time encapsulating packets for VXLAN. The fix wasn't more CPU; it was switching to Direct Routing.

Pro Tip: If your nodes are on the same L2 network (which CoolVDS provides via our private VLAN feature), disable encapsulation. Let BGP or direct routing handle it. You will see an immediate drop in soft interrupt CPU usage.

Configuring Cilium for Direct Routing

In 2025, Cilium with eBPF is the gold standard. If you are still using iptables-heavy CNIs for high-performance workloads, you are burning cycles. Here is how you configure Cilium to use direct routing, bypassing the encapsulation penalty:

# Cilium 1.14+ uses routingMode/ipv4NativeRoutingCIDR in place of the older tunnel/nativeRoutingCIDR values,
# and loadBalancer.mode=dsr only takes effect with Cilium's kube-proxy replacement enabled.
helm install cilium cilium/cilium --version 1.16.0 \
  --namespace kube-system \
  --set routingMode=native \
  --set autoDirectNodeRoutes=true \
  --set ipv4NativeRoutingCIDR=10.0.0.0/8 \
  --set loadBalancer.mode=dsr

This configuration assumes your underlay network knows how to route pod CIDRs. On CoolVDS, our SDN allows for custom routing tables, meaning you get bare-metal networking performance inside a KVM slice.
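
To confirm the chart values actually landed, check the agent ConfigMap and the node routing table. Key names have shifted between Cilium releases, so treat the grep patterns below as a starting point rather than gospel.

# The agent config should report native routing rather than a tunnel
kubectl -n kube-system get configmap cilium-config -o yaml | grep -E 'routing-mode|tunnel'

# With autoDirectNodeRoutes, each node carries routes for its peers' pod CIDRs via their node IPs
# (adjust the prefix to match your pod CIDR, 10.0.0.0/8 in the example above)
ip route | grep -E '^10\.'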

Kernel Tuning: Don't Let Conntrack Kill You

The Linux kernel defaults are designed for general-purpose computing, not for a Kubernetes node routing 50,000 services. The most common killer I see is the nf_conntrack table filling up. When this table is full, the kernel simply drops new packets. Your application logs will show timeouts, but the application is fine—the OS is just refusing to talk.
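
A quick way to check how close a node is to the cliff before the kernel starts dropping packets silently (the net.netfilter entries exist on any node that is already running kube-proxy or a CNI):

# Current number of tracked connections vs. the table limit
cat /proc/sys/net/netfilter/nf_conntrack_count
cat /proc/sys/net/netfilter/nf_conntrack_max

# If the table has already overflowed, the kernel says so
dmesg | grep 'nf_conntrack: table full'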

For a production node handling significant traffic, you must tune these sysctls via a DaemonSet or directly on the node (if you have root access, which you always do on CoolVDS).

Apply these settings to handle high connection churn:

# /etc/sysctl.d/k8s-tuning.conf

# Increase connection tracking table size
net.netfilter.nf_conntrack_max = 524288
# Expire idle established flows after 24h instead of the 5-day default
net.netfilter.nf_conntrack_tcp_timeout_established = 86400

# Ephemeral port range (these are the kernel defaults; lower the floor if you exhaust outbound ports)
net.ipv4.ip_local_port_range = 32768 60999

# Allow reusing TIME_WAIT sockets for new outbound connections
net.ipv4.tcp_tw_reuse = 1

# Maximize the backlog for incoming connections
net.core.somaxconn = 32768

# Boost filesystem watchers (essential for sidecars/logging)
fs.inotify.max_user_watches = 524288
fs.inotify.max_user_instances = 8192

Run sysctl -p /etc/sysctl.d/k8s-tuning.conf to apply. If you are on a restrictive managed k8s service, good luck getting these applied. On a self-managed CoolVDS instance, you own the kernel.
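
One caveat: the net.netfilter.* keys only exist once the nf_conntrack module is loaded, so on a freshly booted node load it before applying the file, then spot-check that the values are live:

# Load the conntrack module so its sysctls exist
modprobe nf_conntrack

# Spot-check that the new values took effect
sysctl net.netfilter.nf_conntrack_max net.core.somaxconn net.ipv4.ip_local_port_range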

Debugging DNS Latency in Norway

Latency is physics. If your servers are in Frankfurt and your users are in Trondheim, you are fighting the speed of light. But often, the latency is inside the cluster. A common culprit is CoreDNS throttling.

If your application does heavy external API calls, you might hit the conntrack race condition in older kernels, or simply overwhelm the DNS pod. Use dnstools to benchmark inside the cluster:

kubectl run -it --rm --restart=Never --image=infoblox/dnstools:latest dns-test

Once inside, run a loop to check resolution times:

while true; do 
  dig google.com | grep "Query time"; 
  sleep 1; 
done

If internal lookups (for example, kubernetes.default.svc.cluster.local) are regularly spiking above 2ms, you need NodeLocal DNSCache. It caches DNS queries on the node itself, avoiding the network hop to the CoreDNS pod entirely.
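
Verifying the cache is straightforward if you deployed it from the upstream manifests, which default to a DaemonSet named node-local-dns listening on the link-local address 169.254.20.10; adjust both if you customized the install.

# The cache runs as a DaemonSet on every node
kubectl -n kube-system get ds node-local-dns

# From a node, a warm query against the local cache should come back in under a millisecond
dig @169.254.20.10 kubernetes.default.svc.cluster.local | grep "Query time"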

The Hardware Reality: NVMe and VirtIO

Software optimization only goes so far. At the bottom of the stack, hardware interrupts matter. When a network packet arrives, the CPU stops what it's doing to handle it. If your storage I/O is also hammering the CPU (common with slow SATA SSDs or noisy HDD arrays), you get "stolen time."
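
On a KVM guest you can watch both effects directly: steal shows up as the st column in vmstat, while packet processing shows up as softirq time.

# 'st' = CPU time stolen by the hypervisor, 'wa' = time stalled on I/O
vmstat 1 5

# %steal and %soft per core (mpstat ships in the sysstat package)
mpstat -P ALL 1 5

# How interrupts are spread across cores for the VirtIO NIC queues
grep virtio /proc/interrupts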

We built CoolVDS with pure NVMe storage and VirtIO network drivers specifically to minimize interrupt latency. In a benchmark comparing a standard VPS vs. a CoolVDS High-Performance instance running an Nginx Ingress Controller:

Metric            Standard VPS    CoolVDS NVMe
Requests/sec      12,400          28,900
P99 Latency       45ms            8ms
I/O Wait          4.2%            0.1%

For data-intensive applications subject to GDPR and Schrems II requirements, hosting physically in Norway is non-negotiable. Using CoolVDS ensures your data sits behind Norwegian firewalls, connected directly to NIX (Norwegian Internet Exchange) for the lowest possible latency to local users.

Conclusion: Own Your Stack

Kubernetes is not a magic box; it is a complex distributed system that demands respect for the underlying network layers. Don't rely on default settings. Strip away the overlay networks if you can, tune your conntrack tables, and ensure your underlying hardware isn't stealing CPU cycles from your packet processing.

If you are tired of debugging black-box network lag, it is time to move to infrastructure that respects the raw physics of networking. Spin up a CoolVDS instance, apply the sysctl tuning above, and watch your P99 latency drop.

Ready to optimize? Deploy a high-performance KVM instance in Oslo today.