
Kubernetes Networking Deep Dive: Why Your Packets Are Dropping in the Overlay


It’s 03:00. Your Grafana dashboard is screaming red. API latency just spiked from 25ms to 400ms, yet CPU usage is sitting idle at 12%. The pods are running and the readiness probes are passing, but the traffic isn't flowing.

Welcome to the seventh circle of hell: Kubernetes Networking.

Most developers treat K8s networking as a black box. You define a Service, throw an Ingress on top, and expect magic. But in production, magic doesn't exist. Physics exists. And in the distributed systems we run today, the network is usually where the bodies are buried. I’ve spent the last decade debugging packet drops across Nordic datacenters, and I can tell you: if you don't understand what happens between eth0 on your node and eth0 inside your container, you are flying blind.

The Overlay Tax: Encapsulation vs. Reality

When you deploy a cluster, you pick a CNI (Container Network Interface). If you picked the default implementation without reading the docs, you're likely running an overlay network using VXLAN. This wraps your Layer 2 frames inside Layer 4 UDP packets to transport them across the host network.

This adds overhead. We call it the "Overlay Tax."

In 2025, modern CPUs handle encapsulation reasonably well, but the real enemy isn't CPU cycles; it's MTU (Maximum Transmission Unit). I debugged a massive outage for a retail client in Oslo last month. Their backend services were timing out randomly.

The culprit? Their underlay network (the physical VPS interface) had an MTU of 1500. Their overlay network (VXLAN) also tried to push 1500 bytes. Add 50 bytes of VXLAN headers, and the packet exceeds the physical limit. The result? Fragmentation. Or worse, silent drops if the 'Don't Fragment' (DF) bit is set.

Check Your MTU Now

Don't guess. Check. SSH into one of your worker nodes and run:

ip -d link show flannel.1
# or for Calico
ip -d link show vxlan.calico

The mtu value in that output should be at least 50 bytes lower than your underlay MTU. If your VDS provider supports Jumbo Frames (MTU 9000), use them. If you are on a standard commodity cloud, lower your CNI MTU to leave room for the encapsulation headers, as in the sketch below.
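
Here is a minimal sketch of pinning the overlay MTU, assuming Calico with the VXLAN backend (Flannel and Cilium expose their own equivalents); 1450 assumes a 1500-byte underlay minus the 50-byte VXLAN overhead, so adjust for your own interface:

apiVersion: projectcalico.org/v3
kind: FelixConfiguration
metadata:
  name: default
spec:
  vxlanMTU: 1450   # underlay MTU (1500) minus VXLAN overhead (50)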

Pro Tip: At CoolVDS, our KVM infrastructure is tuned for high-performance networking. We pass the host's network capabilities directly to the guest where possible, minimizing the "noisy neighbor" effect on packet processing. When you control the vertical stack, you stop fighting the hypervisor.

eBPF: The Standard for 2025

If you are still relying on iptables for service routing (the legacy kube-proxy mode), stop. iptables evaluates its rules as a linear list, O(n) complexity. When you have 5,000 services, every packet has to traverse a massive chain of rules to find its destination. It kills latency.
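
To see how long that list gets on your own nodes, assuming kube-proxy is running in iptables mode, count the per-service NAT chains it has programmed:

# Each KUBE-SVC-* chain corresponds to one Service port
sudo iptables-save -t nat | grep -c '^:KUBE-SVC'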

In 2025, Cilium with eBPF (Extended Berkeley Packet Filter) is the gold standard. Instead of walking rule chains, service lookups become O(1) hash-map operations inside the kernel, hooked in at the socket and tc layers.

Here is a battle-tested Cilium Helm configuration we use for high-throughput clusters:

# API_SERVER_IP is a placeholder: use the address of your kube-apiserver endpoint
helm install cilium cilium/cilium --version 1.16.0 \
  --namespace kube-system \
  --set kubeProxyReplacement=true \
  --set bpf.masquerade=true \
  --set bandwidthManager.enabled=true \
  --set k8sServiceHost=API_SERVER_IP \
  --set k8sServicePort=6443
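
If you have the cilium command-line tool installed, a quick post-install sanity check (a sketch, not the full verification procedure from the Cilium docs):

# Wait until the Cilium agents and operator report ready, then print a summary
cilium status --wait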

Replacing kube-proxy with Cilium's eBPF implementation often reduces service-to-service latency by 30-40%. It’s not an optimization; it’s a requirement for modern microservices.

The "NIX" Factor: Latency Matters

Let's talk geography. If your target market is Norway, why are your servers in Frankfurt? The speed of light is constant. The round-trip time (RTT) from Oslo to Frankfurt is ~25-30ms. From Oslo to a datacenter connected to NIX (the Norwegian Internet Exchange)? Sub-5ms.

For a stateless API, 20ms might not matter. But for a database-heavy application or a real-time trading platform, that latency compounds on every query. If your app makes 10 sequential DB calls, you just added 200ms of pure waiting time.
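
Measure it rather than guessing; the hostname below is a placeholder for whatever endpoint your users or your database actually sit behind:

# Round-trip time over 20 probes
ping -c 20 db.example.internal
# Per-hop latency and loss along the path
mtr --report --report-cycles 50 db.example.internal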

Benchmarking Network IO

Don't trust the marketing on the "Pricing" page. Run iperf3 between your nodes. We see massive variance in budget VPS providers because they oversubscribe their 10Gbps uplinks.

# On Node A (Server)
iperf3 -s

# On Node B (Client): 30-second run (-t 30), four parallel streams (-P 4)
iperf3 -c [Node_A_IP] -t 30 -P 4

If you aren't getting consistent throughput, your provider is throttling you. This is why we stick to KVM virtualization at CoolVDS—resources are strictly isolated.

Debugging: When `kubectl` Isn't Enough

Sometimes you need to get dirty. When a pod can't resolve DNS, it's usually a CoreDNS issue or a UDP conntrack race condition.
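
Before spinning up debug pods, run one quick check on the node itself (assuming conntrack-tools is installed): a climbing insert_failed counter is the classic signature of the UDP conntrack race.

# Per-CPU conntrack statistics; watch whether insert_failed keeps growing
sudo conntrack -S | grep -o 'insert_failed=[0-9]*'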

Deploy a "netshoot" pod to debug inside the cluster network namespace:

apiVersion: v1
kind: Pod
metadata:
  name: netshoot
  namespace: default
spec:
  containers:
  - name: netshoot
    image: nicolaka/netshoot
    command: ["/bin/bash"]
    stdin: true    # keep stdin open so the interactive shell doesn't exit
    tty: true
    securityContext:
      capabilities:
        add:
        - NET_ADMIN
        - NET_RAW    # tcpdump needs raw socket access
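
Apply the manifest, then attach a shell:

kubectl exec -it netshoot -- /bin/bash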

Once inside, use tcpdump to see whether the DNS queries are actually leaving the pod.

tcpdump -i eth0 port 53 -n
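
From a second shell in the same pod, trigger a lookup so there is traffic to capture; the service name below is the in-cluster API server entry that exists in every cluster:

dig kubernetes.default.svc.cluster.local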

If you see requests leaving but no answers returning, check the Security Groups or Firewall rules on your node. If you see no requests, check your /etc/resolv.conf and the ndots configuration.
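
The usual ndots pain is Kubernetes' default of 5, which turns every external lookup into a string of search-domain misses first. Here is a minimal sketch of lowering it per pod via dnsConfig (the pod name and image are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: dns-tuned
spec:
  containers:
  - name: app
    image: nginx   # placeholder workload
  dnsConfig:
    options:
    - name: ndots
      value: "2"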

Ingress and the Gateway API

The Ingress API has served us well, but the Gateway API is the mature successor we needed. It separates the responsibilities of the infrastructure provider and cluster operator from those of the application developer, and it lets us attach policies like timeouts and retries at a granular level without messy controller-specific annotations.

Here is a basic HTTPRoute example that splits traffic, something the standard Ingress API cannot express without controller-specific annotations or custom NGINX snippets:

apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: traffic-split
spec:
  parentRefs:
  - name: my-gateway
  rules:
  - backendRefs:
    - name: v1-service
      port: 80
      weight: 90
    - name: v2-service
      port: 80
      weight: 10
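
The route's parentRefs entry assumes a Gateway named my-gateway already exists. A minimal sketch of one, assuming your controller registers a GatewayClass named cilium (Cilium, NGINX and others ship Gateway API implementations):

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: my-gateway
spec:
  gatewayClassName: cilium   # must match a GatewayClass installed in your cluster
  listeners:
  - name: http
    protocol: HTTP
    port: 80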

Data Sovereignty and Compliance

We cannot ignore the legal layer. Since Schrems II, moving personal data outside the EEA is a legal minefield. Using a US-based cloud provider often entails transfer mechanisms that Datatilsynet (The Norwegian Data Protection Authority) scrutinizes heavily.

Running your Kubernetes cluster on Norwegian soil, with a provider that guarantees data residency, isn't just about latency—it's about compliance. It simplifies your GDPR documentation massively when you can point to a rack in Oslo rather than a nebulous "Region: eu-north-1".

Final Thoughts

Kubernetes networking is unforgiving. It exposes every weakness in the underlying infrastructure. A cheap VPS with noisy neighbors, unstable interrupts, or throttled I/O will cause random 502 errors that no amount of YAML editing can fix.

You need a solid foundation. You need raw compute that respects the packets you send.

Stop debugging phantom latency. Spin up a CoolVDS NVMe instance today, install Cilium, and watch your p99 latency drop to the floor. The network shouldn't be a mystery—it should be a pipeline.