Kubernetes Networking Deep Dive: Surviving the Packet Jungle in Production

If I had a krone for every time a "microservices architecture" failed because someone didn't understand how packets move between Pods, I'd have bought a private island in the Oslofjord by now. Kubernetes networking is deceptively simple on paper: every Pod gets an IP, and they can all talk to each other. In reality, it is a complex beast of encapsulation, iptables rules, and latency penalties.

I've spent the last month debugging a cluster for a fintech client in Oslo. Their latency was spiking randomly. The culprit wasn't their Go code; it was a saturated nf_conntrack table and a noisy neighbor on their public cloud provider stealing CPU cycles needed for packet encapsulation. Networking isn't just software; it's bound by the iron laws of hardware.

This is a technical deep dive. We are looking at the state of Kubernetes networking as of January 2023, specifically for high-performance deployments in the Nordic region.

The CNI Battlefield: Calico vs. Cilium

In 2023, the choice for Container Network Interface (CNI) largely boils down to two philosophies: standard iptables (Calico) or eBPF (Cilium). For years, Calico was the default. It uses BGP to route packets and acts like a standard router. It is reliable. But reliability isn't enough when you are pushing gigabits of traffic.

We are seeing a massive shift toward Cilium this year. Why? Because iptables is linear: with 5,000 Services, the kernel has to traverse a huge chain of rules for every packet. eBPF (extended Berkeley Packet Filter) lets us attach sandboxed programs inside the kernel, replacing that rule traversal with hash-map lookups and skipping parts of the netfilter path entirely.

Pro Tip: If you are running on kernels older than 5.7, stick to Calico with kube-proxy in IPVS mode. If you are on modern kernels (like the ones we maintain on CoolVDS KVM images), Cilium is the performance winner.
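
For the Cilium route, the switch that matters most is the kube-proxy replacement. Here is a minimal Helm values sketch, assuming Cilium 1.12+ installed from the cilium/cilium chart (the versions current in early 2023); the API server address and the tunnelling mode are placeholders you must adapt to your cluster:

# values.yaml for the cilium/cilium Helm chart
kubeProxyReplacement: strict       # eBPF handles Service load balancing instead of kube-proxy
k8sServiceHost: 10.0.0.10          # placeholder: your API server address
k8sServicePort: 6443
tunnel: vxlan                      # or "disabled" for native routing if your underlay permits it
bpf:
  masquerade: true                 # eBPF-based masquerading instead of iptables

Install it with helm install cilium cilium/cilium -n kube-system -f values.yaml and check the agent with cilium status. With kubeProxyReplacement set to strict, you can run the cluster without kube-proxy at all.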

Configuring IPVS Mode for Calico

If you stick with Calico, do not leave kube-proxy in its default iptables mode for Service proxying. Switch it to IPVS (IP Virtual Server), which uses kernel hash tables instead of linear rule lists, so lookup cost stays flat as the Service count grows. Note that IPVS is a kube-proxy setting rather than a Calico one; Calico simply works alongside it.

# IPVS is a kube-proxy setting, not a Calico one: edit the kube-proxy ConfigMap
# in kube-system (key "config.conf") and restart the kube-proxy DaemonSet.
# The nodes need the IPVS kernel modules loaded (ip_vs, ip_vs_rr, nf_conntrack).
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  scheduler: "rr"    # round-robin; "lc" (least connection) is another common choice
# Leave bpfEnabled: false in Calico's FelixConfiguration while kube-proxy
# still handles Service load balancing.

The DNS Black Hole

"It's always DNS." In Kubernetes, this meme is a painful reality. By default, CoreDNS is often under-provisioned. I recently audited a setup where a single CoreDNS replica was handling queries for 500 nodes. The latency to resolve database.svc.cluster.local was over 200ms.

To fix this, you need to force CoreDNS to use TCP for upstream queries if reliability is paramount, and you absolutely must tune the ndots configuration in your application deployments.
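
The ndots tuning lives in the Pod spec, not in CoreDNS. With the default ndots:5, an external name such as api.example.com gets every search domain appended and tried first, multiplying query volume before the real lookup ever leaves the cluster. A minimal sketch with hypothetical names, dropping ndots to 2:

# Pod template snippet (e.g. inside a Deployment); all names are illustrative
apiVersion: v1
kind: Pod
metadata:
  name: api
  namespace: production
spec:
  containers:
  - name: api
    image: registry.example.com/api:1.0    # placeholder image
  dnsConfig:
    options:
    - name: ndots
      value: "2"

With ndots set to 2, short internal names like database still resolve through the search path, while fully qualified external domains are tried as-is on the first attempt.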

Optimizing CoreDNS ConfigMap

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
           lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
           pods insecure
           fallthrough in-addr.arpa ip6.arpa
           ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf {
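           # force_tcp  (uncomment to force TCP for upstream queries, per the note above)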
           max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }

Notice the max_concurrent setting on the forward plugin? Raise it if your workloads hammer the upstream resolvers; the values shipped in many stock Corefiles are too conservative for high-traffic microservices.

Ingress & The "Norwegian" Context

When you expose services to the internet, you are dealing with Ingress Controllers. NGINX is still the king here in early 2023. However, the default NGINX configuration is not tuned for modern WebSockets or the long-lived connections common in financial dashboards.

Furthermore, if your target audience is in Norway, network placement matters. Routing traffic through Frankfurt or London to reach a user in Bergen adds unnecessary milliseconds. You want your nodes peering directly at NIX (Norwegian Internet Exchange).

Tuning NGINX for High Throughput

You need to inject these snippets into your NGINX Ingress Controller ConfigMap to handle high loads without dropping connections.

apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
data:
  # Increase worker connections
  max-worker-connections: "65536"
  # Enable multi-accept to grab multiple connections at once
  enable-multi-accept: "true"
  # Optimize keepalive
  keep-alive: "120"
  upstream-keepalive-connections: "100"
  # Buffer tuning for larger payloads
  client-body-buffer-size: "100m"
  proxy-body-size: "100m"
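
The ConfigMap sets cluster-wide defaults. For the WebSockets and long-lived connections mentioned above, you usually also want per-Ingress timeout annotations so idle streams are not cut off after NGINX's default 60 seconds. A sketch, assuming the standard ingress-nginx controller; the Ingress name, host, and backend Service are hypothetical:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: trading-dashboard          # hypothetical
  namespace: production
  annotations:
    # Keep WebSocket / streaming connections alive for up to an hour of idle time
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
spec:
  ingressClassName: nginx
  rules:
  - host: dashboard.example.no     # placeholder host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: dashboard-svc    # hypothetical backend Service
            port:
              number: 80

Cluster-wide keepalive and buffer tuning goes in the ConfigMap; anything specific to one application belongs in annotations like these.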

The Hardware Reality: Why Virtualization Matters

Here is the uncomfortable truth: You can optimize your CNI and DNS all day, but if the underlying hypervisor is stealing your CPU cycles, your network performance will tank. Network packet processing in Kubernetes (especially with encapsulation protocols like VXLAN) is CPU intensive.

On oversold shared hosting, "Steal Time" (st) in top is your enemy. If your neighbors are mining crypto, your UDP packets get dropped. This is why for production Kubernetes, we only use KVM virtualization at CoolVDS. It provides strict isolation. We don't overprovision CPU cores for our premium tiers.

Feature | Container-based VPS (LXC/OpenVZ) | CoolVDS KVM (Hardware Virtualization)
Kernel Access | Shared with host (Dangerous for K8s) | Dedicated Kernel (Essential for eBPF)
Network Isolation | Weak | Strong (Tap interfaces)
Latency Consistency | Variable (Noisy neighbors) | Predictable
Custom Modules | Restricted | Allowed (Load ip_vs, br_netfilter)

Security and Compliance (GDPR & Schrems II)

In Norway, following the guidelines from Datatilsynet (the Norwegian Data Protection Authority) is not optional. Since the Schrems II ruling, relying on US-owned cloud providers has become a legal minefield for sensitive data. Running your Kubernetes cluster on sovereign hardware within Norway simplifies compliance massively.

You must enforce network policies. By default, K8s allows all traffic. This is a security nightmare. Lock it down. Here is a baseline NetworkPolicy that denies all ingress traffic to a namespace unless explicitly allowed.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress

Apply this, and then whitelist only what is necessary. It’s tedious, but it prevents lateral movement during a breach.
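
As a sketch of what that whitelisting looks like, here is a policy that lets only Pods labelled app: frontend reach Pods labelled app: api on port 8080; the labels and port are hypothetical, so substitute your own:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: api           # the Pods being protected
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend  # the only allowed client Pods (same namespace)
    ports:
    - protocol: TCP
      port: 8080

Because the default-deny policy above already selects every Pod in the namespace, any traffic that does not match an explicit allow rule like this one is simply dropped.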

Final Thoughts: Don't Skimp on the Foundation

Kubernetes networking is where the abstraction leaks. You need to understand the path of the packet. You need to choose the right CNI (Cilium for performance, Calico for compatibility), tune your CoreDNS, and crucially, run it on infrastructure that doesn't fight against you.

Latency issues are rarely magic; they are usually resource contention or misconfiguration. If you are tired of debugging network lag caused by noisy neighbors, it might be time to move your cluster to a platform designed for raw I/O.

Ready to lower your latency? Deploy a high-performance KVM instance on CoolVDS in Oslo today and give your packets the speed they deserve.