
Nordic Latency Killers: Advanced API Gateway Tuning for High-Throughput Systems

If your API Gateway adds more than 5ms to a request, you are doing it wrong. I don't care if you are routing traffic from Oslo to a database in Frankfurt or handling strictly local transactions within the NIX (Norwegian Internet Exchange) ecosystem; the gateway should be invisible.

Most VPS Norway providers sell you on vCPUs and RAM, but they conveniently ignore the underlying steal time that kills network I/O. When you are processing 50,000 requests per second (RPS), a 2% CPU steal caused by a noisy neighbor isn't a statistic—it's downtime.
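
You can measure steal time yourself before blaming the application. A minimal check, assuming the sysstat package (which provides mpstat) is installed:

# Sample CPU stats once per second, five times; watch the %steal column.
# Anything consistently above 1-2% under load means a neighbor is taxing your node.
mpstat 1 5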

We are going to tune an NGINX-based gateway (like Kong or vanilla NGINX) for raw performance. No fluff. Just the configs that actually work in 2025.

The "War Story": When 100ms Became a Crisis

I recently audited a fintech setup compliant with PSD2 regulations. They were hosting on a generic "hyperscaler" node in Stockholm, serving users in Bergen. Latency was averaging 120ms. For a payment API, that's sluggish. For high-frequency trading data, it's obsolete.

The culprit wasn't their Go application code. It was their gateway configuration. They were opening a new TCP connection to their upstream services for every single request. The TLS handshake overhead alone was eating 40ms per call. By the time we moved them to a dedicated KVM slice and enabled upstream keepalives, latency dropped to 18ms. Here is exactly how we did it.
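
First, measure before you tune. curl's per-phase timers expose exactly where the time goes; a quick sketch, with the endpoint below as a placeholder:

# time_connect = TCP handshake done; time_appconnect = TLS handshake done
curl -o /dev/null -s \
  -w 'tcp connect: %{time_connect}s\ntls done: %{time_appconnect}s\ntotal: %{time_total}s\n' \
  https://api.example.com/v1/health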

1. Kernel Tuning: The Foundation

Before touching NGINX, you must fix the Linux networking stack. By default, Linux is tuned to conserve RAM, not to handle the C10K problem. On a CoolVDS NVMe instance, you have the I/O throughput to push these limits.

Add these to /etc/sysctl.conf. These settings optimize the TCP stack for short-lived connections common in API traffic.

# Allow reusing sockets in TIME_WAIT state for new outbound connections
net.ipv4.tcp_tw_reuse = 1

# Increase the maximum number of open files (essential for high concurrency)
fs.file-max = 2097152

# Maximize the backlog of incoming connections
net.core.somaxconn = 65535
net.ipv4.tcp_max_syn_backlog = 65535

# Expand the local port range to avoid exhaustion
net.ipv4.ip_local_port_range = 1024 65535

# Use BBR congestion control for better throughput over the public internet
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr

Pro Tip: Always run sysctl -p after changes. If you are on a provider that restricts kernel tuning (as many container-based hosts do), move. You need a KVM-based environment like CoolVDS where you have full root control over the kernel.
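
Once applied, confirm the kernel actually accepted the settings. sysctl -p will error on the BBR line if the tcp_bbr module is unavailable, so a quick sanity check is worth the ten seconds:

# Confirm BBR and fq are active
sysctl net.ipv4.tcp_congestion_control   # expect: net.ipv4.tcp_congestion_control = bbr
sysctl net.core.default_qdisc            # expect: net.core.default_qdisc = fq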

2. NGINX / Kong Configuration

Whether you are using raw NGINX or Kong Gateway, the core directives are the same. The biggest performance killer is lack of keepalives to the upstream (your backend services).

Upstream Keepalive

Without this, your gateway performs a full TCP handshake + SSL handshake with your backend service for every incoming request. It wastes CPU cycles and adds massive latency.

upstream backend_service {
    server 10.0.0.5:8080;

    # The critical directive. Keeps up to 64 idle connections per worker open to the backend.
    keepalive 64;
}

Inside your location block, you must ensure the HTTP version is 1.1 and the connection header is cleared, or the keepalive won't work:

location /api/ {
    proxy_pass http://backend_service;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
}
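
To confirm keepalive is working under load, count established connections from the gateway to the upstream. With the pool above (10.0.0.5:8080), the count should hover near the keepalive size instead of churning through thousands of short-lived sockets:

# Established gateway-to-backend connections (tail strips the ss header line)
ss -tn state established '( dport = :8080 )' | tail -n +2 | wc -l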

Worker and File Descriptors

Ensure NGINX can utilize all cores and open enough files. On a CoolVDS High Performance plan with 8 vCPUs, manual CPU pinning is rarely needed, but setting worker_processes to auto is mandatory.

worker_processes auto;

# Must exceed worker_connections
worker_rlimit_nofile 65535;

events {
    # Determines how many clients a single worker can handle
    worker_connections 16384;
    use epoll;
    multi_accept on;
}
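
After reloading, put numbers on it. A quick benchmark sketch, assuming wrk is installed and /api/health is a cheap endpoint on your gateway:

# Validate the config, reload gracefully, then hammer the gateway
nginx -t && nginx -s reload
wrk -t8 -c512 -d30s --latency http://127.0.0.1/api/health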

3. SSL/TLS Termination Offloading

Decryption is expensive. In 2025, ChaCha20-Poly1305 is often the faster choice for client devices that lack AES-NI hardware acceleration (common on low-power ARM chips), while CoolVDS processors support AES-NI natively, so AES-GCM stays cheap server-side. Prioritize cipher suites that are computationally cheaper without sacrificing security.

ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305;
ssl_prefer_server_ciphers on;

# Cache SSL sessions to avoid full handshakes for returning clients
ssl_session_cache shared:SSL:50m;
ssl_session_timeout 1d;
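
To verify the session cache is paying off, openssl's -reconnect flag performs five reconnects over TLS 1.2 and reports whether the session was reused (TLS 1.3 resumes via tickets, which this check won't show). The hostname is a placeholder:

# "Reused" lines mean abbreviated handshakes; "New" on every line means the cache isn't working
openssl s_client -connect api.example.com:443 -tls1_2 -reconnect </dev/null 2>/dev/null | grep -E '^(New|Reused)'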

4. The Logging Trap

I have seen gateways crash simply because disk I/O couldn't keep up with access_log writes. Synchronous logging is a death sentence for high-throughput APIs.

If you need logs for compliance (Datatilsynet requirements often mandate audit trails), use a buffer. This writes to the disk in chunks rather than per request.

access_log /var/log/nginx/access.log main buffer=64k flush=5s;

On CoolVDS NVMe storage, write latency is negligible compared to spinning rust (HDD), but buffering is still best practice: it trades a few seconds of log freshness for far fewer syscalls and CPU context switches.
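
You can watch the buffering in action. Attaching strace to a worker (as root) shows the write() sizes; with buffer=64k you should see infrequent ~64 KB writes instead of one small write per request:

# Trace write syscalls on one nginx worker process
strace -e trace=write -p "$(pgrep -f 'nginx: worker' | head -n1)"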

Why Infrastructure Choice Dictates Performance

You can apply every config above, but if your host oversells its CPU, your p99 latency will be garbage. In the Nordic market, where internet speeds are among the fastest in the world, users notice sluggishness immediately.

We built the CoolVDS architecture specifically to address the "noisy neighbor" problem in API hosting:

  • KVM Virtualization: Strict isolation of resources. Your kernel tuning actually works.
  • Local Peering: Our routes to major Norwegian ISPs minimize hops. Low latency is physical, not just software.
  • NVMe Arrays: When your API gateway needs to cache large responses or write heavy logs, IOPS (Input/Output Operations Per Second) matter.

Final Thoughts

Optimization is an iterative process. Start with the kernel, secure your SSL handshake, and ensure your upstream connections are persistent.

Do not let your infrastructure be the bottleneck for your code. If you are serious about low-latency API delivery in Northern Europe, stop sharing CPU cycles with amateur blogs.

Ready to drop your latency? Deploy a high-performance KVM instance on CoolVDS in under 55 seconds and test these configs yourself.