Stop Blaming Your Backend: It's Your Gateway Config
I recently audited a fintech setup in Oslo. Their architecture was sound—microservices in Go, Redis caching, the works. Yet, during peak trading hours, latency spiked unpredictably. They blamed the database. They blamed the network provider. They were wrong.
The culprit? Default Linux kernel settings and an unoptimized NGINX ingress controller. When you are pushing 10,000 requests per second (RPS), standard configurations fall apart. The TCP stack gets clogged with connections in TIME_WAIT, and file descriptors run out faster than you can restart the service. If you are serving traffic to Nordic users, where expectations for speed are set by the ultra-low latency of NIX (Norwegian Internet Exchange), a 200ms delay is an eternity.
This guide isn't about generic "best practices." It is about the specific, aggressive tuning required to handle high-concurrency API traffic as of November 2024. We will focus on NGINX (as the standard gateway engine) running on Ubuntu 24.04 LTS.
1. The Kernel: Open Your File Descriptors
Before touching the web server, you must fix the OS limitations. By default, Linux is polite. It assumes you are running a desktop, not a high-performance gateway. The biggest bottleneck is the open file limit. In Linux, everything is a file—including a TCP connection. The default limit of 1024 is laughable for an API gateway.
Check your current limits:
ulimit -n

If it says 1024, you are throttled. Here is how to fix it permanently in /etc/security/limits.conf:
# /etc/security/limits.conf
* soft nofile 1000000
* hard nofile 1000000
root soft nofile 1000000
root hard nofile 1000000

These limits are applied through PAM (the pam_limits module, enabled by default on Ubuntu), so verify them after logging back in. Next, we need to tune the system-wide limits in sysctl.conf. We aren't just increasing limits; we are changing how the kernel handles TCP packet queuing.
Pro Tip: On virtualized hardware, ensure your provider isn't enforcing a secondary limit at the hypervisor level. This is a common issue with OpenVZ or LXC containers. We use KVM at CoolVDS specifically to ensure that when you set a kernel limit, it's actually respected.
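One caveat worth a hedged aside: services launched by systemd do not pass through PAM, so limits.conf on its own will not raise the limit for the nginx unit. Either set LimitNOFILE= via systemctl edit nginx or rely on worker_rlimit_nofile in nginx.conf (covered below). A quick sanity check on the running master process, assuming the default Ubuntu pid file at /run/nginx.pid:

```bash
# Confirm the file-descriptor limit actually reached the running nginx master.
# Assumes the default pid file location on Ubuntu (/run/nginx.pid).
grep 'open files' /proc/$(cat /run/nginx.pid)/limits
```

If the hard limit printed here is still 1024, the new limits never reached the service, no matter what limits.conf says.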
Apply the following network stack hardening:
# /etc/sysctl.conf
# Maximize the number of open files system-wide
fs.file-max = 2097152
# Increase the backlog for incoming connections
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 65535
# Reuse sockets in TIME_WAIT for new outbound connections (safe on modern kernels)
net.ipv4.tcp_tw_reuse = 1
# Increase ephemeral port range to allow more outbound connections
net.ipv4.ip_local_port_range = 1024 65535
# Protect against SYN flood attacks while maintaining throughput
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_syn_retries = 2
# BBR Congestion Control (Standard for high throughput in 2024)
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr

Apply these with sysctl -p.
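A quick verification pass never hurts; on stock Ubuntu 24.04 kernels the tcp_bbr module should load automatically once the sysctl is set, but it is worth confirming:

```bash
# Verify the settings took effect after sysctl -p.
sysctl net.ipv4.tcp_congestion_control   # expect: bbr
sysctl net.core.default_qdisc            # expect: fq
sysctl fs.file-max net.core.somaxconn
```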
2. NGINX Configuration: The Engine Tuning
Now that the road is paved, let's tune the engine. The default nginx.conf is designed for compatibility, not raw I/O. For an API Gateway, we care about keepalives and worker efficiency. SSL handshakes are expensive; you want to minimize them.
Here is a production-ready block for the main and events context:
worker_processes auto;
# Match this to your system file limits
worker_rlimit_nofile 1000000;
events {
worker_connections 65535;
use epoll;
multi_accept on;
}

The multi_accept on directive tells a worker to accept all new connections at once, rather than one by one. This is crucial during traffic bursts.
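The events block only governs how workers accept connections. Since the goal here is to minimize handshakes, it pays to pair it with client-facing keepalive settings in the http context. A minimal sketch, with values that are starting points rather than gospel:

```nginx
http {
    # Let clients reuse one TLS connection for many API calls instead of re-handshaking.
    keepalive_timeout  65s;
    keepalive_requests 10000;

    # Free descriptors held by clients that stopped responding.
    reset_timedout_connection on;
}
```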
Upstream Keepalives
This is where 90% of configurations fail. NGINX defaults to HTTP/1.0 for upstream connections (talking to your backend microservices) and closes the connection after every request. This forces your backend to open a new TCP handshake for every single API call. It kills performance.
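Before you change anything, you can watch the churn this causes from the gateway itself. A quick check, assuming the backend address used in the example below (10.0.0.5):

```bash
# Count gateway-side sockets stuck in TIME_WAIT toward one backend.
watch -n1 "ss -tan state time-wait dst 10.0.0.5 | wc -l"
```

Once upstream keepalives are enabled, this count should collapse to near zero.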
Configure your upstream block to use keepalives:
upstream backend_api {
server 10.0.0.5:8080;
server 10.0.0.6:8080;
# Keep 512 idle connections open to the backend
keepalive 512;
}
server {
location /api/ {
proxy_pass http://backend_api;
# Required for keepalive to work
proxy_http_version 1.1;
proxy_set_header Connection "";
}
}

3. The Hardware Reality: NVMe and Neighbors
You can apply every config tweak in the world, but software cannot fix bad hardware. In an API Gateway context, I/O wait is the enemy. Even if you aren't writing logs to disk (and you shouldn't be; ship them asynchronously to an ELK stack), the OS performs constant paging and read operations.
On budget VPS providers, you often share the physical host with "noisy neighbors": other tenants saturating disk I/O, which leaves your API stalling for milliseconds while it waits on the disk controller. For an e-commerce checkout or a financial transaction, those milliseconds stack up.
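To see whether you are actually paying an I/O wait tax, watch the disk under load; this sketch assumes the sysstat package is installed:

```bash
# %iowait above a few percent on a gateway is a red flag; high r_await/w_await
# with modest throughput points at a saturated or shared disk.
iostat -x 1 5
```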
This is why architecture matters.
| Feature | Standard VPS | CoolVDS Architecture |
|---|---|---|
| Storage | SATA SSD (Shared) | NVMe (Pass-through) |
| Virtualization | Container (LXC/OpenVZ) | KVM (Kernel-based VM) |
| Latency to NIX | 10-25ms | < 2ms |
We specifically engineered CoolVDS instances with local NVMe storage rather than network-attached storage (SAN). Network storage adds latency. Local NVMe is instant. When your NGINX cache sits on local NVMe, your cache HIT ratios translate to sub-millisecond response times.
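To make that last point concrete, here is a minimal micro-caching sketch; the cache path, zone name, and one-second validity window are assumptions to adapt, and it only makes sense for idempotent GET endpoints:

```nginx
# Cache directory on the local NVMe-backed filesystem (path is an example).
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=api_cache:100m
                 max_size=10g inactive=60m use_temp_path=off;

server {
    location /api/ {
        proxy_cache            api_cache;
        proxy_cache_valid      200 1s;                  # micro-cache hot responses
        proxy_cache_use_stale  updating timeout;        # absorb backend hiccups
        add_header             X-Cache-Status $upstream_cache_status;
        proxy_pass             http://backend_api;      # upstream defined earlier
    }
}
```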
4. Security Without the Latency Tax
Security usually comes at the cost of speed. In 2024, that trade-off is minimized if you use the right protocols. We are talking about TLS 1.3. It reduces the handshake overhead by one full round-trip compared to TLS 1.2.
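Once the block below is in place, you can confirm what actually gets negotiated; your-api-endpoint.com stands in for your real hostname, and this requires OpenSSL 1.1.1 or newer:

```bash
# Print the negotiated protocol and cipher for a TLS 1.3 handshake.
openssl s_client -connect your-api-endpoint.com:443 -tls1_3 </dev/null 2>/dev/null \
  | grep -E 'Protocol|Cipher'
```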
Ensure your SSL configuration looks like this:
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
ssl_prefer_server_ciphers on;
# OCSP Stapling: NGINX serves the revocation status, saving the client a lookup
ssl_stapling on;
ssl_stapling_verify on;
resolver 1.1.1.1 8.8.8.8 valid=300s;
resolver_timeout 5s;

5. The Nordic Context: Data Sovereignty
Performance isn't just about speed; it's about compliance efficiency. If your API gateway is routing traffic through Frankfurt or Amsterdam for Norwegian users, you are adding 20-30ms of latency unnecessarily. Furthermore, with the Datatilsynet (Norwegian Data Protection Authority) tightening enforcement on GDPR transfers, keeping data within Norwegian borders simplifies your legal architecture.
CoolVDS data centers are located directly in Oslo. This provides two massive advantages:
- Physical Proximity: Speed of light constraints are real. Distance equals latency.
- Legal Compliance: Data stays in Norway, simplifying Schrems II compliance for your clients.
Conclusion: Test, Don't Guess
Once you have applied these changes, do not assume they worked. Test them. Use k6 or wrk to hammer your endpoint.
wrk -t12 -c400 -d30s https://your-api-endpoint.com/health

If you see timeouts, check your dmesg logs for dropped packets. If you see high CPU usage but low throughput, you are likely context-switching too much; check your worker process count.
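Two places worth looking when the numbers disappoint, sketched with standard iproute2 and util-linux tools:

```bash
# Kernel counters for accept-queue overflows and SYN cookie activity.
nstat -az | grep -Ei 'ListenOverflows|ListenDrops|SyncookiesSent'

# Recent kernel messages mentioning drops or overflows.
sudo dmesg -T | grep -iE 'drop|overflow' | tail -n 20
```

Non-zero ListenOverflows usually means somaxconn or the NGINX listen backlog is still too low for your burst profile.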
High-performance hosting is a game of inches. You fight for every millisecond. If you are tired of fighting your hosting provider's hardware limitations, it might be time to move your gateway to a platform built for the task.
Ready to drop your latency? Deploy a high-frequency NVMe instance on CoolVDS today and experience the difference of raw, unthrottled KVM power.