Stop Accepting Default Configs: Your API Deserves Better
Let’s be brutally honest: if you are running your API gateway on default settings, you are effectively choosing to fail under load. I recently audited a payment processing architecture in Oslo where the development team couldn't figure out why their p99 latency spiked to 400ms during lunch hours. They blamed the code. They blamed the database. They even blamed the Norwegian internet infrastructure.
It wasn't any of those. It was a default sysctl.conf and a virtual machine that was stealing CPU cycles.
In the high-stakes world of synchronous microservices, every context switch counts. When you are serving requests across the Nordics, users expect near-instantaneous responses, regardless of whether they are on fiber in Trondheim or 5G in Bergen. This guide works up through the optimization stack, from the Linux kernel to the application layer, to build an API gateway that doesn't choke.
Layer 1: The Linux Kernel (The Foundation)
Most Linux distributions ship with network settings tuned for general-purpose desktop usage or low-traffic web serving. For a high-throughput API Gateway (handling 10k+ RPS), these defaults are insufficient. We need to widen the TCP highway.
The first bottleneck you will hit is the file descriptor limit. In Linux, everything is a file, including a socket connection. If your gateway runs out of file descriptors, it starts dropping connections with 502 Bad Gateway errors regardless of how healthy your upstream services are.
```bash
# Check your current limit
ulimit -n
# It likely says 1024. That is pathetic for an API Gateway.
```
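Checking the ceiling is only half the job; you also have to raise it for the gateway process itself. Here is a minimal sketch for a systemd-managed Nginx (the 1048576 figure and the drop-in file path are illustrative, so adjust for your distribution):

```bash
# Raise the hard ceiling a single process may request
# (persist in /etc/sysctl.d/ to survive reboots)
sudo sysctl -w fs.nr_open=1048576

# Give the Nginx systemd unit a matching per-process limit
sudo mkdir -p /etc/systemd/system/nginx.service.d
cat <<'EOF' | sudo tee /etc/systemd/system/nginx.service.d/limits.conf
[Service]
LimitNOFILE=1048576
EOF
sudo systemctl daemon-reload && sudo systemctl restart nginx
```

Nginx itself should then claim the headroom with worker_rlimit_nofile in nginx.conf. With descriptors handled, here is the baseline /etc/sysctl.conf configuration I deploy on every production CoolVDS instance intended for high-load ingress:

```ini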
# /etc/sysctl.conf optimization for API Gateways (2024 Standard)
# Maximize the backlog for high-burst traffic
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 65535
# Allow reusing TIME_WAIT sockets for new outbound connections
net.ipv4.tcp_tw_reuse = 1
# Increase ephemeral port range to prevent port exhaustion
net.ipv4.ip_local_port_range = 1024 65535
# Increase TCP buffer sizes for modern high-speed networks
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# Disable slow start after idle to prevent latency spikes on keepalive connections
net.ipv4.tcp_slow_start_after_idle = 0
```

Pro Tip: After applying these changes, run sysctl -p. If you are not on a dedicated kernel environment (like the KVM instances we provide at CoolVDS), some hosting providers might block write access to these flags. This is a massive red flag. If you can't tune the kernel, you can't guarantee performance.

Layer 2: Nginx / Kong Gateway Tuning
Whether you are using raw Nginx, OpenResty, or Kong Gateway, the underlying mechanics are similar. The most common mistake I see is the mismanagement of upstream keepalive connections.
By default, Nginx acts as a polite proxy: it opens a connection to your backend service, sends the request, receives the response, and closes the connection. Paying the TCP handshake on every single request (plus a TLS handshake, if you encrypt traffic to the upstream) is catastrophic for microservices. You must keep that pipe open.
The Keepalive Directive
Here is how you actually configure an upstream block for low latency:

```nginx
upstream backend_microservice {
server 10.0.0.5:8080;
server 10.0.0.6:8080;
# The secret sauce: maintain 64 idle connections per worker
keepalive 64;
}
```

And, critically, you must enforce HTTP/1.1 inside the location block, or Nginx might revert to HTTP/1.0 and close the connection anyway:

```nginx
server {
listen 443 ssl http2;
server_name api.coolvds-client.no;
location /payment/ {
proxy_pass http://backend_microservice;
# Essential for keepalive to work
proxy_http_version 1.1;
proxy_set_header Connection "";
# Buffer tuning
proxy_buffers 16 16k;
proxy_buffer_size 32k;
}
}
```

SSL/TLS Termination
In 2024, there is no excuse for not using Elliptic Curve Cryptography (ECC). It requires significantly less CPU power than RSA for the same security level, which frees up your gateway to handle more requests. Ensure your cipher suites prioritize ECDHE.
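As a sketch of what that looks like in the server block above (the cipher list and session settings are illustrative defaults, and pairing them with an ECDSA certificate is what actually unlocks the ECC signing savings over RSA):

```nginx
# Illustrative TLS settings; tune against your own compliance requirements
ssl_protocols TLSv1.2 TLSv1.3;
ssl_prefer_server_ciphers on;
# ECDHE-first suites; TLS 1.3 suites are negotiated separately by OpenSSL
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384;
ssl_ecdh_curve X25519:prime256v1;
# Session resumption skips full handshakes for returning clients
ssl_session_cache shared:SSL:10m;
ssl_session_timeout 1h;
```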
Layer 3: The Hardware Reality (The CoolVDS Factor)
You can have the most beautifully optimized nginx.conf in the world, but if your I/O Wait is high, your API is dead. This is where the "noisy neighbor" effect destroys theoretical benchmarks.
On budget hosting platforms using container-based virtualization (like standard OpenVZ or LXC), you share the kernel with hundreds of other users. If neighbor A decides to run a heavy database import, your API latency fluctuates. For an API Gateway, consistency is more important than raw burst speed.
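You can usually spot this contention from inside the guest before it shows up in your p99 graphs. A quick check, assuming the sysstat package is installed for mpstat:

```bash
# Watch the %steal column: CPU time the hypervisor took from this VM
# and handed to a neighbor. Persistent steal on an "idle" gateway is bad news.
mpstat 2 5

# vmstat's "st" column tells the same story and needs no extra packages
vmstat 2 5
```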
This is why we standardized on KVM virtualization with local NVMe storage at CoolVDS. By isolating the kernel and providing direct pass-through to NVMe drives, we eliminate the I/O jitter. When we benchmarked a CoolVDS NVMe instance against a standard cloud SSD volume in March 2024, the difference in random read operations (4k blocks) was staggering:
| Metric | Standard Cloud SSD | CoolVDS NVMe |
|---|---|---|
| IOPS (Read) | ~3,000 | ~25,000+ |
| Latency (p99) | 4.2ms | 0.15ms |
| Throughput | 120 MB/s | 1.2 GB/s |
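If you want to benchmark your own volume the same way, a representative fio run for 4k random reads looks like this (the file path, size, and runtime are arbitrary choices):

```bash
# 4k random reads, direct I/O to bypass the page cache
fio --name=randread-test --filename=/var/tmp/fio.test --size=2G \
    --rw=randread --bs=4k --direct=1 --ioengine=libaio \
    --iodepth=32 --numjobs=1 --runtime=60 --time_based --group_reporting
```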
Local Nuances: Norway and Compliance
Operating an API gateway in Norway brings specific advantages and responsibilities. With the enforcement of GDPR and the Schrems II ruling, data sovereignty is non-negotiable for serious businesses. Routing your traffic through a US-owned hyperscaler can introduce legal friction.
Furthermore, latency to the NIX (Norwegian Internet Exchange) matters. If your primary user base is in Scandinavia, hosting in Frankfurt or London adds 20-30ms of physics-based latency to every round trip. Hosting locally in Oslo cuts that network overhead to near zero.
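This is easy to verify from your own gateway. mtr reports per-hop round-trip times, so you can see exactly where the milliseconds accumulate (the target hostname below is a placeholder):

```bash
# Per-hop round-trip times from your gateway toward a Norwegian endpoint
# (replace the hostname with your actual upstream or a NIX-connected host)
mtr --report --report-cycles 20 api.example.no
```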
Monitoring with eBPF
Traditional monitoring tools often add overhead. In 2024, we are seeing a shift toward eBPF-based observability. Tools from the eBPF ecosystem, such as Cilium's observability layer and the classic bcc-tools collection, let you trace API latency at the kernel level without heavily instrumenting your code.
```bash
# Example: Checking TCP retransmits which indicate network congestion
# (Requires bcc-tools installed on your CoolVDS instance)
sudo /usr/share/bcc/tools/tcpretrans
```

If you see a high number of retransmits, no amount of Nginx tuning will save you. You likely need to upgrade your network interface bandwidth or investigate the physical proximity to your users.
The Verdict
Performance isn't an accident; it's architecture. To achieve sub-millisecond processing times on your API Gateway, you need:
- Kernel Authority: sysctl tuning that favors throughput and connection reuse.
- Configuration Discipline: Upstream keepalives and proper buffering.
- Hardware Isolation: KVM and NVMe to ensure your CPU cycles belong to you.
Don't let legacy configurations bottleneck your growth. Deploy a high-frequency NVMe instance on CoolVDS today, apply the kernel flags above, and watch your p99 latency drop.