Disaster Recovery in the Post-Schrems II Era: A Pragmatic Guide for Norwegian CTOs

Let’s be honest. Most disaster recovery (DR) plans are theoretical documents that gather dust until a server melts down at 3 AM on a Sunday. If you are operating in Norway today, relying on the "cloud magic" of US-based hyperscalers isn't just a technical risk; it's a legal liability. Since the Schrems II ruling, sending customer data across the Atlantic—even for backups—has become a minefield.

I’ve audited infrastructure where the "DR Strategy" was a cron job running a local tar script. That works until the disk controller fails. In this guide, we aren't talking about abstract concepts. We are talking about RPO (Recovery Point Objective), RTO (Recovery Time Objective), and the specific configurations required to keep your data safe, compliant, and recoverable within minutes, not days.

The Compliance Headache: Why Location Matters

Before we touch the terminal, we need to address the elephant in the server room: Data Sovereignty. The Norwegian Data Protection Authority (Datatilsynet) is increasingly strict. If your primary production server is in Oslo but your backups are replicated to an unverified region, you are non-compliant.

We see this constantly. A CTO deploys a robust application but overlooks the fact that the automated snapshotting service pipes data to a bucket in Frankfurt that is legally owned by a US entity. This is why we advocate for sovereign infrastructure. When you provision a CoolVDS instance, the data stays on NVMe arrays physically located in compliance-friendly zones. It doesn't leave the jurisdiction unless you explicitly tell it to.

The 3-2-1 Rule: Updated for 2022

The classic rule is simple: 3 copies of data, 2 different media, 1 offsite. In a virtualized environment, "media" usually means "storage backend." Here is the pragmatic implementation for a high-availability stack:

  • Copy 1: Production NVMe (Hot data).
  • Copy 2: Local ZFS snapshot or LVM backup (Fast restore; a snapshot sketch follows below).
  • Copy 3: Encrypted, off-site repository (Disaster insurance).
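
For Copy 2, a local snapshot costs almost nothing on a copy-on-write filesystem. Below is a minimal cron sketch; the dataset name tank/appdata is an assumption, and on LVM a scheduled lvcreate --snapshot plays the same role.

# /etc/cron.d/zfs-snapshot  (sketch; assumes a ZFS dataset named tank/appdata)
# Hourly, timestamped snapshot -- near-instant on ZFS, so Copy 2 stays fresh
0 * * * * root /usr/sbin/zfs snapshot tank/appdata@auto-$(date +\%Y\%m\%d-\%H\%M)

Rotate old snapshots with zfs destroy in a second job, or let a retention tool like sanoid handle that for you.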

Phase 1: Database Resilience (PostgreSQL 14 Example)

Dumping SQL files is fine for small sites, but for anything serious, you need Point-in-Time Recovery (PITR). This requires archiving your Write-Ahead Logs (WAL). If your server crashes, you replay these logs to restore the state to the exact second before the failure.

Here is how you configure postgresql.conf for robust WAL archiving. This assumes you have mounted a separate volume or network share for archives:

# /etc/postgresql/14/main/postgresql.conf

# Turn on archiving
wal_level = replica
archive_mode = on

# The command that copies each completed WAL segment to the archive location.
# The 'test ! -f' guard refuses to overwrite a segment that has already been archived.
archive_command = 'test ! -f /mnt/backups/wal_archive/%f && cp %p /mnt/backups/wal_archive/%f'

# Ensure we don't lose data on simple crashes
fsync = on
synchronous_commit = on

Pro Tip: On CoolVDS NVMe instances, setting wal_compression = on is often a net positive. The CPU overhead is negligible on modern KVM slices compared to the reduction in WAL volume you get in return.
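
Archiving is only half of PITR; you also need periodic base backups and a restore configuration on the replacement node. The snippet below is a minimal sketch, assuming the same archive path as above, a Debian-style PostgreSQL 14 layout, and a replication role named replication_user (the target time is an example; pick a moment just before the incident).

# 0. Take a periodic base backup -- the starting point that archived WALs are replayed on top of
pg_basebackup -D /mnt/backups/base/$(date +%F) -Ft -z -P -U replication_user

# 1. On the replacement node, unpack the base backup into the data directory, then
#    tell PostgreSQL where the WAL archive lives and when to stop replaying:
#
#    # /etc/postgresql/14/main/postgresql.conf
#    restore_command      = 'cp /mnt/backups/wal_archive/%f %p'
#    recovery_target_time = '2022-05-08 02:55:00'   # example: just before the failure

# 2. Request recovery mode and start the server
touch /var/lib/postgresql/14/main/recovery.signal
systemctl start postgresql@14-main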

Phase 2: Encrypted Offsite Backups with Restic

Legacy backup tools are slow. In 2022, we use Restic. It’s fast, secure by default, and deduplicates data, saving you money on storage. Restic encrypts your data before it leaves your server. This is critical for GDPR compliance.

Here is a standard deployment script to initialize a repo and push a backup. We use SFTP to push to a secondary storage server (perhaps a high-storage HDD instance distinct from your production NVMe node).

# 1. Initialize the repository (do this once)
# In production, read the password from a root-only file via --password-file instead of hard-coding it
export RESTIC_PASSWORD="SuperSecurePassword_ChangeMe"
restic -r sftp:user@backup-node.coolvds.com:/srv/backups/app01 init

# 2. Run the backup (add this to cron)
# We exclude cache directories to save bandwidth
restic -r sftp:user@backup-node.coolvds.com:/srv/backups/app01 backup \
  --exclude='/var/cache' \
  --exclude='/tmp' \
  /var/www/html /etc/nginx /var/lib/redis
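
A backup job without a retention policy grows until the disk fills, and an unverified repository is a gamble. The retention counts below are assumptions; tune them to match your RPO and storage budget.

# Keep 7 daily, 4 weekly and 6 monthly snapshots; drop the rest and reclaim space
restic -r sftp:user@backup-node.coolvds.com:/srv/backups/app01 forget \
  --keep-daily 7 --keep-weekly 4 --keep-monthly 6 --prune

# Verify repository integrity (add --read-data for a full, slower check)
restic -r sftp:user@backup-node.coolvds.com:/srv/backups/app01 check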

Phase 3: Infrastructure as Code (IaC)

Backups are useless if you don't have a server to restore them to. If your main node goes dark, you need to spin up a replacement fast. Manual configuration is slow and error-prone.

We use Ansible to define the state of our servers. Below is a snippet that ensures your web server environment is ready to accept a data restore immediately after provisioning a new CoolVDS instance.

# playbook.yml
---
- hosts: webservers
  become: yes
  vars:
    nginx_worker_connections: 1024

  tasks:
    - name: Install Nginx and dependencies
      apt:
        name: ['nginx', 'certbot', 'python3-certbot-nginx']
        state: present
        update_cache: yes

    - name: Tune Nginx Configuration
      lineinfile:
        path: /etc/nginx/nginx.conf
        regexp: '^\s*worker_connections'
        line: "\tworker_connections {{ nginx_worker_connections }};"
      notify: restart_nginx

    - name: Ensure firewall allows HTTP/HTTPS
      ufw:
        rule: allow
        name: 'Nginx Full'

  handlers:
    - name: restart_nginx
      service:
        name: nginx
        state: restarted
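
To converge a freshly provisioned replacement node, add it to the group the playbook targets; new-node.example.com below is a placeholder for the new instance's address.

# inventory.ini -- the replacement node goes into the group the playbook targets
[webservers]
new-node.example.com ansible_user=root

Run ansible-playbook -i inventory.ini playbook.yml --check first for a dry run, then drop --check to converge for real. Once Nginx and the firewall are in place, pull the data back with Restic exactly as in Phase 2.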

The "Hidden" Variable: Restore Speed (RTO)

Many CTOs calculate storage costs but ignore I/O throughput during restoration. When you are restoring 500GB of data, the difference between a standard SATA SSD and NVMe is the difference between being down for 4 hours or 40 minutes.

We benchmarked a 100GB restore operation (compressed tarball extraction) on different storage tiers available in the Nordic market:

Infrastructure Type           | Storage Tech          | Restore Time (100GB) | Throughput Consistency
Budget VPS Provider           | Shared SATA SSD       | ~42 Minutes          | High fluctuation (noisy neighbors)
Major Public Cloud (Standard) | Network Block Storage | ~28 Minutes          | Throttled by IOPS limits
CoolVDS                       | Local NVMe            | ~9 Minutes           | Consistent / Dedicated
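
Your numbers will differ, so measure your own storage tier before you need it. A quick sequential read with fio is a rough proxy for tarball extraction; this sketch assumes fio is installed and that /var/tmp sits on the tier you care about with a few gigabytes free.

# Sequential read of a 4 GB test file, bypassing the page cache
fio --name=restore-sim --directory=/var/tmp --rw=read --bs=1M --size=4G \
    --numjobs=1 --direct=1 --ioengine=libaio --group_reporting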

This is why we insist on KVM virtualization at CoolVDS. Unlike OpenVZ or LXC, where resources can be overcommitted at the kernel level, KVM provides stricter isolation. When you need to burn CPU cycles to decrypt and decompress your backups, the resources are actually there.

Testing Your Plan: The "Chaos Monkey" Approach

A plan is just a hypothesis until tested. You don't need Netflix's budget to run Chaos Engineering. Start small:

  1. The Network Cut: Use iptables to drop all packets to your database (snippet below). Does your app fail over gracefully, or does it show a stack trace to the user?
  2. The Restore Drill: Once a quarter, spin up a fresh CoolVDS instance and rebuild your production environment strictly from backups. Time it (see the timed restore sketch further down). If it takes longer than your SLA allows, optimize your Restic compression levels or upgrade your instance size.

# Simulating a network partition (do NOT run this on prod unless you are ready)
iptables -A INPUT -p tcp --dport 5432 -j DROP

# Remove the rule once the drill is over
iptables -D INPUT -p tcp --dport 5432 -j DROP
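
For the restore drill, the only metric that matters is wall-clock time from an empty directory to usable data. A sketch against the Phase 2 repository, restoring into a scratch target:

# Time a full restore of the latest snapshot into a scratch directory
time restic -r sftp:user@backup-node.coolvds.com:/srv/backups/app01 \
    restore latest --target /srv/restore-drill

# Measure this against your RTO -- if it blows the SLA, fix it now, not at 3 AM on a Sunday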

Conclusion

In the Nordic market, where reliability is expected and privacy is mandated, cutting corners on DR is a career-limiting move. You need a strategy that accounts for legal compliance (GDPR), technical realities (latency and IOPS), and human error.

CoolVDS isn't just a vendor; we are the foundation of your recovery plan. Our infrastructure is built in Norway, for professionals who understand that uptime is an engineering discipline, not luck. Don't wait for the inevitable hardware failure or fat-finger error.

Next Step: Audit your current backup restoration speed today. If it's too slow, spin up a high-performance NVMe instance on CoolVDS and see the difference raw power makes.