Your PDF Disaster Recovery Plan Will Fail. Here is the Code Instead.
I have seen grown men cry in server rooms. It wasn't because of a breakup or a lost football match. It was because /var/lib/mysql was gone, the RAID controller had silently corrupted the mirror three weeks ago, and their only backup was a shell script that hadn't run since Christmas. Hope is not a strategy. If your Disaster Recovery (DR) plan is a document stored on the same file server that just caught fire, you don't have a plan. You have a suicide note.
In the Nordic market, where latency to the NIX (Norwegian Internet Exchange) is measured in single-digit milliseconds and Datatilsynet (The Norwegian Data Protection Authority) does not accept "oops" as a legal defense, uptime is binary. You are up, or you are bleeding money. This guide isn't about theory. It's about the cold, hard mechanics of surviving a catastrophic infrastructure failure using tools available in 2025.
The "Ransomware-Proof" Architecture
The biggest threat in 2025 isn't hardware failure; it's an automated script encrypting your filesystem. Traditional backups are useless here because the attacker will find your mounted backup drive and encrypt that too. You need immutability.
Pro Tip: Never mount your backup storage permanently on the production server. Use a "pull" mechanism where a secured backup server connects to production, pulls the data, and closes the connection.
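Here is one way to wire up that pull, as a sketch: it assumes a hardened backup host that holds a restricted SSH key for production and keeps the restic repository locally. The hostnames, paths, and password file below are placeholders.

```bash
#!/usr/bin/env bash
# Runs on the BACKUP host, never on production: production holds no
# credentials for the backup storage, so ransomware on the web server
# cannot reach the snapshot history.
set -euo pipefail

PROD_HOST="prod01.example.internal"              # placeholder production hostname
STAGING="/srv/staging/prod-www"                  # local working copy on the backup host
export RESTIC_REPOSITORY="/srv/restic-repo"      # local repo, initialised once with `restic init`
export RESTIC_PASSWORD_FILE="/root/.restic-pass" # placeholder password file

# 1. Pull the current web root over a restricted SSH key.
mkdir -p "${STAGING}"
rsync -a --delete -e "ssh -i /root/.ssh/backup_pull_key" \
  "${PROD_HOST}:/var/www/html/" "${STAGING}/"

# 2. Snapshot the pulled copy; existing snapshots stay untouched.
restic backup "${STAGING}" --tag production-pull

# 3. Verify repository integrity on every run, not once a year.
restic check
```

Because the repository and its password never touch production, an attacker who owns the web server can at worst poison the next snapshot, not rewrite the old ones.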
If you are running on CoolVDS, you have access to block-level snapshots. But for application consistency, you need to dump data properly before snapshotting. Here is a restic-based backup for the file layer; as long as the repository on the backup host is append-only, the production server can add new snapshots but can never overwrite or delete historical data, even when it is compromised.
# Initialize a repository with append-only mode restriction (conceptual implementation)
restic init --repo sftp:user@backup-server:/srv/restic-repo
# The backup command
restic backup /var/www/html --exclude-file=excludes.txt --tag production

But raw files are easy. Databases are where careers end.
Database Replication: The Only RPO Zero Strategy
If you rely on nightly pg_dump, you are accepting up to 24 hours of data loss (RPO). For a high-traffic Magento store or a SaaS platform serving Oslo businesses, that is unacceptable. You need Point-in-Time Recovery (PITR).
In a CoolVDS environment, we configure PostgreSQL for Write-Ahead Log (WAL) archiving. Paired with a periodic base backup, this lets you replay every transaction up to the last archived segment, shrinking your RPO from a day to minutes; add synchronous streaming replication if you genuinely need zero. Here is the postgresql.conf configuration you need for a robust setup:
# /etc/postgresql/16/main/postgresql.conf
# Turn on archiving
wal_level = replica
archive_mode = on
# Ship WAL files to a secure location (CoolVDS Object Storage or separate NVMe volume)
archive_command = 'test ! -f /mnt/backups/%f && cp %p /mnt/backups/%f'
# Performance tuning for NVMe (crucial for CoolVDS instances)
random_page_cost = 1.1
effective_io_concurrency = 200
Why random_page_cost = 1.1? Because CoolVDS uses pure NVMe storage. The Postgres default of 4.0 assumes spinning rust. If you don't tune this, the query planner will undervalue index scans and fall back to sequential scans, hammering both CPU and disk.
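Archiving WAL only pays off if you also take periodic base backups and have rehearsed the replay. Here is a minimal sketch of both halves, assuming the archive and base backups live under /mnt/backups and that volume is reachable from the recovery instance; the paths, dates, and recovery target are placeholders.

```bash
#!/usr/bin/env bash
set -euo pipefail

PGVER=16
DATADIR="/var/lib/postgresql/${PGVER}/main"
BASE="/mnt/backups/base-2025-01-06"   # placeholder: the base backup you are restoring from

# --- On production: take a base backup (plain format, WAL streamed alongside) ---
sudo -u postgres pg_basebackup -D "/mnt/backups/base-$(date +%F)" -Fp -X stream -P

# --- On the recovery server: lay down the base backup, then replay archived WAL ---
systemctl stop postgresql
rm -rf "${DATADIR}"
cp -a "${BASE}" "${DATADIR}"

# Tell Postgres where the archived WAL lives and how far forward to replay.
cat >> "${DATADIR}/postgresql.auto.conf" <<'EOF'
restore_command = 'cp /mnt/backups/%f %p'
recovery_target_time = '2025-01-06 03:55:00'   # placeholder: last known-good moment
EOF

# recovery.signal switches the next startup into point-in-time recovery mode.
touch "${DATADIR}/recovery.signal"
chown -R postgres:postgres "${DATADIR}"
systemctl start postgresql
```

Postgres will pull archived segments through restore_command and stop replaying at the target time, which is exactly what you want when the disaster was a bad deploy or a malicious DELETE rather than a dead disk.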
Automating the Resurrection with Ansible
Recovery time (RTO) is mostly human latency. It takes time to find credentials, provision a new VPS, and remember the config. Automate the infrastructure provisioning.
Below is a stripped-down Ansible playbook that provisions a recovery server. This script assumes you have spun up a fresh CoolVDS instance via API and need to get it ready for data restoration immediately.
---
- name: Disaster Recovery Protocol - Stage 1
  hosts: recovery_vps
  become: true
  vars:
    postgres_version: "16"
  tasks:
    - name: Ensure system is updated
      apt:
        update_cache: yes
        upgrade: dist

    - name: Install essential packages
      apt:
        name:
          - postgresql-{{ postgres_version }}
          - nginx
          - restic
          - htop
        state: present

    - name: Stop Postgres to prepare for data overwrite
      service:
        name: postgresql
        state: stopped

    - name: Restore Base Backup (Simulated)
      command: "/usr/bin/rsync -avz backup_user@offsite-storage:/var/lib/postgresql/{{ postgres_version }}/main/ /var/lib/postgresql/{{ postgres_version }}/main/"
      register: rsync_result

    - name: Fix permissions
      file:
        path: "/var/lib/postgresql/{{ postgres_version }}/main"
        owner: postgres
        group: postgres
        recurse: yes

    - name: Start Postgres
      service:
        name: postgresql
        state: started
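Running it takes one command against an inline inventory; the IP and filename below are placeholders, and after the data files land you still need the WAL replay steps from the PITR section before Postgres is consistent.

```bash
# The trailing comma tells Ansible this is an inline host list, not an inventory file.
ansible-playbook -i "203.0.113.50," -u root dr-stage1.yml
```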
The Local Factor: GDPR and Latency
Why not just dump everything to AWS S3 in Virginia? Schrems II. If you are handling Norwegian customer data, transferring it outside the EEA (European Economic Area) without strict safeguards is a legal minefield. Furthermore, the speed of recovery matters.
| Factor | US Cloud Storage | CoolVDS (Norway/Nordic) |
|---|---|---|
| GDPR Compliance | Complex / Risky | Native / Compliant |
| Latency to Oslo | 80ms - 120ms | 2ms - 10ms |
| Throughput Cost | High Egress Fees | Predictable / Unmetered |
Restoring 500 GB of data across the Atlantic takes hours: even at a sustained 500 Mbit/s you are looking at well over two hours of raw transfer before you replay a single WAL file. Restoring it from a CoolVDS backup node in the same datacenter (or a neighboring city) takes minutes. When your boss is breathing down your neck, those minutes save your job.
Testing the Unthinkable
A backup is Schrödinger's file until you restore it. You must run drill tests. I recommend a monthly "Game Day" where you spin up a temporary CoolVDS instance and run your Ansible recovery playbook against it.
If the script fails, you fix the script. If the data is corrupt, you fix the backup process. You do this on a Tuesday morning, not a Saturday night when the site is actually down.
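A minimal game-day harness, sketched under these assumptions: the playbook above is saved as dr-stage1.yml, the sandbox answers on a known IP, and your schema has something recent worth probing. The addresses, table, and column names are placeholders.

```bash
#!/usr/bin/env bash
# Monthly DR drill: rehearse the restore against a throwaway sandbox
# and fail loudly if any step is broken.
set -euo pipefail

SANDBOX_IP="203.0.113.50"     # placeholder: fresh sandbox instance
PLAYBOOK="dr-stage1.yml"      # the recovery playbook shown earlier

# 1. Run the recovery playbook against an ad-hoc, one-host inventory.
ansible-playbook -i "${SANDBOX_IP}," -u root "${PLAYBOOK}"

# 2. Smoke test the database: is it up, and how stale is the restored data?
ssh "root@${SANDBOX_IP}" \
  "sudo -u postgres psql -tAc 'SELECT now() - max(created_at) FROM orders;'"   # placeholder table/column

# 3. Smoke test the web tier.
curl -fsS -o /dev/null "http://${SANDBOX_IP}/" && echo "HTTP OK"

echo "Game day passed on $(date)"
```

Wire it into cron or your scheduler of choice, and treat a red run with the same urgency as a production incident.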
Disaster recovery is not a product you buy; it is a discipline you practice. However, the underlying metal matters. You need high IOPS to ingest restored data quickly, and you need a network that doesn't choke under load. This is why we built our infrastructure on high-performance NVMe arrays rather than cheap Ceph clusters.
Don't wait for the fire. Start your disaster recovery test today by deploying a sandbox instance on CoolVDS.