Production Backup Strategy for Self-Hosted

How to design and implement a production-grade backup strategy for self-hosted infrastructure using BorgBackup and Borgmatic — covering the 3-2-1 principle, encrypted deduplication, database-consistent snapshots, automated scheduling and tested disaster recovery.

06 / 02 / 2026 · 14 min read

THE PROBLEM

Why Most Self-Hosted Backup Strategies Fail

The most common failure in self-hosted backup strategies is not the absence of backups — it is the absence of tested recovery. A backup that has never been restored is not a backup. It is an untested assumption sitting on a hard drive.

Three failure patterns appear in almost every self-hosted incident post-mortem.

Scope Failure

Critical data exists outside the backup path. Configuration files, database dumps, and encryption keys are excluded because someone assumed they were covered elsewhere. They were not. After a disk failure, the data that matters most is exactly what was not backed up.

Consistency Failure

Backups run while databases are actively writing. The resulting snapshot captures a mid-transaction state that cannot be cleanly imported. The backup exists. The restore does not work. The difference only becomes apparent during an incident.

Recovery Failure

The restore procedure has never been executed on real hardware under real pressure. Documentation was written once and never validated. The first real test happens during an actual incident, with production data at risk and no time to troubleshoot.

A production backup strategy eliminates all three failure modes before they occur.

BACKUP PRINCIPLES

The 3-2-1 Rule Applied to Self-Hosted Infrastructure

The 3-2-1 rule is the minimum viable backup architecture: three copies of the data, on two different media types, with one copy stored offsite. For self-hosted infrastructure, this maps directly to a concrete implementation.

Copy 1 — Live Data

The primary copy is the production system itself — Docker volumes, configuration files, and databases running on the homeserver NVMe. This copy is always current but provides no protection against hardware failure, ransomware, or accidental deletion.

Copy 2 — Local Encrypted Repository

The second copy is an encrypted, deduplicated backup repository on a local USB drive. Updated daily via automated systemd timers, this provides fast local recovery without network dependency. The encryption ensures that physical access to the drive does not mean access to the data.

Copy 3 — Offsite Repository

The third copy is stored at a physically separate location — a second site, a trusted remote server, or an encrypted cloud target. This copy protects against physical failure, theft, and fire. Without it, a single physical incident can destroy both copies simultaneously.
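The three copies described above can be checked mechanically. The following is a minimal sketch, assuming each copy is described by a medium label and an offsite flag; the `BackupCopy` type and the medium names are illustrative, not part of BorgBackup or Borgmatic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str      # label for the copy (illustrative)
    medium: str    # e.g. "nvme", "usb-hdd", "remote-server"
    offsite: bool  # True if stored at a physically separate location


def satisfies_3_2_1(copies: list[BackupCopy]) -> bool:
    """True with at least 3 copies, 2 distinct media, and 1 offsite copy."""
    media = {c.medium for c in copies}
    return len(copies) >= 3 and len(media) >= 2 and any(c.offsite for c in copies)


copies = [
    BackupCopy("live data", "nvme", offsite=False),
    BackupCopy("local borg repository", "usb-hdd", offsite=False),
    BackupCopy("offsite borg repository", "remote-server", offsite=True),
]
print(satisfies_3_2_1(copies))      # True
print(satisfies_3_2_1(copies[:2]))  # False: two copies, none offsite
```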

Why BorgBackup

BorgBackup is the right tool for this architecture. It provides content-addressable deduplication, LZ4/ZSTD compression, and AES-256 encryption in a single tool. A 100GB dataset that changes 1% daily adds approximately 1GB per archive after the initial snapshot — making 12-month retention practical on modest hardware.
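The growth figure above follows from simple arithmetic. As a rough sketch, assuming every changed byte produces new chunks and ignoring both compression and the space that pruning later releases, an upper bound on repository size is:

```python
def estimate_repo_growth_gb(dataset_gb: float, daily_change: float, days: int) -> float:
    """Upper-bound estimate: initial snapshot plus the data changed each day."""
    initial = dataset_gb
    per_archive = dataset_gb * daily_change
    return initial + per_archive * days


print(estimate_repo_growth_gb(100, 0.01, 1))    # about 101 GB after one daily archive
print(estimate_repo_growth_gb(100, 0.01, 365))  # about 465 GB after twelve months
```

The real figure is usually lower, because compression shrinks the new chunks and pruning removes chunks that only expired archives referenced.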

Backup Architecture Overview

Complete data flow from live services through pre-backup hooks to encrypted repository storage and automated verification.

01 · Systemd timer triggers the backup at the scheduled time
02 · Pre-backup hooks write database dumps to the staging directory
03 · Borgmatic invokes BorgBackup, which locks the repository
04 · Changed chunks are deduplicated and compressed
05 · AES-256 encryption is applied before the write to storage
06 · Archive is written to the local USB repository
07 · Prune: archives outside the retention window are removed
08 · Compact: freed space is reclaimed from repository segments
09 · Notification sent: success or failure with archive stats

Daily automated backup pipeline with database-consistent snapshots
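The ordering in this pipeline matters: a failed database dump should stop the run before an archive is created, and a notification should go out either way. Borgmatic handles this sequencing itself; the sketch below only illustrates the control flow, with hypothetical step names and a caller-supplied `notify` function.

```python
from typing import Callable

Step = tuple[str, Callable[[], None]]


def run_pipeline(steps: list[Step], notify: Callable[[str], None]) -> bool:
    """Run steps in order; stop at the first failure and always notify."""
    for name, action in steps:
        try:
            action()
        except Exception as exc:  # any failing step aborts the run
            notify(f"FAILED at '{name}': {exc}")
            return False
    notify("SUCCESS")
    return True


def failing_dump() -> None:
    raise RuntimeError("database unreachable")


messages: list[str] = []
executed: list[str] = []
ok = run_pipeline(
    [
        ("dump databases", failing_dump),
        ("borg create", lambda: executed.append("create")),
        ("borg prune", lambda: executed.append("prune")),
    ],
    notify=messages.append,
)
print(ok, executed, messages)
```

Because the dump step raises, neither later step runs and the only message is the failure notice.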

Encrypted at Rest

All backup data is AES-256 encrypted before being written to storage. The encryption key never leaves the server in plaintext — physical access to the drive provides no access to the data.

Deduplicated

BorgBackup splits data into content-addressable chunks. Only changed chunks are stored across archives — daily backups of large datasets consume a fraction of the raw data size.

Database-Consistent

Pre-backup hooks capture clean, importable database dumps before BorgBackup runs. Every archive contains databases in a state that can be restored without repair or recovery procedures.

Retention-Managed

Automated pruning removes archives outside the configured retention window after every backup run. Old data is cleaned up automatically — the repository never grows unbounded.
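The deduplication property listed above can be illustrated with a toy chunk store. This is a simplified sketch, not Borg's implementation: it uses fixed-size chunks and plain SHA-256 identifiers, whereas Borg uses variable-size, content-defined chunking. The `ChunkStore` class is hypothetical.

```python
import hashlib


class ChunkStore:
    """Toy content-addressable store: each unique chunk is kept once."""

    def __init__(self, chunk_size: int = 4) -> None:
        self.chunk_size = chunk_size
        self.chunks: dict[str, bytes] = {}

    def add_archive(self, data: bytes) -> list[str]:
        """Store data and return the ordered chunk IDs that describe it."""
        ids = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            chunk_id = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(chunk_id, chunk)  # no-op if already stored
            ids.append(chunk_id)
        return ids

    def restore(self, ids: list[str]) -> bytes:
        """Reassemble the original data from its chunk IDs."""
        return b"".join(self.chunks[i] for i in ids)


store = ChunkStore()
monday = store.add_archive(b"AAAABBBBCCCC")
tuesday = store.add_archive(b"AAAABBBBDDDD")  # only the last chunk changed
print(len(store.chunks))  # 4 unique chunks back 6 chunk references
```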

BACKUP SCOPE

Defining What Gets Backed Up

The backup scope must be explicitly defined — never assumed. For a Docker-based homeserver, five categories cover everything that matters.

Service Data

Docker volumes mounted under the services directory contain all persistent application data — databases, configuration, uploaded files, and application state. The entire services directory belongs in the backup path. No exceptions.

System Configuration

/etc contains network configuration, systemd units, cron jobs, and every system-level customization that would need manual reconstruction after a reinstall. Losing /etc means rebuilding from memory. It costs nothing to include it.

User Data

/home and /root contain dotfiles, SSH keys, shell configurations, and operational scripts maintained outside the services directory. These are small in size and high in replacement cost.

Database Staging

A dedicated staging directory receives pre-backup database dumps before BorgBackup runs. These dumps capture databases in a consistent, importable state — not as raw data files that may be mid-transaction. The staging directory is included in the BorgBackup path.

Secrets and Keys

Encryption keys, API credentials, and backup passphrases require special handling. They must be stored outside the primary backup repository — because they are what you need to access the backup when everything else is gone. A passphrase stored only inside the encrypted repository is permanently inaccessible if the repository is the only thing that survived.
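One way to keep the scope explicit is to test the configured source directories against a list of paths that must be covered. The sketch below assumes hypothetical paths; substitute the directories from your own configuration.

```python
from pathlib import PurePosixPath


def uncovered(required: list[str], sources: list[str]) -> list[str]:
    """Return required paths that no configured source directory contains."""
    roots = [PurePosixPath(s) for s in sources]
    missing = []
    for item in required:
        path = PurePosixPath(item)
        # A path is covered if it equals a source or lives beneath one.
        if not any(root == path or root in path.parents for root in roots):
            missing.append(item)
    return missing


# Hypothetical paths for illustration only.
sources = ["/srv/services", "/etc", "/home", "/root"]
required = ["/srv/services/nextcloud", "/etc/systemd", "/var/backups/db-staging"]
print(uncovered(required, sources))  # ['/var/backups/db-staging']
```

Here the staging directory was never added to the source list, which is exactly the kind of scope failure described earlier.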

DATABASE HOOKS

Database-Consistent Backups with Pre-Backup Hooks

Running BorgBackup against live database files produces backups that cannot be reliably restored. Database engines coordinate writes across multiple files — a snapshot taken mid-write captures an inconsistent state. The fix is pre-backup hooks that produce clean, importable dumps before BorgBackup runs.

MariaDB

The dump command uses --single-transaction to acquire a consistent read snapshot without locking tables (this guarantee applies to transactional storage engines such as InnoDB), combined with --quick to stream rows rather than buffering the entire dataset in memory. Output is gzip-compressed and written to the staging directory with a datestamped filename.
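A pre-backup hook along these lines could look like the following sketch. It assumes the `mariadb-dump` client is installed (older installations ship it as `mysqldump`), that credentials come from a client option file, and that the filename pattern is free to choose; none of these details are taken from the original setup.

```python
import gzip
import shutil
import subprocess
from datetime import date
from pathlib import Path


def dump_command() -> list[str]:
    """argv for a consistent, streaming dump of all databases."""
    return ["mariadb-dump", "--single-transaction", "--quick", "--all-databases"]


def dump_target(staging: Path, day: date) -> Path:
    """Datestamped, gzip-suffixed output file inside the staging directory."""
    return staging / f"mariadb-{day.isoformat()}.sql.gz"


def run_dump(staging: Path, day: date) -> Path:
    """Stream the dump through gzip into staging; raise if the dump fails."""
    target = dump_target(staging, day)
    with subprocess.Popen(dump_command(), stdout=subprocess.PIPE) as proc:
        with gzip.open(target, "wb") as out:
            shutil.copyfileobj(proc.stdout, out)
    if proc.returncode != 0:
        target.unlink(missing_ok=True)  # do not leave a truncated dump behind
        raise RuntimeError(f"dump failed with exit code {proc.returncode}")
    return target


print(dump_target(Path("/staging"), date(2026, 2, 6)))  # /staging/mariadb-2026-02-06.sql.gz
```

Raising on a non-zero exit code matters: a hook that fails quietly would let BorgBackup archive a truncated dump.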

InfluxDB

The native backup command (influxd backup -portable on InfluxDB 1.x, influx backup on 2.x) captures the complete database state. On 1.x this includes retention policies and continuous queries. The output is a portable archive that restores cleanly to a compatible InfluxDB instance without manual schema reconstruction.

SQLite-Based Services

For services backed by SQLite, the Docker volumes are included directly in the BorgBackup path. A raw copy of a live SQLite database is not guaranteed to be consistent, even in WAL mode: the main database file and its -wal file can be read at different moments while the application is writing. WAL mode lets readers and writers work concurrently, but it does not make a file-level copy atomic. The safer approach is a SQLite online backup via the .backup command, which writes a clean copy to staging before BorgBackup runs.
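The online backup is also available from Python's standard library through `sqlite3.Connection.backup`. The sketch below uses temporary files as stand-ins for a service database and the staging copy; the file and table names are illustrative.

```python
import sqlite3
import tempfile
from pathlib import Path


def backup_sqlite(source: Path, target: Path) -> None:
    """Write a consistent copy of a (possibly live) SQLite database to target."""
    src = sqlite3.connect(source)
    dst = sqlite3.connect(target)
    try:
        src.backup(dst)  # SQLite online backup API
    finally:
        dst.close()
        src.close()


with tempfile.TemporaryDirectory() as tmp:
    live = Path(tmp) / "live.sqlite3"      # hypothetical service database
    staged = Path(tmp) / "staged.sqlite3"  # hypothetical staging copy
    app = sqlite3.connect(live)
    app.execute("CREATE TABLE items (name TEXT)")
    app.execute("INSERT INTO items VALUES ('vault entry')")
    app.commit()
    backup_sqlite(live, staged)            # the application connection is still open
    app.close()
    check = sqlite3.connect(staged)
    rows = check.execute("SELECT name FROM items").fetchall()
    check.close()

print(rows)  # [('vault entry',)]
```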

Services Requiring Maintenance Mode

Some applications — notably Nextcloud — must be placed into maintenance mode before backup to prevent file state from changing during the backup window. The pre-backup hook activates maintenance mode, BorgBackup runs, and a post-backup hook deactivates it. Skipping this step risks backing up a database and file store that are out of sync.
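The important property of the maintenance-mode hooks is that the mode is switched off again even when the backup fails. A context manager expresses that guarantee directly. This sketch assumes Nextcloud runs in a Docker container named `nextcloud` (a hypothetical name) and takes the command runner as a parameter so it can be exercised without Docker.

```python
from contextlib import contextmanager
from typing import Callable, Iterator

Runner = Callable[[list[str]], None]

# Assumption: Nextcloud runs in a container named "nextcloud" (hypothetical).
OCC = ["docker", "exec", "-u", "www-data", "nextcloud", "php", "occ"]


@contextmanager
def maintenance_mode(run: Runner) -> Iterator[None]:
    """Enable maintenance mode, and always disable it again afterwards."""
    run(OCC + ["maintenance:mode", "--on"])
    try:
        yield
    finally:
        run(OCC + ["maintenance:mode", "--off"])


calls: list[list[str]] = []
try:
    with maintenance_mode(calls.append):
        raise RuntimeError("backup failed")  # simulate a failing backup step
except RuntimeError:
    pass
print([c[-1] for c in calls])  # ['--on', '--off']
```

In a real hook the runner would be something like `lambda argv: subprocess.run(argv, check=True)`.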

Backup Security Architecture

Four security layers protecting backup integrity and preventing unauthorized access to archived data.

Layer 01 · Encryption

AES-256 Repokey Encryption

All data encrypted before write to disk / Passphrase stored separately from backup media / Key export stored in secure offsite location / No plaintext data ever written to backup storage

Layer 02 · Integrity

Repository Verification

Weekly borg check validates repository consistency / Archive checksums verified against stored manifests / Corruption detected before it becomes a recovery problem / Automated alerts on verification failure

Layer 03 · Access Control

Filesystem Permissions

Backup scripts run as root with restricted PATH / Passphrase file readable only by root / Repository directory not accessible to service users / Mount point restricted to privileged access

Layer 04 · Secrets Management

Key & Passphrase Strategy

Passphrase stored in encrypted self-hosted password manager / Physical paper key stored offsite for emergency access / Borg key export maintained separately from repository / Recovery procedure documented and tested quarterly
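The access-control layer can be verified rather than assumed. As a minimal sketch for POSIX systems, the check below confirms that a file grants no permissions to group or others; it uses a temporary file as a stand-in for the passphrase file and does not check that the owner is root.

```python
import os
import stat
import tempfile


def only_owner_has_access(path: str) -> bool:
    """True if group and others have no permissions on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & 0o077) == 0


with tempfile.NamedTemporaryFile() as f:  # stand-in for the passphrase file
    os.chmod(f.name, 0o600)
    strict = only_owner_has_access(f.name)
    os.chmod(f.name, 0o644)
    loose = only_owner_has_access(f.name)

print(strict, loose)  # True False
```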

Backup Performance Characteristics

Typical metrics for a production self-hosted infrastructure backup using BorgBackup with deduplication and LZ4 compression on ARM64 hardware.

Initial Backup: ~2–4 hours
Daily Incremental: ~3–8 minutes
Deduplication Ratio: 60–80% space saving
Compression (LZ4): 20–40% additional saving
Recovery Time: ~15–45 minutes
Retention (Daily): 14 archives
Retention (Weekly): 8 archives
Retention (Monthly): 12 archives

RETENTION POLICY

Pruning, Compacting, and Managing Repository Growth

Without a retention policy, a backup repository grows indefinitely. Eventually it fills the available storage and the next backup fails silently. The data you needed most is gone because the disk was full when the incident happened.
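A full disk is easier to catch before the backup starts than after it fails. The sketch below is one possible pre-flight check; the threshold is an assumption to tune for your repository, and the path would be the repository mount point rather than the current directory.

```python
import shutil


def has_headroom(path: str, min_free_fraction: float) -> bool:
    """True if the filesystem holding path has at least the given free share."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total >= min_free_fraction


# Example: require at least 10% free space before starting a backup run.
if not has_headroom(".", 0.10):
    print("WARNING: backup target is nearly full")
```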

keep_daily / keep_weekly / keep_monthly

Borgmatic implements retention via three parameters. The recommended configuration for a self-hosted homeserver: 14 daily archives, 8 weekly archives, 12 monthly archives. This covers granular recovery for recent incidents and long-term recovery for corruption or accidental deletion discovered weeks later.
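To see what these three parameters keep, it helps to simulate the selection. The sketch below is a simplified approximation of the pruning rules, not Borg's actual code: each rule keeps the newest archive per period, and an archive already kept by an earlier rule does not count toward a later one. It assumes at most one archive per day.

```python
from datetime import date, timedelta


def select_kept(archives: list[date], keep_daily: int, keep_weekly: int,
                keep_monthly: int) -> set[date]:
    """Approximate prune selection: newest archive per period, rule by rule."""
    rules = [
        (keep_daily, lambda d: d.isoformat()),
        (keep_weekly, lambda d: tuple(d.isocalendar())[:2]),  # ISO year and week
        (keep_monthly, lambda d: (d.year, d.month)),
    ]
    kept: set[date] = set()
    for limit, period_of in rules:
        count, last_period = 0, None
        for day in sorted(archives, reverse=True):
            if count == limit:
                break
            period = period_of(day)
            if period != last_period:
                last_period = period
                if day not in kept:  # already kept by an earlier rule
                    kept.add(day)
                    count += 1
    return kept


newest = date(2026, 2, 6)
archives = [newest - timedelta(days=n) for n in range(60)]
kept = select_kept(archives, keep_daily=14, keep_weekly=8, keep_monthly=12)
print(sorted(kept)[0], len(kept))  # oldest archive kept, and how many survive
```

With only 60 days of history, the weekly and monthly rules cannot fill their quotas yet, so fewer than 34 archives survive.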

borg prune

After every backup run, borg prune removes archives outside the retention window. It does not immediately free disk space — it marks data as eligible for removal.

borg compact

borg compact actually reclaims the freed space from the repository's segment files. It is a separate operation that is frequently omitted, causing repositories to grow on disk despite active pruning. Both operations belong in every automated backup run.
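Borgmatic normally issues the equivalent commands itself, but seeing them side by side makes the two-step nature clear. The sketch below only builds the argument lists for Borg 1.2 or later; the repository path is a hypothetical example.

```python
def prune_and_compact(repo: str, daily: int, weekly: int, monthly: int) -> list[list[str]]:
    """argv lists for pruning, then reclaiming space (compact needs Borg 1.2+)."""
    prune = [
        "borg", "prune", "--list",
        "--keep-daily", str(daily),
        "--keep-weekly", str(weekly),
        "--keep-monthly", str(monthly),
        repo,
    ]
    compact = ["borg", "compact", repo]
    return [prune, compact]


# Hypothetical repository path for illustration.
for argv in prune_and_compact("/mnt/backup/borg", 14, 8, 12):
    print(" ".join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` in that order, so that compact only runs after a successful prune.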

Deduplication and Retention Interaction

BorgBackup's deduplication means that data shared between archives is only removed when the last archive referencing that data is pruned. A file deleted six months ago remains recoverable for as long as an archive from that period falls within the retention window. This makes the retention window the true recovery horizon — not the date of the last backup.
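The interaction can be modelled as reference counting. In this simplified sketch each archive is a set of chunk identifiers; the archive names and chunk labels are invented for illustration.

```python
def chunks_freed_by_prune(archives: dict[str, set[str]], pruned: str) -> set[str]:
    """Chunks that become unreferenced once the named archive is removed."""
    still_referenced: set[str] = set()
    for name, chunk_ids in archives.items():
        if name != pruned:
            still_referenced |= chunk_ids
    return archives[pruned] - still_referenced


archives = {
    "2025-08": {"os", "photos", "old-report"},
    "2026-01": {"os", "photos"},
    "2026-02": {"os", "photos", "new-report"},
}
print(chunks_freed_by_prune(archives, "2025-08"))  # {'old-report'}
print(chunks_freed_by_prune(archives, "2026-01"))  # set()
```

Pruning the oldest archive frees only the chunk that no newer archive shares, which is also the moment the deleted file stops being recoverable.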

DISASTER RECOVERY

Tested Recovery: The Only Metric That Matters

A backup strategy is only as reliable as its last tested recovery. Not its last successful backup run. Its last tested recovery — executed against real hardware, under real conditions, with the actual passphrase from offsite storage.

Hardware Preparation

Recovery begins before the incident. A replacement system must be available, running the same OS, with BorgBackup installed and the backup media connected. The passphrase and key export come from their offsite location, not from the failed system.

Archive Selection

borg list displays all available archives with timestamps. Select the most recent clean archive, or a specific point-in-time archive for targeted recovery. Individual paths can be extracted without restoring the full archive — enabling recovery of a single service without touching the rest of the system.
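Archive selection and partial extraction can be scripted. The sketch below assumes Borg 1.x, where `borg list --json` reports each archive with a `name` and a `start` timestamp and archives are addressed as `REPO::ARCHIVE`. The sample JSON, repository path, and service path are made up for illustration.

```python
import json


def latest_archive(borg_list_json: str) -> str:
    """Name of the newest archive in `borg list --json` output."""
    archives = json.loads(borg_list_json)["archives"]
    return max(archives, key=lambda a: a["start"])["name"]


def extract_command(repo: str, archive: str, paths: list[str]) -> list[str]:
    """argv restoring only the given paths into the current directory."""
    # Borg stores paths without the leading slash.
    return ["borg", "extract", f"{repo}::{archive}", *[p.lstrip("/") for p in paths]]


sample = json.dumps({"archives": [
    {"name": "homeserver-2026-02-05", "start": "2026-02-05T03:00:12.000000"},
    {"name": "homeserver-2026-02-06", "start": "2026-02-06T03:00:09.000000"},
]})
archive = latest_archive(sample)
print(extract_command("/mnt/backup/borg", archive, ["/srv/services/vaultwarden"]))
```

Run the resulting command from an empty working directory, since extraction writes relative to the current directory.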

Restore Order

Service restoration follows dependency order: container runtime first, then infrastructure services (reverse proxy, DNS), then data services (databases), then application services. Database dumps are imported before application containers start — applications that initialize against an empty database can corrupt their own schema.
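The dependency order can be written down as a graph and sorted, so the restore sequence is derived rather than remembered. This sketch uses the standard library's `graphlib` (Python 3.9+); the service names are illustrative.

```python
from graphlib import TopologicalSorter

# Each service maps to the services it depends on (names are illustrative).
dependencies = {
    "container-runtime": set(),
    "reverse-proxy": {"container-runtime"},
    "dns": {"container-runtime"},
    "database": {"reverse-proxy", "dns"},
    "database-import": {"database"},
    "application": {"database-import"},
}

restore_order = list(TopologicalSorter(dependencies).static_order())
print(restore_order[0], restore_order[-1])  # container-runtime application
```

Modelling the dump import as its own node makes the rule explicit: applications start only after their data has been loaded.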

Quarterly Recovery Drills

The recovery process must be tested on a quarterly schedule. Not simulated: executed. Real archives, real hardware or an isolated VM, full service validation. The drill validates that the restore command succeeds and that the restored services actually work. Recovery time should be recorded and compared against your acceptable recovery time objective (RTO). If it exceeds that target, the architecture needs revision before an incident forces that discovery.
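Recording the drill result can be as simple as comparing two timestamps against the target. A minimal sketch, with illustrative times and RTO values:

```python
from datetime import datetime, timedelta


def drill_within_rto(started: datetime, services_validated: datetime,
                     rto: timedelta) -> bool:
    """True if the measured recovery time is within the acceptable RTO."""
    return services_validated - started <= rto


start = datetime(2026, 2, 6, 9, 0)
done = datetime(2026, 2, 6, 9, 40)  # illustrative drill: 40 minutes
print(drill_within_rto(start, done, timedelta(minutes=45)))  # True
print(drill_within_rto(start, done, timedelta(minutes=30)))  # False
```

The clock should stop when the restored services have been validated, not when the extract command returns.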

Production Backup Implementation Checklist

Complete this checklist before considering your backup strategy production-ready.

BorgBackup and Borgmatic installed and verified

Backup repository initialized with repokey encryption

Passphrase stored in password manager and offsite physical copy

Borg key exported and stored separately from repository

All critical data paths explicitly defined in backup configuration

Database pre-backup hooks configured and tested for each engine

Borgmatic configuration validated with dry-run

Systemd timer active and verified

Notifications configured for success and failure

First full backup completed and archive listed successfully

Retention policy configured with prune and compact enabled

Weekly repository integrity check scheduled

Recovery procedure documented step-by-step

Test recovery executed against isolated hardware

Restored services validated as fully functional

Recovery time recorded and meets acceptable RTO

Never Store Your Encryption Key With Your Backup

The BorgBackup repokey and passphrase are the only way to access your encrypted repository. If they are stored only on the same hardware as the backup — or only in a password manager with no physical offsite copy — a single point of failure makes your entire archive permanently inaccessible. Export the key, print a paper copy, store it somewhere physically separate from your server. Do this before you need it.
