Production Backup Strategy for Self-Hosted Infrastructure
How to design and implement a production-grade backup strategy for self-hosted infrastructure using BorgBackup and Borgmatic — covering the 3-2-1 principle, encrypted deduplication, database-consistent snapshots, automated scheduling and tested disaster recovery.

THE PROBLEM
Why Most Self-Hosted Backup Strategies Fail
The most common failure in self-hosted backup strategies is not the absence of backups — it is the absence of tested recovery. A backup that has never been restored is not a backup. It is an untested assumption sitting on a hard drive.
Three failure patterns appear in almost every self-hosted incident post-mortem.
Scope Failure
Critical data exists outside the backup path. Configuration files, database dumps, and encryption keys are excluded because someone assumed they were covered elsewhere. They were not. After a disk failure, the data that matters most is exactly what was not backed up.
Consistency Failure
Backups run while databases are actively writing. The resulting snapshot captures a mid-transaction state that cannot be cleanly imported. The backup exists. The restore does not work. The difference only becomes apparent during an incident.
Recovery Failure
The restore procedure has never been executed on real hardware under real pressure. Documentation was written once and never validated. The first real test happens during an actual incident, with production data at risk and no time to troubleshoot.
A production backup strategy eliminates all three failure modes before they occur.
BACKUP PRINCIPLES
The 3-2-1 Rule Applied to Self-Hosted Infrastructure
The 3-2-1 rule is the minimum viable backup architecture: three copies of the data, on two different media types, with one copy stored offsite. For self-hosted infrastructure, this maps directly to a concrete implementation.
Copy 1 — Live Data
The primary copy is the production system itself — Docker volumes, configuration files, and databases running on the homeserver NVMe. This copy is always current but provides no protection against hardware failure, ransomware, or accidental deletion.
Copy 2 — Local Encrypted Repository
The second copy is an encrypted, deduplicated backup repository on a local USB drive. Updated daily via automated systemd timers, this provides fast local recovery without network dependency. The encryption ensures that physical access to the drive does not mean access to the data.
Copy 3 — Offsite Repository
The third copy is stored at a physically separate location — a second site, a trusted remote server, or an encrypted cloud target. This copy protects against physical failure, theft, and fire. Without it, a single physical incident can destroy both copies simultaneously.
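The three copies described above can be expressed as a small self-check. This is a minimal sketch, not part of BorgBackup or Borgmatic: the `Copy` type, the medium labels, and the function name are all hypothetical and exist only to make the rule concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    medium: str    # hypothetical labels, e.g. "nvme", "usb-hdd", "remote-server"
    offsite: bool  # True if stored at a physically separate location


def satisfies_3_2_1(copies: list[Copy]) -> bool:
    """Return True if the copies meet the 3-2-1 minimum."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.medium for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


copies = [
    Copy("live data", "nvme", offsite=False),
    Copy("local borg repository", "usb-hdd", offsite=False),
    Copy("offsite borg repository", "remote-server", offsite=True),
]
print(satisfies_3_2_1(copies))
```

Dropping the offsite entry from the list makes the check fail, which is the situation the third copy exists to prevent.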
Why BorgBackup
BorgBackup is the right tool for this architecture. It provides content-addressable deduplication, LZ4/ZSTD compression, and AES-256 encryption in a single tool. A 100GB dataset that changes 1% daily adds approximately 1GB per archive after the initial snapshot — making 12-month retention practical on modest hardware.
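The growth figure above is simple arithmetic, sketched here as a rough upper bound. The function name is hypothetical, and the model assumes consecutive daily archives, ignores compression, and treats every day's changed data as entirely new chunks.

```python
def estimate_repo_size_gb(dataset_gb: float, daily_change: float, archives: int) -> float:
    """Upper-bound estimate: one full copy plus the changed data of each later archive.

    Ignores compression and assumes each archive after the first adds
    dataset_gb * daily_change of new, non-deduplicable data.
    """
    if archives < 1:
        return 0.0
    return dataset_gb + dataset_gb * daily_change * (archives - 1)


# 100 GB dataset, 1% daily change, 14 consecutive daily archives
print(round(estimate_repo_size_gb(100, 0.01, 14), 1))
```

Under these assumptions, 14 daily archives need roughly 113 GB rather than the 1,400 GB that 14 full copies would take.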
Backup Architecture Overview
Complete data flow from live services through pre-backup hooks to encrypted repository storage and automated verification.
Daily automated backup pipeline with database-consistent snapshots
Encrypted at Rest
All backup data is AES-256 encrypted before being written to storage. The encryption key never leaves the server in plaintext — physical access to the drive provides no access to the data.
Deduplicated
BorgBackup splits data into content-addressable chunks. Only changed chunks are stored across archives — daily backups of large datasets consume a fraction of the raw data size.
Database-Consistent
Pre-backup hooks capture clean, importable database dumps before BorgBackup runs. Every archive contains databases in a state that can be restored without repair or recovery procedures.
Retention-Managed
Automated pruning removes archives outside the configured retention window after every backup run. Old data is cleaned up automatically — the repository never grows unbounded.
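The deduplication property listed above can be illustrated with a toy chunk store. This is a simplified sketch, not Borg's implementation: it uses fixed-size chunks and plain SHA-256 identifiers, whereas Borg uses content-defined chunking and keyed hashes. The `ChunkStore` class and its methods are hypothetical.

```python
import hashlib


def chunks(data: bytes, size: int = 4):
    """Split data into fixed-size chunks (a simplification for illustration)."""
    for i in range(0, len(data), size):
        yield data[i:i + size]


class ChunkStore:
    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}

    def add_archive(self, data: bytes) -> tuple[list[str], int]:
        """Store data; return the archive's chunk IDs and how many chunks were new."""
        ids, new = [], 0
        for chunk in chunks(data):
            cid = hashlib.sha256(chunk).hexdigest()
            if cid not in self.blobs:
                self.blobs[cid] = chunk
                new += 1
            ids.append(cid)
        return ids, new

    def restore(self, ids: list[str]) -> bytes:
        """Reassemble an archive from its chunk ID list."""
        return b"".join(self.blobs[i] for i in ids)


store = ChunkStore()
day1 = b"AAAABBBBCCCCDDDD"
day2 = b"AAAABBBBXXXXDDDD"  # one 4-byte chunk differs from day1
ids1, new1 = store.add_archive(day1)
ids2, new2 = store.add_archive(day2)
print(new1, new2)
```

The second archive stores only the one changed chunk, yet both archives restore in full because each keeps its own list of chunk IDs.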
BACKUP SCOPE
Defining What Gets Backed Up
The backup scope must be explicitly defined — never assumed. For a Docker-based homeserver, five categories cover everything that matters.
Service Data
Docker volumes mounted under the services directory contain all persistent application data — databases, configuration, uploaded files, and application state. The entire services directory belongs in the backup path. No exceptions.
System Configuration
/etc contains network configuration, systemd units, cron jobs, and every system-level customization that would need manual reconstruction after a reinstall. Losing /etc means rebuilding from memory. It costs nothing to include it.
User Data
/home and /root contain dotfiles, SSH keys, shell configurations, and operational scripts maintained outside the services directory. These are small in size and high in replacement cost.
Database Staging
A dedicated staging directory receives pre-backup database dumps before BorgBackup runs. These dumps capture databases in a consistent, importable state — not as raw data files that may be mid-transaction. The staging directory is included in the BorgBackup path.
Secrets and Keys
Encryption keys, API credentials, and backup passphrases require special handling. They must be stored outside the primary backup repository — because they are what you need to access the backup when everything else is gone. A passphrase stored only inside the encrypted repository is permanently inaccessible if the repository is the only thing that survived.
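Scope failures are easier to catch when the intended scope is written down and checked against the configured source directories. The sketch below assumes hypothetical paths (`/srv/services` for the services directory and `/var/backups/db-staging` for the staging directory); neither is prescribed by this guide, and the function name is illustrative.

```python
from pathlib import PurePosixPath


def uncovered(required: list[str], sources: list[str]) -> list[str]:
    """Return required paths that are not inside any configured source directory."""
    roots = [PurePosixPath(s) for s in sources]
    missing = []
    for path in required:
        p = PurePosixPath(path)
        if not any(p == r or r in p.parents for r in roots):
            missing.append(path)
    return missing


# Hypothetical paths; substitute the real services and staging directories.
sources = ["/srv/services", "/etc", "/home", "/root"]
required = [
    "/srv/services/nextcloud",
    "/etc/systemd",
    "/home",
    "/root/.ssh",
    "/var/backups/db-staging",
]
print(uncovered(required, sources))
```

Here the staging directory is reported as uncovered, which is exactly the kind of gap that otherwise surfaces only during a restore.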
DATABASE HOOKS
Database-Consistent Backups with Pre-Backup Hooks
Running BorgBackup against live database files produces backups that cannot be reliably restored. Database engines coordinate writes across multiple files — a snapshot taken mid-write captures an inconsistent state. The fix is pre-backup hooks that produce clean, importable dumps before BorgBackup runs.
MariaDB
The dump command uses --single-transaction to acquire a consistent read snapshot without locking tables (this guarantee applies to transactional engines such as InnoDB, not to MyISAM tables), combined with --quick to stream rows rather than buffering the entire dataset in memory. Output is gzip-compressed and written to the staging directory with a datestamped filename.
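A pre-backup hook along these lines could look as follows. This is a sketch under assumptions: the client binary is taken to be `mysqldump` (newer MariaDB releases also ship it as `mariadb-dump`), credentials are assumed to come from a client option file, `--all-databases` is one possible scope choice, and the filename pattern and function names are hypothetical.

```python
import gzip
import shutil
import subprocess
from datetime import date
from pathlib import Path


def dump_command() -> list[str]:
    # --single-transaction: consistent snapshot for transactional tables.
    # --quick: stream rows instead of buffering whole tables in memory.
    return ["mysqldump", "--single-transaction", "--quick", "--all-databases"]


def dump_path(staging: Path, day: date) -> Path:
    """Datestamped target file inside the staging directory (hypothetical naming)."""
    return staging / f"mariadb-{day.isoformat()}.sql.gz"


def run_dump(staging: Path) -> Path:
    """Stream the dump through gzip into staging; remove the file if the dump fails."""
    target = dump_path(staging, date.today())
    proc = subprocess.Popen(dump_command(), stdout=subprocess.PIPE)
    with gzip.open(target, "wb") as out:
        shutil.copyfileobj(proc.stdout, out)
    if proc.wait() != 0:
        target.unlink(missing_ok=True)
        raise RuntimeError("mysqldump failed; partial dump removed")
    return target
```

Removing the partial file on failure matters: a truncated dump that lands in the archive looks like a backup but cannot be imported.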
InfluxDB
InfluxDB's native backup command (influxd backup -portable on 1.x, influx backup on 2.x) captures the complete database state; on 1.x this includes retention policies and continuous queries. The output is a portable archive that restores cleanly to a compatible InfluxDB instance without manual schema reconstruction.
SQLite-Based Services
For services backed by SQLite, Docker volumes are included directly in the BorgBackup path. WAL mode keeps the database consistent for readers that go through SQLite itself, but a file-level copy taken during a write can capture the main database file and its -wal file at different points in time, so direct file backup alone is not guaranteed to restore cleanly. The reliable approach is a SQLite online backup via the .backup command, which writes a clean copy to staging before BorgBackup runs.
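The online backup can also be driven from Python's standard library, whose `sqlite3.Connection.backup()` method uses the same backup API as the shell's .backup command. The sketch below assumes the source and target paths are supplied by the hook; the function name is hypothetical.

```python
import sqlite3
from pathlib import Path


def sqlite_online_backup(source: Path, target: Path) -> None:
    """Copy a live SQLite database to target using the online backup API."""
    src = sqlite3.connect(source)
    dst = sqlite3.connect(target)
    try:
        src.backup(dst)  # consistent copy even while other connections write
    finally:
        dst.close()
        src.close()
```

The copy in the staging directory is a self-contained database file, so BorgBackup can archive it without any coordination with the running service.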
Services Requiring Maintenance Mode
Some applications — notably Nextcloud — must be placed into maintenance mode before backup to prevent file state from changing during the backup window. The pre-backup hook activates maintenance mode, BorgBackup runs, and a post-backup hook deactivates it. Skipping this step risks backing up a database and file store that are out of sync.
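The pre-hook and post-hook pairing can be sketched as a context manager that always switches maintenance mode off again, even if the backup step raises. The container name `nextcloud` and the `www-data` user are assumptions about the deployment; `maintenance:mode --on` and `--off` are Nextcloud's occ subcommands.

```python
import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator

# Hypothetical container name and user; adjust to the actual deployment.
OCC = ["docker", "exec", "-u", "www-data", "nextcloud", "php", "occ"]


@contextmanager
def maintenance_mode(run: Callable[[list[str]], None]) -> Iterator[None]:
    """Enable maintenance mode and always disable it afterwards."""
    run(OCC + ["maintenance:mode", "--on"])
    try:
        yield
    finally:
        run(OCC + ["maintenance:mode", "--off"])


def run_checked(cmd: list[str]) -> None:
    """Run a command and raise if it exits non-zero."""
    subprocess.run(cmd, check=True)


# Usage: wrap the backup step, e.g.
#   with maintenance_mode(run_checked):
#       run_checked(["borgmatic", "create"])
```

Passing the runner in as a parameter keeps the ordering logic testable without Docker or Nextcloud present.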
Backup Security Architecture
Four security layers protecting backup integrity and preventing unauthorized access to archived data.
- All data encrypted before write to disk; passphrase stored separately from backup media; key export stored in secure offsite location; no plaintext data ever written to backup storage
- Weekly borg check validates repository consistency; archive checksums verified against stored manifests; corruption detected before it becomes a recovery problem; automated alerts on verification failure
- Backup scripts run as root with restricted PATH; passphrase file readable only by root; repository directory not accessible to service users; mount point restricted to privileged access
- Passphrase stored in encrypted self-hosted password manager; physical paper key stored offsite for emergency access; Borg key export maintained separately from repository; recovery procedure documented and tested quarterly
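The access-control items above can be verified automatically before each run. The following sketch checks only the permission bits of the passphrase file; the function name is hypothetical, and an ownership check (owner is root) could be added in the same way.

```python
import os
import stat
from pathlib import Path


def passphrase_file_is_private(path: Path) -> bool:
    """True if the file grants no permissions at all to group or others."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

A backup script might refuse to start when this returns False, so a loosened permission is noticed on the next run rather than during an audit.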
Backup Performance Characteristics
Typical metrics for a production self-hosted infrastructure backup using BorgBackup with deduplication and LZ4 compression on ARM64 hardware.
RETENTION POLICY
Pruning, Compacting, and Managing Repository Growth
Without a retention policy, a backup repository grows indefinitely. Eventually it fills the available storage and the next backup fails silently. The data you needed most is gone because the disk was full when the incident happened.
keep_daily / keep_weekly / keep_monthly
Borgmatic configures retention through its keep_* options, of which three matter most here: keep_daily, keep_weekly, and keep_monthly. The recommended configuration for a self-hosted homeserver: 14 daily archives, 8 weekly archives, 12 monthly archives. This covers granular recovery for recent incidents and long-term recovery for corruption or accidental deletion discovered weeks later.
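To see what this retention window actually keeps, the selection can be simulated. This is a simplified model of the pruning rules as the Borg documentation describes them (the newest archive of each period is kept, and archives kept by an earlier rule do not count toward later rules); it is not Borg's code, and the names are hypothetical.

```python
from datetime import date, timedelta

# Each rule maps an archive date to the period it belongs to.
RULES = [
    ("daily", lambda d: (d.year, d.month, d.day)),
    ("weekly", lambda d: tuple(d.isocalendar())[:2]),  # ISO year and week
    ("monthly", lambda d: (d.year, d.month)),
]


def select_kept(archives: list[date], keep: dict[str, int]) -> set[date]:
    """Pick archives to keep, applying the rules from daily to monthly."""
    kept: set[date] = set()
    newest_first = sorted(archives, reverse=True)
    for rule, period_of in RULES:
        limit, chosen, last_period = keep.get(rule, 0), 0, None
        for day in newest_first:
            if chosen >= limit:
                break
            period = period_of(day)
            if period != last_period:
                last_period = period
                if day not in kept:  # already kept by an earlier rule: not counted
                    kept.add(day)
                    chosen += 1
    return kept


today = date(2024, 6, 30)
archives = [today - timedelta(days=n) for n in range(600)]  # one archive per day
kept = select_kept(archives, {"daily": 14, "weekly": 8, "monthly": 12})
print(len(kept), min(kept))
```

With one archive per day, this model keeps 34 archives, and the oldest one is a full year old, which is the recovery horizon the 14/8/12 configuration is meant to provide.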
borg prune
After every backup run, borg prune removes archives outside the retention window. Since Borg 1.2, pruning does not free disk space by itself; it only marks the data as eligible for removal.
borg compact
borg compact actually reclaims the freed space from the repository's segment files. It is a separate operation that is frequently omitted, causing repositories to grow on disk despite active pruning. Both operations belong in every automated backup run.
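The order of operations in one automated run can be sketched as a list of commands. This assumes Borg 1.2 or later command syntax, a hypothetical repository path and archive name supplied by the caller, and the retention values recommended in this guide; Borgmatic performs the equivalent steps from its own configuration.

```python
import subprocess


def backup_cycle(repo: str, archive: str, paths: list[str]) -> list[list[str]]:
    """Return the borg commands for one run, in execution order."""
    return [
        ["borg", "create", f"{repo}::{archive}", *paths],
        ["borg", "prune", "--keep-daily", "14", "--keep-weekly", "8",
         "--keep-monthly", "12", repo],
        ["borg", "compact", repo],  # reclaims the space that prune released
    ]


def run_cycle(repo: str, archive: str, paths: list[str]) -> None:
    """Execute the cycle, stopping at the first failing command."""
    for cmd in backup_cycle(repo, archive, paths):
        subprocess.run(cmd, check=True)
```

Because `check=True` stops at the first failure, a failed backup never proceeds to pruning, so older archives are not removed when no new one was created.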
Deduplication and Retention Interaction
BorgBackup's deduplication means that data shared between archives is only removed when the last archive referencing that data is pruned. A file deleted six months ago remains recoverable for as long as an archive from that period falls within the retention window. This makes the retention window the true recovery horizon — not the date of the last backup.
DISASTER RECOVERY
Tested Recovery: The Only Metric That Matters
A backup strategy is only as reliable as its last tested recovery. Not its last successful backup run. Its last tested recovery — executed against real hardware, under real conditions, with the actual passphrase from offsite storage.
Hardware Preparation
Recovery begins before the incident. Prepare a replacement system running the same OS, with BorgBackup installed and the backup media connected. The passphrase and key export must come from their offsite location, not from the failed system.
Archive Selection
borg list displays all available archives with timestamps. Select the most recent clean archive, or a specific point-in-time archive for targeted recovery. Individual paths can be extracted without restoring the full archive — enabling recovery of a single service without touching the rest of the system.
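Point-in-time selection can be scripted against the machine-readable listing. The sketch below assumes the JSON printed by borg list --json contains an `archives` array whose entries carry `name` and `start` fields in ISO format; verify this against the installed Borg version before relying on it. The function name is hypothetical.

```python
import json
from datetime import datetime
from typing import Optional


def latest_archive_before(listing_json: str, cutoff: datetime) -> Optional[str]:
    """Return the name of the newest archive started at or before cutoff."""
    archives = json.loads(listing_json)["archives"]
    candidates = [
        (datetime.fromisoformat(a["start"]), a["name"])
        for a in archives
        if datetime.fromisoformat(a["start"]) <= cutoff
    ]
    return max(candidates)[1] if candidates else None
```

The returned name can then be passed to borg extract together with the individual paths to restore, leaving the rest of the system untouched.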
Restore Order
Service restoration follows dependency order: container runtime first, then infrastructure services (reverse proxy, DNS), then data services (databases), then application services. Database dumps are imported before application containers start — applications that initialize against an empty database can corrupt their own schema.
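Dependency order can be derived rather than remembered. The sketch below uses Python's standard `graphlib.TopologicalSorter`; the service names and their dependencies are hypothetical examples, with the database import modeled as its own step so that it precedes the application.

```python
from graphlib import TopologicalSorter

# Hypothetical services; each entry maps a step to what must be done first.
depends_on = {
    "container-runtime": set(),
    "reverse-proxy": {"container-runtime"},
    "dns": {"container-runtime"},
    "mariadb": {"container-runtime"},
    "mariadb-import": {"mariadb"},
    "nextcloud": {"mariadb-import", "reverse-proxy", "dns"},
}

restore_order = list(TopologicalSorter(depends_on).static_order())
print(restore_order)
```

The order among independent services may vary, but every step is guaranteed to appear after everything it depends on, and a circular dependency raises an error instead of producing a misleading plan.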
Quarterly Recovery Drills
The recovery process must be tested on a quarterly schedule. Not simulated: executed. Real archives, real hardware or an isolated VM, full service validation. The drill validates that the restore command succeeds and that the restored services actually work. Recovery time should be recorded against your acceptable RTO. If it exceeds the RTO, the architecture needs revision before an incident forces that discovery.
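Recording drill times does not need special tooling. The following sketch times each recovery step and compares the total against the RTO; the step names and callables are placeholders for whatever the real drill executes, and the function name is hypothetical.

```python
import time
from typing import Callable


def timed_drill(steps: dict[str, Callable[[], None]], rto_seconds: float) -> dict:
    """Run each recovery step in order, record durations, compare total to the RTO."""
    durations = {}
    for name, step in steps.items():
        started = time.monotonic()
        step()
        durations[name] = time.monotonic() - started
    total = sum(durations.values())
    return {"durations": durations, "total": total, "within_rto": total <= rto_seconds}
```

Keeping the per-step durations from each quarter makes it visible which stage of the restore is growing as the dataset grows.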
Production Backup Implementation Checklist
Complete this checklist before considering your backup strategy production-ready.
Never Store Your Encryption Key With Your Backup
The BorgBackup repokey and passphrase are the only way to access your encrypted repository. If they are stored only on the same hardware as the backup — or only in a password manager with no physical offsite copy — a single point of failure makes your entire archive permanently inaccessible. Export the key, print a paper copy, store it somewhere physically separate from your server. Do this before you need it.