
A Ransomware Resistant Backup Strategy With Borg

Setting The Stage

I have recently been moving all of my personal digital materials away from the big cloud providers for a number of reasons:

  • Cost - My footprint at Hetzner is only €70/mo and provides me with about 7TB of disk on the server and 5TB in a "Storage Box"
  • Privacy - I just don't want Google/Apple/etc... training their AI models or optimizing their Ad networks on my data
  • Security - I can carefully manage how to secure my data, perform updates, handle encryption (at-rest and in-flight), and much more

There are some negative impacts of doing this though:

  • Complexity - It requires someone knowledgeable to install, configure, and maintain everything

One aspect I have been working on for a few weeks is how to perform backups of my relatively large data store. My Nextcloud data alone is over 1TB. How do you back up that much data securely (guarding against malware, bit rot, and ransomware) and cost-effectively?

One place I have seen traditional IT shops run into trouble is when a malware/ransomware incident occurs and the malware has access to the on-site backups, allowing it to damage or destroy those backups as well. Moreover, for automation purposes, it's convenient to keep the credentials for the off-site backup store on the systems where they are used, which gives the malware an opportunity to damage or destroy even the off-site backups. I wanted to make sure I avoided all of those potential issues.

Choosing A Backup Application

I put together some criteria for what I would want in a backup solution:

  1. Works well on Linux
  2. Scriptable
  3. Performs deduplication of data
  4. Compresses data
  5. Can send backups over the network

Based on these criteria, I experimented with several tools and decided on borgbackup for the following reasons:

  1. Deduplication
  2. Multiple compression algorithms supported
  3. Can work over SSH using SSH keys (even from the SSH Agent, which will be important later)
  4. Low overhead
  5. Fast

The deduplication and compression allowed me to back up my data with a 40% reduction in storage versus what is on disk on the server.

Backup Requirements

The classic 3-2-1 rule for backups is a good basis for anyone configuring backups for their personal data and for enterprises. The short explanation of the 3-2-1 rule is this:

  • Have AT LEAST 3 copies of your data (your live data and two other copies)
  • Use AT LEAST 2 media types for your backups (disk, cloud, tape, etc...)
  • Store AT LEAST 1 copy of your data off-site

To give you an idea, here's how I satisfied the 3-2-1 rule for my setup:

  • 3 copies - One on the Hetzner server, one on the Hetzner storagebox cloud storage, and one at my house on a RAID5 storage enclosure
  • 2 Media types - Disk and Cloud storage
  • 1 Offsite - The copy at my house on my disk enclosure (a QNAP TR-004, if you are curious) is off-site relative to the server

Borg Basics

I won't go into detail on how to use Borg. Their documentation is decent and more comprehensive than anything I could share here. What I will discuss is how I am using it. First, when I do my daily backups, I use borg to send them to my Hetzner storagebox over SSH. An example of the command is below:

```bash
export BORG_PASSPHRASE=<REDACTED>; borg create --compression=zlib --exclude-from=/root/exclude_list.txt ssh://storagebox/home/borg::$(date +%A) /
```

I execute this command and it uses a combination of my ~/.ssh/config and my ssh-agent to automatically authenticate my server to the storagebox. Now, if I were to store my SSH keys on the server, that would make it so that anyone compromising my server would immediately have access to my storagebox (and perhaps other) data. So, I do not store my SSH keys on the server. Instead, I initiate backups from my media server at home and forward my SSH keys using the ssh-agent. This ensures that the SSH keys are never stored locally on my server and are only ever accessible by the active session while it's connected.
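The flow above can be sketched as a single invocation from the trusted machine at home. This is a minimal sketch: `server` is assumed to be a `~/.ssh/config` alias for the Hetzner server, and the paths match the command shown earlier:

```bash
# From the trusted machine at home: forward the agent for this session only (-A).
# The private key stays on the home machine; the server only ever sees the
# agent socket, and only while the session is open.
ssh -A root@server 'export BORG_PASSPHRASE=<REDACTED>; \
  borg create --compression=zlib --exclude-from=/root/exclude_list.txt \
  ssh://storagebox/home/borg::$(date +%A) /'
```

Because the remote command is single-quoted, `$(date +%A)` expands on the server, naming the archive after the current weekday.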

By default, borg creates deduplicated, incremental backups: after the initial full backup, only new or changed data chunks are stored, so subsequent backups generally take very little time and very little space.

Daily Backups

From my home media box (A mini-pc with an external USB-C attached RAID array), I schedule a job once per day to open an SSH connection to the server with SSH Agent forwarding and then run the borg backup. Borg transparently uses my SSH Agent keys to send the backups to the storagebox in the cloud, and when done the SSH session and the corresponding agent are terminated and do not leave any secrets on the server.
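For scheduling, a simple cron entry on the media box is enough. The schedule, user, and script path below are hypothetical; the wrapper script itself (which starts and tears down its own ssh-agent) is shown later in this post:

```
# Hypothetical /etc/cron.d/borg-backup on the home media box:
# run the daily backup at 02:30; the wrapper script starts its own ssh-agent,
# adds the key, runs borg over the forwarded connection, then kills the agent.
30 2 * * * dphillips /home/dphillips/bin/backup.sh
```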

Weekly Off-Site Backups

Additionally, my home media box will SSH on Saturdays to the server and initiate a borg backup to my house. This is also done using SSH and SSH Agent keys, but in this case I also use SSH Remote port forwarding to allow borg to connect back to my media box like this:

```bash
ssh -R 2022:127.0.0.1:22 -A -t hetznerroot << 'EOF'  # Connect to the server and set up a port forward from 2022 on the server back to 22 on my media box
TIMESTAMP=$(date +%y-%m-%dT%H-%M-%S)                 # Quoted heredoc delimiter: define the timestamp on the server so ${TIMESTAMP} and ${I} expand remotely, not on the media box
for I in nextcloud onedev keycloak                   # For each of my PostgreSQL-backed services, dump the database to a SQL file
do
  rm -f /opt/${I}/db-${I}-*.sql
  podman exec -it ${I}-postgresql bash -c "PGPASSWORD=${I} pg_dumpall -h 127.0.0.1 -f /tmp/db-${I}-${TIMESTAMP}.sql --username=${I} -w"
  podman cp --overwrite ${I}-postgresql:/tmp/db-${I}-${TIMESTAMP}.sql /opt/${I}/db-${I}-${TIMESTAMP}.sql
  podman exec -it ${I}-postgresql bash -c "rm -f /tmp/*.sql"
done
export BORG_PASSPHRASE='<REDACTED>'                  # Export the Borg passphrase as an environment variable

# Have borg backup over the port-forwarded SSH session to my media box
borg create --exclude-from=/root/exclude_list.txt ssh://dphillips@127.0.0.1:2022/Media/borg::$(date +%A) /
exit
EOF
```

Again, all of this uses SSH keys provided by the SSH Agent, and the credentials are never permanently stored on the server. Be aware, though, that the SSH Agent creates a UNIX socket (e.g. /tmp/ssh-XXXXO18shl/agent.2419171) which could be accessed if someone has compromised the server.
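To make that exposure concrete, here is what it looks like on the server during a forwarded session (the socket path and host alias are examples):

```bash
# During a forwarded session, the agent is only reachable through a socket:
echo "${SSH_AUTH_SOCK}"     # e.g. /tmp/ssh-XXXXO18shl/agent.2419171

# Root (or your own UID) on a compromised server could borrow that socket to
# authenticate with your keys -- but only while the session is connected:
SSH_AUTH_SOCK=/tmp/ssh-XXXXO18shl/agent.2419171 ssh storagebox
```

This is why the window of exposure matters: once the backup session ends, the socket disappears, and with it any access to the keys.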

Preparing Your SSH Keys

First, I highly recommend using different SSH keys for authenticating to the server and authenticating to the cloud storage. This will just add more complexity for anyone trying to compromise your server or your data. On the host where you plan to do the off-site backup storage, create 2 SSH keys for use in this scenario:

```bash
ssh-keygen -f ~/.ssh/server_root -t ed25519 -N ''
ssh-keygen -f ~/.ssh/cloud_storage -t ed25519 -N ''
```

Next, create the SSH config for the user that the scheduled backup will run as:

```
Host server
    Hostname server.yourdomain.tld
    User root
    IdentityFile ~/.ssh/server_root
    ForwardAgent yes
Host storage
    Hostname storage.yourdomain.tld
    User <REDACTED>
    ProxyJump server
    IdentityFile ~/.ssh/cloud_storage
```

Confirm that your SSH Agent is running:

```
ssh-add -l
The agent has no identities.
```

If you get a result like `Error connecting to agent: Connection refused`, you can start the SSH Agent with the following command:

```bash
eval $(ssh-agent)     # Start the SSH agent and set appropriate environment variables to use it in the current shell session
```

Add your SSH keys to the agent:

```bash
ssh-add ~/.ssh/server_root
ssh-add ~/.ssh/cloud_storage
```

Configure the server and the storage to accept your keys (You will probably need to enter your passwords this first time to authenticate):

```bash
ssh-copy-id -i ~/.ssh/server_root server
ssh-copy-id -i ~/.ssh/cloud_storage storage
```

Verify that your SSH keys and config work by trying to log on to each:

```bash
ssh server      ## You should be logged on automatically without a password prompt to the server

ssh storage     ## You should be logged on automatically without a password prompt to the storagebox
```

On the server, configure the ~/.ssh/config for the storagebox. The example below is for my Hetzner storagebox, but your configuration may differ:

```
Host storagebox
    User <REDACTED>
    Hostname <REDACTED>.your-storagebox.de
    Port 23
```

Notice that I did not need or want to configure an SSH IdentityFile for the storagebox on the server. This is handled automatically by the SSH Agent when I connect from my media box at home.

Protecting SSH

If you have ever had a server hosted with a public IP address on the Internet, you have probably noticed the neverending attempts to brute-force your SSH server from a myriad of IP addresses. To mitigate this, I have set up my SSH service to only be accessible when you connect via VPN to the server. I will detail how I configured OpenVPN in another post. Since I am running a Red Hat derivative (CentOS), I used firewalld to add the VPN interface to the trusted zone and only allow access to the SSH service from that zone.
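As a sketch, assuming the VPN comes up as tun0, the firewalld side looks roughly like this:

```bash
# Trust the VPN interface (the trusted zone accepts all traffic by default)
firewall-cmd --permanent --zone=trusted --change-interface=tun0
# Stop exposing SSH in the internet-facing zone
firewall-cmd --permanent --zone=public --remove-service=ssh
# Apply the permanent configuration
firewall-cmd --reload
```

With that in place, SSH is reachable only through the VPN tunnel, so the public-facing port scans and brute-force attempts never reach the SSH daemon.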

Scripting The Backups

The final step is to create a script which can execute the backups on a schedule. I built an Ansible playbook to do this. It is a relatively short playbook, and I added some more security to my process while building it.

Create an inventory file

```yaml
all:
  hosts:
    cloud:
      ansible_host: hetznerroot
```

Pretty simple. The one thing which may not be obvious is that neither cloud nor its ansible_host value is a valid FQDN. Instead, ansible_host references a Host entry in my ~/.ssh/config file. That configuration handles things like port forwarding, agent forwarding, etc...
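For illustration, the hetznerroot entry might look like this in ~/.ssh/config (the hostname and forwarded port are examples); the RemoteForward line sets up the same reverse tunnel used for the weekly off-site backup:

```
Host hetznerroot
    Hostname server.yourdomain.tld
    User root
    IdentityFile ~/.ssh/server_root
    ForwardAgent yes
    RemoteForward 2022 127.0.0.1:22
```

Keeping these details in the SSH config means the inventory and playbook stay free of connection plumbing.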

Create the playbook

```yaml
---
- name: Back Up Hetzner Cloud Server
  hosts: cloud
  gather_facts: false

  pre_tasks:
    - name: Verify timestamp value is set
      ansible.builtin.assert:
        that:
          - timestamp is defined
        fail_msg: "The timestamp value MUST be set"
        quiet: true
      delegate_to: localhost

    - name: Verify borg passphrase value is set
      ansible.builtin.assert:
        that:
          - borg_passphrase is defined
        fail_msg: "The borg_passphrase value MUST be set"
        quiet: true
      delegate_to: localhost

    - name: Verify borg host value is set
      ansible.builtin.assert:
        that:
          - borg_host is defined
        fail_msg: "The borg_host value MUST be set"
        quiet: true
      delegate_to: localhost

    - name: Connect to Hetzner VPN
      community.general.nmcli:
        state: up
        conn_name: hetzner
      register: vpn_up
      failed_when: vpn_up.state != 'up'
      delegate_to: localhost

  post_tasks:
    - name: Disconnect from Hetzner VPN
      community.general.nmcli:
        state: down
        conn_name: hetzner
      delegate_to: localhost

  tasks:
    - name: Dump PostgreSQL databases
      containers.podman.podman_container_exec:
        name: "{{ item.name }}-postgresql"
        command: >
          bash -c '
          export PGPASSWORD={{ item.name }};
          pg_dumpall -h 127.0.0.1 -f /tmp/db-{{ item.name }}-{{ timestamp }}.sql --username={{ item.name }} -w'
      loop:
        - { name: "onedev" }
        - { name: "keycloak" }

    - name: Copy DB script to host
      containers.podman.podman_container_copy:
        container: "{{ item.name }}-postgresql"
        src: /tmp/db-{{ item.name }}-{{ timestamp }}.sql
        dest: /opt/{{ item.name }}/db-{{ item.name }}-{{ timestamp }}.sql
        from_container: true
      loop:
        - { name: "onedev" }
        - { name: "keycloak" }

    - name: Delete PostgreSQL dump from inside of the container
      containers.podman.podman_container_exec:
        name: "{{ item.name }}-postgresql"
        command: >
          rm -f /tmp/db-{{ item.name }}-{{ timestamp }}.sql
      loop:
        - { name: "onedev" }
        - { name: "keycloak" }

    - name: Perform Borg Backup
      ansible.builtin.shell:
        cmd: >
          export BORG_PASSPHRASE="{{ borg_passphrase }}";
          borg create --debug --progress --exclude-from=/root/exclude_list.txt {{ borg_host }}::{{ timestamp }} /
      changed_when: true
```

And I run this from a script which sets up my ssh-agent as follows:

```bash
#!/bin/bash

set -x

eval $(ssh-agent)

export WEEKDAY=$(date +%A)
export TIMESTAMP=$(date +%y-%m-%dT%H-%M-%S)
export BORG_HOST="ssh://storagebox/home/borg"  ## Local cloud backup

if [[ "${WEEKDAY}" == "Saturday" ]]; then
  export BORG_HOST="ssh://media/Media/borg"    ## Offsite backup
fi

ssh-add ~/.ssh/december_2026

cd /path/to/playbook_and_inventory
ansible-playbook -i inventory.yaml -e borg_passphrase="$1" -e borg_host=${BORG_HOST} -e timestamp=${TIMESTAMP} backup.yaml

kill -QUIT ${SSH_AGENT_PID}

unset SSH_AGENT_PID
unset SSH_AUTH_SOCK
```

Results

As of right now, this is working well and my nightly (and weekly) backups are VERY fast. On a typical day the backup takes less than 5 minutes.

Excluding Files/Directories

You probably do not want to back up EVERYTHING on your server. There are lots of files in cache directories or elsewhere that you just do not need to keep if you have a catastrophic failure. The observant among you may have noticed the --exclude-from=/root/exclude_list.txt in my borg create commands. That exclude_list.txt file contains directories/filenames, one per line, which will be excluded from your backup. Keep in mind that these are recursive, so if you exclude /home you will not back up /home/user either. Be sure that you want to exclude the whole directory and everything it contains.
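As a starting point, a hypothetical exclude_list.txt for a full-system backup might look like this (borg ignores blank lines and lines starting with #):

```
# Virtual filesystems and transient data
/proc
/sys
/dev
/run
/tmp
/var/tmp
# Caches are cheap to rebuild and expensive to back up
/var/cache
/home/*/.cache
```

Your list will differ depending on what runs on your server; the point is to keep churn-heavy, reconstructable data out of the deduplication store.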

Final Thoughts

Is my solution the best solution? Almost certainly not. Is it the best solution for you? Probably not... That said, it is ONE way you could approach the problem and perhaps it will provide you with inspiration for how you can handle your backups. It works well for me, it's Open Source, and it meets all of the design criteria I needed for a backup solution. Your needs probably differ from mine.

Want to complain about how bad my solution is? Follow me on Mastodon and send me all of the messages telling me how badly I screwed this up and perhaps I can learn a few new tricks along the way! 😛
