Backup of Btrfs subvolumes with Btrbk

Updated 20 May 2022

Introduction

Btrbk is a backup tool for Btrfs subvolumes. It uses Btrfs-specific capabilities to create atomic snapshots and transfer them to a target backup storage. The source and target locations are specified in the configuration file, which makes it easy to define both simple backup setups, such as "backup to USB HDD", and complex ones, such as "server receives backups from multiple hosts via SSH, with different storage policies".

Definitions
  • A snapshot is a point-in-time copy of a Btrfs subvolume, located in the same file system as the subvolume itself.
  • A backup is a copy of a Btrfs subvolume made from a snapshot. It can be stored either on the same file system or on another drive or host.
  • A primary backup is a full copy of a subvolume snapshot, containing all the data of the original.
    Primary backups in Btrfs are created with the send/receive approach: btrfs send produces a data stream, which btrfs receive turns into a subvolume holding the data.
  • An incremental backup is a copy that contains only the changes relative to the primary backup or to another incremental backup.
    To transfer an incremental backup, btrfs send must be told which two snapshots to compute the changes between (the base one and the final one). The backup storage must already contain the data of the base snapshot.
  • An archive is an additional copy of backups or snapshots. It can be created either from a backup or from a snapshot. Unlike regular snapshot and backup storage, it does not require a common base snapshot; without one, the subvolume is transferred in full.
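
The send/receive mechanics described above can be sketched with plain btrfs commands. This only illustrates what Btrbk automates and is not how Btrbk itself is implemented; the function names and all paths are hypothetical, and the snapshots passed to btrfs send are assumed to be read-only.

```shell
# Full (primary) backup: send the whole read-only snapshot to the target.
# $1 = snapshot path, $2 = target directory on a Btrfs file system
full_backup() {
    btrfs send "$1" | btrfs receive "$2"
}

# Incremental backup: send only the changes between the base snapshot ($1)
# and the new snapshot ($2). The target ($3) must already hold the base.
incremental_backup() {
    btrfs send -p "$1" "$2" | btrfs receive "$3"
}
```

A first transfer would then look like full_backup /.snapshots/rootfs.20210324T1618 /mnt/backup, and a later one like incremental_backup /.snapshots/rootfs.20210324T1618 /.snapshots/rootfs.20210325T1404 /mnt/backup.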

Btrbk handles storage organization, snapshot creation, the parameters for incremental transfers, deletion of obsolete snapshots according to the storage policy, and data transfer between hosts. All you have to do is describe the backup storage policy, create the necessary directories, and set the snapshot and backup storage locations.

Installation

Install Btrbk:

emerge -a app-backup/btrbk

Basic setup

Set up general parameters for backups:

/etc/btrbk/btrbk.conf

# snapshot directory on backed up partition 
snapshot_dir               .snapshots

# snapshot name format <name>.YYYYmmddTHHMM
timestamp_format           long

# zstd compression for backup transfer
stream_compress            zstd

# path to log
transaction_log            /var/log/btrbk.log

# stream buffer size
stream_buffer              256m

Backup policy

Edit the backup policy parameters, as shown below:

/etc/btrbk/btrbk.conf

# the daily backup is the first one after midnight
preserve_hour_of_day       0

# Monday is the first day of week
preserve_day_of_week       monday

# preserve all temporary snapshots for at least one day
snapshot_preserve_min      1d

# preserve 14 latest daily, 8 weekly, 6 monthly, 1 annual snapshots
snapshot_preserve          14d 8w 6m 1y

# do not preserve temporary snapshots
target_preserve_min        no

# preserve 6 latest daily, 4 weekly, 6 monthly, 1 annual snapshots
target_preserve            6d 4w 6m 1y

# preserve all archived snapshots
archive_preserve_min       all

Intervals
  • An hourly backup is the first backup made in a given hour. For example, if there are backups stamped 20210323T0503, 20210323T0520, and 20210323T0545, then the hourly backup is 20210323T0503, and the other two will be deleted once the minimum preservation period has passed.
  • A daily backup is the first copy from the beginning of the day (or after the hour specified in preserve_hour_of_day). The rest of the copies made on that day are considered hourly or temporary ones.
  • A weekly backup is the first daily copy of the week, counted from Monday (or from the day of the week specified in preserve_day_of_week). Thus, if there is a copy from Monday, it is the weekly copy. If there is no copy from Monday or Tuesday but there is one from Wednesday, then the Wednesday copy is the weekly copy.
  • A monthly backup is the first weekly copy of the month, which means that it falls on the weekday of the weekly copy, not necessarily on the first of the month.
  • A yearly backup is the first monthly copy of the year.
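
The "first copy in the interval" rule can be illustrated with a small shell sketch. It is a simplification and not Btrbk's actual algorithm: it only covers hourly and daily intervals, assumes the long timestamp format (YYYYmmddTHHMM) and preserve_hour_of_day 0, and the function name first_per_interval is made up for this example.

```shell
# Print the first timestamp of each interval from a list on stdin.
# $1 = number of leading characters that identify the interval:
#      11 for an hour (YYYYmmddTHH), 8 for a day (YYYYmmdd)
first_per_interval() {
    sort | awk -v n="$1" '!seen[substr($0, 1, n)]++'
}

printf '%s\n' 20210323T0105 20210323T0120 20210323T0503 20210324T0010 |
    first_per_interval 11
# prints 20210323T0105, 20210323T0503 and 20210324T0010, one per line
```

Calling first_per_interval 8 on the same list would keep only 20210323T0105 and 20210324T0010, the first copy of each day.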

The general syntax of the storage policy is as follows:

[<hourly>h] [<daily>d] [<weekly>w] [<monthly>m] [<yearly>y]

Here the hourly, daily, weekly, monthly and yearly values define how many copies are preserved at hourly, daily, weekly, monthly and yearly intervals, respectively.

Example

Example: the policy 2d 2w 2m with a minimum preservation time of 1d, as evaluated on July 30, 2021:

Date        Time   Weekday    Backup description                 Comments
2021/07/30  23:20  Friday     latest backup                      to be deleted on August 1 (when the minimum preservation time expires)
2021/07/30  21:20  Friday     temporary copy                     to be deleted on August 1 (when the minimum preservation time expires)
2021/07/30  20:20  Friday     current daily backup               preserved as one of the specified daily copies
2021/07/29  20:20  Thursday   previous day's copy                preserved as specified by 2d
2021/07/28  20:20  Wednesday  copy of the day before yesterday   preserved as specified by 2d
2021/07/26  20:20  Monday     current weekly snapshot            preserved as one of the specified weekly snapshots
2021/07/21  22:15  Wednesday  last week's snapshot               preserved as specified by 2w; taken on Wednesday because no Monday or Tuesday snapshot is available
2021/07/12  20:20  Monday     snapshot of the week before last   preserved as specified by 2w
2021/07/05  20:20  Monday     current monthly snapshot           preserved as one of the specified monthly snapshots
2021/06/07  20:20  Monday     last month's snapshot              preserved as specified by 2m
2021/05/03  20:20  Monday     snapshot of the month before last  preserved as specified by 2m

The policies for snapshots and for backups may differ. For example, snapshots may be preserved for 5 days while backups are created monthly. In this case, the snapshot corresponding to the latest backup is always preserved, so that incremental backups remain possible.

All these parameters may be specified either globally for all backups or redefined for specific Btrfs volumes or subvolumes to be backed up.
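
As a sketch of such an override, the fragment below keeps the global snapshot policy shown earlier and shortens it for one subvolume only. The volume, the target and the 7d 4w values are purely illustrative assumptions.

```
# global policy, applies to every subvolume unless overridden
snapshot_preserve_min      1d
snapshot_preserve          14d 8w 6m 1y

volume /var/calculate
  target /mnt/backup
  subvolume .
    snapshot_name          calculate
    # override for this subvolume only
    snapshot_preserve      7d 4w
```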

Local backup

The example below shows how to back up the root partition / and the data partition /var/calculate/ of the current system to a separate drive mounted at /mnt/backup/.

Path configuration

Create snapshot directories in root partition / and in /var/calculate/:

mkdir /.snapshots /var/calculate/.snapshots

Add the volume descriptions to the configuration file:

/etc/btrbk/btrbk.conf

# root partition
volume /
  # backup at /mnt/backup/
  target /mnt/backup
  # backup main partition
  subvolume .
    # partition name in snapshot rootfs
    snapshot_name rootfs

# partition /var/calculate
volume /var/calculate
  # backup at /mnt/backup/
  target /mnt/backup
  # create snapshots of main partition
  subvolume .
    # partition name in snapshot calculate
    snapshot_name calculate

Snapshots of the root partition will be created in /.snapshots/, those of the data partition in /var/calculate/.snapshots/, and the corresponding backups in /mnt/backup/.
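
Since the snapshot and target directories have to exist before the first run, a small pre-flight check along the following lines can catch a forgotten mkdir or an unmounted backup drive. The function name missing_dirs is hypothetical and not part of Btrbk.

```shell
# Print every argument that is not an existing directory;
# return non-zero if at least one is missing.
missing_dirs() {
    status=0
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "missing: $dir"
            status=1
        fi
    done
    return "$status"
}
```

It could be used as missing_dirs /.snapshots /var/calculate/.snapshots /mnt/backup && btrbk run. The check only tests for directories; it does not confirm that the backup drive is actually mounted at /mnt/backup.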

Backup

To perform the full cycle (snapshots, backups, deleting obsolete copies), run:

btrbk run
--------------------------------------------------------------------------------
Backup Summary (btrbk command line client, version 0.31.1)

    Date:   Wed Mar 24 16:18:55 2021
    Config: /etc/btrbk/btrbk.conf

Legend:
    ===  up-to-date subvolume (source snapshot)
    +++  created subvolume (source snapshot)
    ---  deleted subvolume
    ***  received subvolume (non-incremental)
    >>>  received subvolume (incremental)
--------------------------------------------------------------------------------
/.
+++ /.snapshots/rootfs.20210324T1618
*** /mnt/backup/rootfs.20210324T1618

/var/calculate/.
+++ /var/calculate/.snapshots/calculate.20210324T1618
*** /mnt/backup/calculate.20210324T1618

As you can see in the runtime log, two snapshots were created (/.snapshots/rootfs.20210324T1618/, /var/calculate/.snapshots/calculate.20210324T1618/), and then sent as a complete backup (***) to /mnt/backup/.

Re-running the cycle will only result in creating snapshots, since temporary backups are not preserved according to the storage policy. Start a dry run:

btrbk --dry-run run
--------------------------------------------------------------------------------
Backup Summary (btrbk command line client, version 0.31.1)

    Date:   Wed Mar 24 16:24:27 2021
    Config: /etc/btrbk/btrbk.conf
    Dryrun: YES

Legend:
    ===  up-to-date subvolume (source snapshot)
    +++  created subvolume (source snapshot)
    ---  deleted subvolume
    ***  received subvolume (non-incremental)
    >>>  received subvolume (incremental)
--------------------------------------------------------------------------------
/.
+++ /.snapshots/rootfs.20210324T1624

/var/calculate/.
+++ /var/calculate/.snapshots/calculate.20210324T1624

NOTE: Dryrun was active, none of the operations above were actually executed!

If you run it the next day, it will create both snapshots and backups, incrementally:

btrbk run
--------------------------------------------------------------------------------
Backup Summary (btrbk command line client, version 0.31.1)

    Date:   Thu Mar 25 14:04:36 2021
    Config: /etc/btrbk/btrbk.conf

Legend:
    ===  up-to-date subvolume (source snapshot)
    +++  created subvolume (source snapshot)
    ---  deleted subvolume
    ***  received subvolume (non-incremental)
    >>>  received subvolume (incremental)
--------------------------------------------------------------------------------
/.
+++ /.snapshots/rootfs.20210325T1404
>>> /mnt/backup/rootfs.20210325T1404

/var/calculate/.
+++ /var/calculate/.snapshots/calculate.20210325T1404
>>> /mnt/backup/calculate.20210325T1404

For creating snapshots only, run:

btrbk snapshot

To synchronize existing snapshots with the backup storage and delete obsolete copies, without creating new snapshots, run:

btrbk resume

To remove obsolete snapshots and backups only, run:

btrbk prune
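
The markers used in the run summaries earlier in this section make the output easy to check mechanically, for instance to notice an unexpected full transfer. The sketch below counts summary lines by marker; count_marker is a hypothetical helper, and the sample text is copied from the second-day output shown above.

```shell
# Count summary lines on stdin that start with the given legend marker
# followed by a path, e.g. '***' (non-incremental) or '>>>' (incremental).
count_marker() {
    awk -v m="$1" 'index($0, m " /") == 1 { n++ } END { print n + 0 }'
}

summary='/.
+++ /.snapshots/rootfs.20210325T1404
>>> /mnt/backup/rootfs.20210325T1404

/var/calculate/.
+++ /var/calculate/.snapshots/calculate.20210325T1404
>>> /mnt/backup/calculate.20210325T1404'

printf '%s\n' "$summary" | count_marker '>>>'   # prints 2
printf '%s\n' "$summary" | count_marker '***'   # prints 0
```

The indented legend lines are not counted, because they do not start with the marker.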

Backup scheduling

To set up scheduled backups, add a daily run of btrbk to cron:

/etc/cron.d/btrbk

SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
HOME=/

# run btrbk
0 0 * * *   root    /usr/bin/btrbk run &>/dev/null
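
The cron entry above discards everything Btrbk prints. If that output should be kept, for example to review warnings and the run summary, a small wrapper along these lines could be called from cron instead. This is a sketch under assumptions: the wrapper, the run_logged function and the log path /var/log/btrbk-cron.log are hypothetical and not part of Btrbk.

```shell
# Run btrbk and append its output to a log file instead of discarding it.
# BTRBK and LOG can be overridden from the environment.
BTRBK=${BTRBK:-/usr/bin/btrbk}
LOG=${LOG:-/var/log/btrbk-cron.log}

run_logged() {
    {
        # Header line with the start time and the btrbk arguments
        echo "=== $(date '+%Y-%m-%d %H:%M') btrbk $*"
        "$BTRBK" "$@"
    } >> "$LOG" 2>&1
}
```

A wrapper script would end with run_logged run; the function returns the exit status of btrbk, so a failed run can still be detected by the caller.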

Backup configuration for a remote node

Backup server configuration

On your backup server, create an SSH key to connect to the remote computer:

ssh-keygen -t rsa -b 4096 -f /root/.ssh/btrbk.key
Generating public/private rsa key pair.
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/btrbk.key
Your public key has been saved in /root/.ssh/btrbk.key.pub
The key fingerprint is:
SHA256:MhrRRsoUfCnbm/e2Sdw5sH6y+bp1nq48UVLE9sjpgxs root@backup
The key randomart image is:
+---[RSA 4096]----+
|   .o...     o.  |
|   oo+o       +  |
|    +=o      + + |
|    .o.     . = .|
|    . ooS .  =   |
|     ooo.. +E.o  |
|    .  . .+ =+.. |
|         o+=++ . |
|         .OO+++  |
+----[SHA256]-----+

Important

Do not protect the key with a passphrase, otherwise Btrbk will not be able to connect unattended.

Prepend the forced command and the restriction options to the public key line:

echo -e 'command="/var/calculate/bin/ssh_filter_btrbk.sh --source --delete --info",no-agent-forwarding,no-port-forwarding,no-pty,no-user-rc,no-X11-forwarding' $(cat /root/.ssh/btrbk.key.pub) > /root/.ssh/btrbk.key.pub
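
The one-liner above reads and overwrites the same file in a single command. A more defensive variant, sketched below, reads the key first, writes the result to a temporary file and only then replaces the original. The function name restrict_pubkey is hypothetical.

```shell
# Prepend a forced command and restriction options to a public key file.
# $1 = public key file, $2 = options string to put in front of the key
restrict_pubkey() {
    key=$(cat "$1") || return 1
    tmp=$(mktemp) || return 1
    # Write "<options> <original key line>", then replace the original file
    printf '%s %s\n' "$2" "$key" > "$tmp" && mv "$tmp" "$1"
}
```

It would be called with /root/.ssh/btrbk.key.pub as the first argument and the same command="...",no-agent-forwarding,... string as above, in single quotes, as the second.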

Specify the private key to use for SSH access:

/etc/btrbk/btrbk.conf

# Specify SSH private key for "ssh://" volumes / targets:
ssh_identity               /root/.ssh/btrbk.key

Add a description of the remote backup routine to the end of the file:

/etc/btrbk/btrbk.conf

# root partition of remote computer
volume ssh://host1.example.org/
  # backup at /mnt/backup
  target /mnt/backup/host1.example.org
  # create snapshots of main partition
  subvolume .
    # partition name in snapshot rootfs
    snapshot_name rootfs

# partition /var/calculate/
volume ssh://host1.example.org/var/calculate
  # backup at /mnt/backup
  target /mnt/backup/host1.example.org
  # create snapshots of main partition
  subvolume .
    # partition name in snapshot calculate
    snapshot_name calculate
  # create LXC snapshots for container calculate
  subvolume lxc/calculate/rootfs
    # link backup lxc/calculate/rootfs with group lxc
    group lxc
    # partition name in snapshot
    snapshot_name lxc.calculate.rootfs

where:

  • host1.example.org is the network name of the computer to be backed up
  • /, /var/calculate/, /var/calculate/lxc/calculate/rootfs/ are subvolumes to be backed up
  • group lxc is a custom group name, which lets you run the backup for that group only

Client configuration

Create remote snapshot directories on host1:

mkdir /.snapshots /var/calculate/.snapshots

To give the backup server access to the remote host1, copy to it the filter script, which restricts the SSH key to the limited set of commands needed for creating backups:

scp /usr/share/btrbk/scripts/ssh_filter_btrbk.sh root@host1:/var/calculate/bin/

Copy the previously generated server public key to the remote system:

ssh-copy-id -i /root/.ssh/btrbk.key.pub host1

Running the backup

Run the complete cycle on the backup server:

btrbk run

To run the cycle for host host1.example.org only:

btrbk run host1.example.org

To run the cycle for /var/calculate/ only from host1.example.org:

btrbk run host1.example.org:/var/calculate

To run the cycle for group lxc only:

btrbk run lxc

Miscellaneous

By default, Btrbk tries to use incremental backups, but if the chain is broken (for example, because backups were removed manually), it performs a full backup. This may be undesirable if the primary backups have already been created and the subvolumes contain a large amount of data. To make Btrbk use incremental backups only:

/etc/btrbk/btrbk.conf

# Perform incremental backups (set to "strict" if you want to prevent
# creation of non-incremental backups if no parent is found).
incremental                strict

Important

Note that incremental-only mode must be enabled after the primary backups have been created. Otherwise, Btrbk will not be able to create them.
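
Before switching to strict mode, it may be worth confirming that the target already holds at least one backup of each subvolume. The check below is a rough sketch: has_backup is a hypothetical helper that relies only on the <snapshot_name>.<timestamp> naming shown earlier, and it does not verify that the backup is complete or usable as a parent.

```shell
# Succeed if the target directory contains at least one directory
# named <snapshot_name>.<something>.
# $1 = target directory, $2 = snapshot name
has_backup() {
    for entry in "$1/$2".*; do
        [ -d "$entry" ] && return 0
    done
    return 1
}
```

For the local example this would be has_backup /mnt/backup rootfs && has_backup /mnt/backup calculate.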