In this article, I would like to show you how to prepare a script that backs up your important files to cloud storage using a tool called restic. Even though the article is specific to macOS and Backblaze B2, you can use some of the techniques to create a similar script for Linux systems.

Disclaimer

I’m not going to give you a ready-to-use script or working solution. I know nothing about your needs, your files, limitations of your network, budget, and other things that could have an impact on the backup workflow. You should always adjust the tools and the workflow to your needs.

Instead, I’ll show you my process of building a backup script that works for me. I use the word process deliberately, because it’s not a single task. Trying to build something perfect on the first attempt leads to a product with unnecessary features. Start with a minimal script and change it when you notice that you need something more.

Requirements

In this article, I’m going to use some tools and services, and I assume that you have basic knowledge of them. If not, don’t worry. I’ll describe how to use them in the context of creating the backup script. If you want to know more, please check the documentation of these tools.

  • Restic – a command-line tool for creating backups and restoring files.
  • Backblaze B2 – Cloud storage (you can use a different provider).
  • launchd – macOS’s system service management (you can use cron).
  • macOS keychain – secure way to store secrets.
  • Bash – I’m going to write the script in bash.

Preparing environment

Make your file structure backup-friendly

Before you start, I recommend cleaning up the disk a little by separating the stuff you care about from the clutter you don’t need. Do you need to back up your entire disk? Do you want to pay for storing all this downloaded stuff?

Keep the files you want to back up separate from unimportant files.

Maintaining a solid file structure also helps when you restore files from snapshots. If you have a dedicated place on the disk for the stuff you care about, you’ll never have a problem finding the file you lost or need.

Thanks to restic and your perseverance in organizing files, you’ll add an extra dimension to your disk: time. It means you are not limited to browsing files within directories; you can also browse them at different moments in time.

Install restic

First, you need to install restic. On macOS, the easiest way is to use Homebrew.

> brew install restic

After installation, you can run the program to see available commands.

> restic

restic is a backup program which allows saving multiple revisions of files and
directories in an encrypted repository stored on different backends.

Usage:
  restic [command]

Available Commands:
  backup        Create a new backup of files and/or directories
  cache         Operate on local cache directories
  cat           Print internal objects to stdout
  ...

Initialize restic repository in Backblaze B2

This step depends on where you want to store your repository. I use the Backblaze B2 service, but you may use different storage. The restic documentation has a full list of supported storage providers.

Create Backblaze’s B2 Bucket

Create a new private bucket called e.g. restic-backup-2019. Then, in the App Keys section, add a new application key called e.g. macos-restic-backup with read and write access to this newly created bucket.

Use keys to initialize the repository

Restic reads Backblaze credentials from environment variables. Let’s export them:

> export B2_ACCOUNT_ID=<keyID visible on app keys page>
> export B2_ACCOUNT_KEY=<applicationKey displayed after app key generation>

Now, you can initialize the new repository using restic init command.

> restic init -r b2:restic-backup-2019:backup
enter password for new backend:
enter password again:

The -r flag stands for repository. The prefix (b2) selects the provider, restic-backup-2019 is the bucket name, and the suffix (backup) is the path to the repository within the bucket.

Restic will ask you to enter a password for the new repository. Set a strong password and store it in a safe place, e.g. in a password manager such as KeePass or Keychain.

Create your first backup

Once the repository is initialized, you may create your first backup. To see how restic works, let’s do it by hand. This command creates a new snapshot of ~/Documents directory in the repository.

> /usr/local/bin/restic backup -r b2:restic-backup-2019:backup --verbose ~/Documents

First, restic will ask you for the password to the repository. If you want to automate this process, you have to provide the password differently.

Restic has two options for retrieving the password to the repository.

  • Using file – you can store the password in the file.
  • Using command – you can specify a shell command to get the password.

Retrieving the password with a command suits me better because I can store the password securely in the macOS Keychain and then obtain it with security, the command-line interface to the Keychain. Let’s do it this way.
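For comparison, the two options map to two environment variables that restic reads: RESTIC_PASSWORD_FILE and RESTIC_PASSWORD_COMMAND. The fragment below is a minimal sketch. The password file path is a hypothetical example, and the Keychain service name is the one used later in this article. Set one variable or the other, not both.

```shell
# Option 1: password stored in a file (hypothetical path).
# Restrict access first: chmod 600 "$HOME/.restic-password"
# export RESTIC_PASSWORD_FILE="$HOME/.restic-password"

# Option 2: password produced by a command. The value is the command
# line itself; restic runs it whenever it needs the password.
export RESTIC_PASSWORD_COMMAND='security find-generic-password -s backup-restic-password-repository -w'
```

The file option is simpler, but the password then sits in plain text on the disk, which is why the rest of the article uses the command option.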

Store secrets in macOS keychain

You can add new entries to the keychain using the graphical interface or the security tool. In the terminal, add the repository password using the security add-generic-password command. The service name (-s) must match the one used later to read the password back.

> security add-generic-password -s backup-restic-password-repository -a restic_backup -w
password data for new item:
retype password for new item:

To retrieve the password from the keychain, use the security find-generic-password command:

> security find-generic-password -s backup-restic-password-repository -w
the_secret_password_to_secure_my_backups

It’s reasonable to store the other values in the Keychain as well: the repository address and the two Backblaze B2 credentials. Let’s add them.

> security add-generic-password -s backup-restic-repository -a restic_backup -w
> security add-generic-password -s backup-restic-b2-accound-id -a restic_backup -w
> security add-generic-password -s backup-restic-b2-account-key -a restic_backup -w

Prepare script to automate backup

Use restic in script

Let’s use secrets from Keychain to run the first backup using the dedicated script.

#!/bin/bash

export B2_ACCOUNT_ID=$(security find-generic-password -s backup-restic-b2-accound-id -w)
export B2_ACCOUNT_KEY=$(security find-generic-password -s backup-restic-b2-account-key -w)
export RESTIC_REPOSITORY=$(security find-generic-password -s backup-restic-repository -w)
export RESTIC_PASSWORD_COMMAND='security find-generic-password -s backup-restic-password-repository -w'

/usr/local/bin/restic backup --verbose -o b2.connections=20 ~/Documents

As you can see, I populate the environment variables that restic uses with secrets obtained from the Keychain. Only RESTIC_PASSWORD_COMMAND holds the command itself as a string, because restic runs it whenever the password is needed.

The extra -o b2.connections=20 flag is specific to the Backblaze B2 backend: it sets the number of parallel connections restic opens to B2. You can omit it if you use a different storage provider.
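If a Keychain entry is missing or its name is misspelled, the variables above end up empty and restic fails later with a less obvious error. One possible safeguard is a small helper that refuses empty values. This is only a sketch: read_secret and the SECURITY_BIN override are hypothetical names introduced here, not part of restic or of the security tool.

```shell
# Hypothetical helper: print the secret stored under a Keychain service
# name, or fail if the lookup fails or returns an empty value.
# SECURITY_BIN is an assumption that lets you swap the lookup tool.
read_secret() {
  service="$1"
  value=$("${SECURITY_BIN:-security}" find-generic-password -s "$service" -w 2>/dev/null) || return 1
  [ -n "$value" ] || return 1
  printf '%s\n' "$value"
}

# Possible usage in backup.sh (exit code 5 is a made-up example):
#   B2_ACCOUNT_KEY=$(read_secret backup-restic-b2-account-key) || exit 5
#   export B2_ACCOUNT_KEY
```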

Save it as backup.sh using your favorite text editor and make it executable.

> chmod +x backup.sh
> ./backup.sh

If you used the GUI to add secrets to the keychain, macOS will ask whether the security tool may read them. Choose Always Allow to avoid this question in the future.

Specify the list of directories to back up

You can type directories or even single files directly in the restic backup command. However, if you have lots of them, the command becomes hard to read and manage. Unless you back up the whole home directory, it’s useful to prepare a file containing the directories to back up.

Create a new file called backup.txt with a list of directories and files you want to include in backups. Here’s an example file. All the paths are relative to the user’s home directory. restic resolves relative paths against the current working directory, so the script has to run from the home directory (the launchd job defined later sets WorkingDirectory accordingly).

> cat ~/backup.txt
Desktop/
Documents/
Movies/
Music/
Photos/
.config/
.ssh/

Now, change the backup command to use this file. I assume that this file is located in the home directory.

/usr/local/bin/restic backup --verbose -o b2.connections=20 --files-from ~/backup.txt
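A renamed or deleted directory in backup.txt is easy to miss. As a rough sketch, the list can be checked before restic runs. The function name check_backup_list is hypothetical, and the helper assumes plain paths without wildcards, one per line.

```shell
# Hypothetical helper: report entries of a --files-from list that do not
# exist under the given base directory. Returns 1 if any entry is missing.
check_backup_list() {
  list_file="$1"
  base_dir="$2"
  missing=0
  while IFS= read -r entry || [ -n "$entry" ]; do
    # Skip blank lines.
    [ -n "$entry" ] || continue
    if [ ! -e "$base_dir/$entry" ]; then
      echo "Missing: $base_dir/$entry" >&2
      missing=1
    fi
  done < "$list_file"
  return "$missing"
}

# Possible usage:
#   check_backup_list ~/backup.txt "$HOME" || echo "Some entries are missing"
```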

Avoid running backup twice at the same time

Because you’re preparing an automatic script, it’s worth adding some checks to prevent malfunctions, e.g. invoking the script while another backup is in progress.

You can use a simple approach with the PID file in the home directory. The script could check if there is any backup process running. If so, there is no need to perform a backup, so you can finish the execution. Otherwise, the script creates a PID file and removes it once the backup is done.

PID_FILE=~/.restic_backup.pid

if [ -f "$PID_FILE" ]; then
    echo $(date +"%Y-%m-%d %T") "File $PID_FILE exists. Probably backup is already in progress."
    exit 1
fi

echo $$ > "$PID_FILE"
# restic execution
rm "$PID_FILE"

However, this was not enough. Sometimes, because of logouts, computer restarts, or other errors, the script never reached the rm command, and the next scheduled backup had no chance to start. You can change the approach a bit and add an extra condition that checks whether the process recorded in the PID file still exists. Each message is prefixed with the date and time to provide more context in the log.

PID_FILE=~/.restic_backup.pid

if [ -f "$PID_FILE" ]; then
  # The PID file exists: check whether that process is still alive.
  if ps -p "$(cat "$PID_FILE")" > /dev/null; then
    echo $(date +"%Y-%m-%d %T") "File $PID_FILE exists. Probably backup is already in progress."
    exit 1
  else
    echo $(date +"%Y-%m-%d %T") "File $PID_FILE exists but process $(cat "$PID_FILE") not found. Removing PID file."
    rm "$PID_FILE"
  fi
fi

echo $$ > "$PID_FILE"
# restic execution
rm "$PID_FILE"
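A complementary option, not used in my script, is to let the shell remove the PID file itself with a trap. The sketch below is one way to do it under that assumption. run_with_pid_file and cleanup are hypothetical names. A trap cannot run after a forced kill or a power loss, so the stale-PID check above is still needed.

```shell
PID_FILE="${PID_FILE:-$HOME/.restic_backup.pid}"

# Delete the PID file; safe to call even if the file is already gone.
cleanup() {
  rm -f "$PID_FILE"
}

# Hypothetical wrapper: run a command in a subshell that records the
# script's PID and removes the PID file when the subshell exits,
# whether the command succeeded, failed, or was interrupted.
run_with_pid_file() {
  (
    trap cleanup EXIT
    trap 'exit 1' INT TERM
    echo $$ > "$PID_FILE"
    "$@"
    exit $?
  )
}

# Possible usage:
#   run_with_pid_file /usr/local/bin/restic backup --files-from ~/backup.txt
```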

Create a backup from trusted networks

You may not want to perform backups from public or untrusted networks, or from a personal mobile hotspot with limited bandwidth. In that case, it’s worth checking the active connection and letting the script run only if the computer is connected to an allowed network.

For backup purposes, you may use your home and work Wi-Fi network. To check the current Wi-Fi on macOS, you can use the networksetup -getairportnetwork en0 command.

if [[ $(networksetup -getairportnetwork en0 | grep -E "Home-Network|Work-Network") == "" ]]; then
  echo $(date +"%Y-%m-%d %T") "Unsupported network."
  exit 3
fi

By using grep with the -E (extended regex) option you can check whether the output contains one of the allowed Wi-Fi names. If not, the script exits with status code 3.
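Note that grep matches substrings, so a network called Home-Network-Guest would also pass the check above. If that matters to you, an exact comparison is a possible alternative. The sketch below uses hypothetical helper names and assumes that networksetup prints the name after the prefix "Current Wi-Fi Network: ".

```shell
# Hypothetical helper: succeed only if the current network name equals
# one of the allowed names exactly.
is_allowed_network() {
  current="$1"
  shift
  for allowed in "$@"; do
    if [ "$current" = "$allowed" ]; then
      return 0
    fi
  done
  return 1
}

# Assumption: the output looks like "Current Wi-Fi Network: <name>".
current_wifi_name() {
  networksetup -getairportnetwork en0 | sed 's/^Current Wi-Fi Network: //'
}

# Possible usage:
#   is_allowed_network "$(current_wifi_name)" "Home-Network" "Work-Network" || exit 3
```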

I use different status codes to distinguish why the script ends – it may be useful in future automation.
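As an illustration of how those codes could be used later, here is a small lookup function. describe_exit_code is a hypothetical name. The codes are the ones the finished script in this article uses; 2 and 4 are introduced in the following sections.

```shell
# Hypothetical helper: translate the backup script's exit code into a
# short human-readable reason.
describe_exit_code() {
  case "$1" in
    0) echo "script ran to the end" ;;
    1) echo "another backup is already in progress" ;;
    2) echo "too early for the next backup" ;;
    3) echo "unsupported network" ;;
    4) echo "running on battery" ;;
    *) echo "unknown exit code $1" ;;
  esac
}

# Possible usage:
#   ./backup.sh; describe_exit_code $?
```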

Create a backup only if the computer is plugged in to a power source

A backup may be a power-demanding operation, especially if restic needs to perform a full scan of files and send lots of them to the cloud. To avoid draining the battery, the script should run backups only when the computer is connected to a stable power source.

You can check whether your Mac is running on battery power using pmset -g ps.

if [[ $(pmset -g ps | head -1) =~ "Battery" ]]; then
  echo $(date +"%Y-%m-%d %T") "Computer is not connected to the power source."
  exit 4
fi
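The head -1 part matters: only the first line names the active power source, while later lines mention the internal battery even when the Mac is plugged in. The sketch below wraps the same test in a function that reads the pmset output from standard input, which makes it easy to try with sample text. on_battery is a hypothetical name, and the sample lines in the comments are modeled on typical output rather than copied from a real machine.

```shell
# Hypothetical helper: read `pmset -g ps` output on stdin and succeed
# if the first line reports battery power.
on_battery() {
  head -n 1 | grep -q "Battery"
}

# Typical first lines (assumed):
#   Now drawing from 'Battery Power'
#   Now drawing from 'AC Power'
#
# Possible usage in backup.sh:
#   if pmset -g ps | on_battery; then exit 4; fi
```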

Schedule automatic backup

The script is almost done, so you can schedule the backup by defining a launchd job. You can also use cron, but launchd is the preferred way to schedule actions in macOS.

Let’s take a look at how we can approach this.

The problem with launchd

Launchd has an interesting property called StartInterval that lets you create an agent that runs a task every n seconds. Moreover, it also counts the time when the computer is asleep.

<key>StartInterval</key>
<integer>300</integer>

Nevertheless, the script has some extra conditions that may end the execution early, which is a variation of the return-early approach. Even so, launchd registers that the script was executed and simply schedules the next execution. With a long interval, a run that exits early on the wrong network or on battery means no backup until the next interval elapses.

I’ve also experienced another issue with the StartInterval property, especially if the value is greater than a few hours: the job is executed after an unpredictable time, sometimes much later than the interval value. For short intervals, like 5 minutes, everything works like a charm.

On the other hand, backing up files every 5 minutes is too aggressive and surely not needed, because it leads to many snapshots with small differences.

The solution is to let launchd start the script every 5 minutes and let the script itself decide whether enough time has passed since the last backup. I’m going to show you how to add this logic to the backup script.

Add time condition to your script

First, you need to have an additional place to store the timestamp. Let’s keep it simple and use the file as we do with the PID file.

TIMESTAMP_FILE=~/.restic_backup_timestamp

This file will contain a timestamp of a date in the future: the earliest moment the next backup may run. For now it doesn’t even exist, but don’t worry about that. In the script, you can add a condition to check if the timestamp file exists and, if so, check whether the current timestamp is less than the value from the file. If it is, the minimum threshold hasn’t been reached and there is no need to back up yet. Otherwise, we can go further.

if [ -f "$TIMESTAMP_FILE" ]; then
  time_run=$(cat "$TIMESTAMP_FILE")
  current_time=$(date +"%s")

  if [ "$current_time" -lt "$time_run" ]; then
    exit 2
  fi
fi

You also have to set this threshold after the backup command.

echo $(date -v +6H +"%s") > $TIMESTAMP_FILE

The above statement adds 6 hours to the current time and stores the timestamp in $TIMESTAMP_FILE. The -v flag is specific to the BSD date command shipped with macOS. If you want to change the frequency of the backup, replace 6H with a different value in the date command.
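If you want to reuse this logic on a system without BSD date, the same threshold can be computed with shell arithmetic. The two functions below are a sketch of that idea. too_early and schedule_next_run are hypothetical names, and the optional last argument (the current time in epoch seconds) exists only to make the functions easy to try with fixed values.

```shell
# Succeed if the timestamp file exists and its time is still in the future.
too_early() {
  timestamp_file="$1"
  now="${2:-$(date +%s)}"
  [ -f "$timestamp_file" ] || return 1
  next_run=$(cat "$timestamp_file")
  [ "$now" -lt "$next_run" ]
}

# Store "now + N hours" as epoch seconds, using shell arithmetic instead
# of the BSD-only `date -v` flag.
schedule_next_run() {
  timestamp_file="$1"
  hours="$2"
  now="${3:-$(date +%s)}"
  echo $(( now + hours * 3600 )) > "$timestamp_file"
}

# Possible usage in backup.sh:
#   if too_early "$TIMESTAMP_FILE"; then exit 2; fi
#   ... backup ...
#   schedule_next_run "$TIMESTAMP_FILE" 6
```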

Create a launchd job

Now, you have to create a new property list file. In my case, the name is pl.skrajewski.restic_backup.plist, which matches the label of the job. This name follows the reverse domain name notation, which is a simple way to categorize and sort not only jobs, but also packages, components and other stuff.

My plist file looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>pl.skrajewski.restic_backup</string>
    
    <key>Program</key>
    <string>/Users/szymon/scripts/backup.sh</string>

    <key>RunAtLoad</key>
    <true/>

    <key>StartInterval</key>
    <integer>300</integer>
    
    <key>WorkingDirectory</key>
    <string>/Users/szymon/</string>
    
    <key>StandardOutPath</key>
    <string>/Users/szymon/Library/Logs/pl.skrajewski.restic_backup.out.log</string>

    <key>StandardErrorPath</key>
    <string>/Users/szymon/Library/Logs/pl.skrajewski.restic_backup.error.log</string>
</dict>
</plist>

Because I want to run this job as a user, I’m going to register it as an agent. All you have to do is to copy this file to the ~/Library/LaunchAgents/ directory and load the job using launchctl.

> cp pl.skrajewski.restic_backup.plist ~/Library/LaunchAgents
> launchctl load ~/Library/LaunchAgents/pl.skrajewski.restic_backup.plist

Once the job is loaded, the first backup starts, because RunAtLoad is set to true. Launchd also loads jobs when a user logs in. If you want to know how it works, you can read The launchd Startup Process section in the documentation.
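To confirm that the agent is really registered, you can look for its label in the output of launchctl list, which prints one job per line with the columns PID, Status, and Label. The helper below is a small sketch that reads this output from standard input. job_is_loaded is a hypothetical name.

```shell
# Hypothetical helper: read `launchctl list` output on stdin and succeed
# if a job with the given label is present in the third column.
job_is_loaded() {
  awk -v label="$1" '$3 == label { found = 1 } END { exit (found ? 0 : 1) }'
}

# Possible usage on macOS:
#   launchctl list | job_is_loaded pl.skrajewski.restic_backup && echo "loaded"
```

Running plutil -lint on the plist file before loading it is another cheap check, since it reports XML mistakes in the property list.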

Full backup script

Here is the full backup.sh script. It writes its output to the StandardOutPath and StandardErrorPath files. You can preview them using your terminal or the Console application.

This file is also available as GitHub’s Gist.

#!/bin/bash

PID_FILE=~/.restic_backup.pid
TIMESTAMP_FILE=~/.restic_backup_timestamp

if [ -f "$PID_FILE" ]; then
  # The PID file exists: check whether that process is still alive.
  if ps -p "$(cat "$PID_FILE")" > /dev/null; then
    echo $(date +"%Y-%m-%d %T") "File $PID_FILE exists. Probably backup is already in progress."
    exit 1
  else
    echo $(date +"%Y-%m-%d %T") "File $PID_FILE exists but process $(cat "$PID_FILE") not found. Removing PID file."
    rm "$PID_FILE"
  fi
fi

if [ -f "$TIMESTAMP_FILE" ]; then
  time_run=$(cat "$TIMESTAMP_FILE")
  current_time=$(date +"%s")

  if [ "$current_time" -lt "$time_run" ]; then
    exit 2
  fi
fi

if [[ $(networksetup -getairportnetwork en0 | grep -E "Home-Network|Work-Network") == "" ]]; then
  echo $(date +"%Y-%m-%d %T") "Unsupported network."
  exit 3
fi

if [[ $(pmset -g ps | head -1) =~ "Battery" ]]; then
  echo $(date +"%Y-%m-%d %T") "Computer is not connected to the power source."
  exit 4
fi

echo $$ > "$PID_FILE"
echo $(date +"%Y-%m-%d %T") "Backup start"

export B2_ACCOUNT_ID=$(security find-generic-password -s backup-restic-b2-accound-id -w)
export B2_ACCOUNT_KEY=$(security find-generic-password -s backup-restic-b2-account-key -w)
export RESTIC_REPOSITORY=$(security find-generic-password -s backup-restic-repository -w)
export RESTIC_PASSWORD_COMMAND='security find-generic-password -s backup-restic-password-repository -w'

/usr/local/bin/restic backup --verbose -o b2.connections=20 --files-from ~/backup.txt

echo $(date +"%Y-%m-%d %T") "Backup finished"
echo $(date -v +6H +"%s") > "$TIMESTAMP_FILE"

rm "$PID_FILE"
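One thing to be aware of: the script above writes the next-run timestamp even when restic fails, so a failed backup is not retried for 6 hours. If you prefer to retry on the next launchd run, a possible refinement is to write the timestamp only after a successful run. The function below is a sketch of that idea, not part of my script. run_and_stamp is a hypothetical name, and it uses shell arithmetic for the 6-hour offset.

```shell
# Hypothetical helper: run the given command and write the next-run
# timestamp (now + 6 hours) only if the command succeeds.
run_and_stamp() {
  timestamp_file="$1"
  shift
  if "$@"; then
    echo $(( $(date +%s) + 6 * 3600 )) > "$timestamp_file"
  else
    status=$?
    echo "$(date +"%Y-%m-%d %T") Backup failed with exit code $status" >&2
    return "$status"
  fi
}

# Possible usage in backup.sh:
#   run_and_stamp "$TIMESTAMP_FILE" /usr/local/bin/restic backup --verbose \
#     -o b2.connections=20 --files-from ~/backup.txt
```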

macOS Catalina: Dealing with System Integrity Protection (SIP) and new permissions

Before you start
If you don’t want to use the extra binary to handle specific permissions, you can add your shell (e.g. bash or zsh) to the Full Disk Access list in the Security & Privacy panel.

After upgrading the system to macOS Catalina 10.15, my backup script stopped working. The error log contained operation not permitted errors:

scan: Open: open /Users/szymon/Desktop: operation not permitted
scan: Open: open /Users/szymon/Documents: operation not permitted
scan: Open: open /Users/szymon/Pictures/Photos.photoslibrary: operation not permitted

According to Catalina’s landing page, Apple improved the security of the system by using read-only volume for system data and adding more permissions that applications have to ask for.

Data protections.

Apps must now get your permission before directly accessing files in your Documents and Desktop folders, iCloud Drive, and external volumes, so you’re always in control of your data. And you’ll be prompted before any app can capture keyboard activity or a photo or video of your screen.

https://www.apple.com/macos/catalina/

Unfortunately, the script didn’t ask me for permission to access any directory or functionality. I tried granting Full Disk Access to the backup script, restic, bash, and the terminal, but I still got the same outcome.

After reading some discussions on GitHub, I found out that wrapping the script in a binary may help solve this issue. I don’t know the underlying mechanism yet, but when you run the binary, it asks the user for permission to access the requested resources.

I wrote a small console application in Swift that runs my backup script.

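import Foundation
import os

// Launch the backup script with bash.
let task = Process()
task.executableURL = URL(fileURLWithPath: "/bin/bash")
task.arguments = ["/Users/szymon/scripts/backup.sh"]

do {
    try task.run()
    // Wait only if the process actually started.
    task.waitUntilExit()
} catch {
    os_log("Failed to start the backup script")
    exit(1)
}

// Pass the script's exit code on to the caller.
exit(task.terminationStatus)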

There is a security problem with this approach: an intruder who can replace the content of backup.sh can run a custom script with the permissions granted to the wrapper.

After compilation, I got the binary backup_wrapper. I put it in the same place as my original backup script and I changed the path to this executable in pl.skrajewski.restic_backup.plist.

<key>Program</key>
<string>/Users/szymon/scripts/backup_wrapper</string>

When I ran the binary, it asked me for permission to access the resources. I didn’t have to grant Full Disk Access to the binary, restic, or bash.

Figure: macOS privacy panel showing the backup_wrapper binary with permission to Files and Folders.
Figure: macOS privacy panel showing the backup_wrapper binary with permission to Photos.

It’s still a lot of hassle to set up the whole backup process. If you work with bash scripts, there is a big chance that you’ll run into trouble with permissions. I would like to hear other opinions about this mechanism in Catalina and how to deal with it. If you have already solved this issue in another way, please let me know.

Summary

As you can see, when you craft things for your own needs you don’t have to start with something big. Start small, and change it over time. Add only the things that you need, not everything that comes to your mind. Otherwise, you will end up with a bloated script with tons of unnecessary features.

I could have implemented this as a separate piece of software, written e.g. in JavaScript, using ready-to-use modules. But I didn’t. Things have to be simple, because simple things are cheap, easy to use and easy to maintain.

There are some extra features I have implemented in my backup workflow but I didn’t include in the article:

  • Anybar integration to check if the backup is already in progress.
  • macOS notifications – I’m considering getting rid of them because the workflow is stable enough and I don’t need feedback when the backup has started.
  • Integration with healthchecks.io to get feedback if something is wrong with the workflow.

I recommend investing some time in setting up an automatic backup workflow. It’s like insurance: you may never need it, but if something happens, it’s better to have it. And remember to check from time to time that your backup is consistent and restorable. Or maybe we should automate that as well?
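As a starting point for such a check, restic has a check command that verifies the repository and a snapshots command that lists what is stored. The sketch below chains the two. verify_backup and the RESTIC_BIN override are hypothetical names, and the function assumes that the same environment variables as in backup.sh are already exported.

```shell
# Hypothetical helper: verify the repository and show the most recent
# snapshot. RESTIC_BIN is an assumption that lets you swap the binary.
verify_backup() {
  restic_bin="${RESTIC_BIN:-/usr/local/bin/restic}"
  "$restic_bin" check || return 1
  "$restic_bin" snapshots --latest 1 || return 1
}
```

A passing check does not prove that a particular file can be restored, so an occasional manual restore of a few files to a temporary directory is still worth doing.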


Changelog

Update 2019-11-28:

I added information on how I deal with the new permission system aka “entitlement”. It complicates the setup process a bit because it requires compiling a custom binary that runs the backup script.

Update 2021-03-07:

As readers have confirmed, there is no need to run the backup script through the extra binary wrapper if you grant Full Disk Access to the shell you use with the script (bash or zsh). I added a note about this to the section on the custom binary wrapper.


Featured photo by kropekkpl from Pixabay.