In this article, I’d like to show you how to prepare a script that backs up your important files to cloud storage using a tool called restic. Although the article is specific to macOS and Backblaze B2, you can use some of the techniques to create a similar script for Linux systems.
I’m not going to give you a ready-to-use script or working solution. I know nothing about your needs, your files, limitations of your network, budget, and other things that could have an impact on the backup workflow. You should always adjust the tools and the workflow to your needs.
Instead, I’ll show you my process of building a backup script that works for me. I use the word process because it’s not a single task. Trying to build something perfect at the first attempt leads to a product with unnecessary features. Start with a minimal script and change it when you notice that you need something more.
In this article, I’m going to use some tools and services, and I assume that you have basic knowledge of them. If not, don’t worry. I’ll describe how to use them in the context of creating the backup script. If you want to know more, please check the documentation of these tools.
- Restic – a command-line tool for creating backups and restoring files.
- Backblaze B2 – Cloud storage (you can use a different provider).
- launchd – macOS’s system service management (you can use cron).
- macOS keychain – secure way to store secrets.
- Bash – I’m going to write the script in bash.
Before you start, I recommend cleaning up the disk a little by separating the stuff you care about from the clutter you don’t need. Do you need to back up your entire disk? Do you want to pay for storing all that downloaded stuff?
Keep the files you want to back up separate from unimportant files.
Maintaining a solid file structure also helps when restoring files from snapshots. If you have a dedicated place on the disk for the stuff you care about, you’ll never have a problem finding the file you lost or need.
Thanks to restic and your perseverance in organizing files, you’ll add an extra dimension to your disk: time. You are no longer limited to browsing files within directories; you can also browse them at different moments in time.
First, you need to install restic. On macOS, the easiest way to do it is with Homebrew.
> brew install restic
After installation, you can run the program to see available commands.
> restic
restic is a backup program which allows saving multiple revisions of files and
directories in an encrypted repository stored on different backends.

Usage:
  restic [command]

Available Commands:
  backup        Create a new backup of files and/or directories
  cache         Operate on local cache directories
  cat           Print internal objects to stdout
  ...
This step depends on where you want to store your repository. I use the Backblaze B2 service, but you may use different storage. Here is a full list of supported storage providers.
Create a new private bucket called e.g. restic-backup-2019. Then, in the App Keys section, add a new application key called e.g. macos-restic-backup with read and write access to this newly created bucket.
Restic reads Backblaze credentials from environment variables. Let’s export them:
> export B2_ACCOUNT_ID=<keyID visible on app keys page>
> export B2_ACCOUNT_KEY=<applicationKey displayed after app key generation>
Now, you can initialize the new repository using the restic init command.
> restic init -r b2:restic-backup-2019:backup
enter password for new backend:
enter password again:
The -r flag stands for repository. The prefix (b2) selects the provider, restic-backup-2019 is the bucket name, and the suffix is the path to the repository within the bucket (/backup).
Restic will ask you to enter a password for the new repository. Set a strong password and store it in a safe place, e.g. in a password manager such as KeePass or Keychain.
Once the repository is initialized, you may create your first backup. To see how restic works, let’s do it by hand. This command creates a new snapshot of the ~/Documents directory in the repository.
> /usr/local/bin/restic backup -r b2:restic-backup-2019:backup --verbose ~/Documents
First, restic will ask you for the password to the repository. If you want to automate this process, you have to provide the password differently.
Restic has two options for retrieving the password to the repository.
- Using a file (--password-file) – you can store the password in a file.
- Using a command (--password-command) – you can specify a shell command that prints the password.
Retrieving the password using a command sounds better to me because I can store the password securely in the macOS Keychain. Then, I can obtain it using security, which is a command-line interface for the Keychain. Let’s do it this way.
You can add new entries to the keychain using the graphical interface or the security tool. In the terminal, add a new password using the security add-generic-password command. The service name (-s) has to match the one used later to retrieve the password.
> security add-generic-password -s backup-restic-password-repository -a restic_backup -w
password data for new item:
retype password for new item:
To retrieve the password from the keychain, use the security find-generic-password command:
> security find-generic-password -s backup-restic-password-repository -w
the_secret_password_to_secure_my_backups
It’s reasonable to store other secrets in Keychain as well, so let’s add them.
> security add-generic-password -s backup-restic-repository -a restic_backup -w
> security add-generic-password -s backup-restic-b2-accound-id -a restic_backup -w
> security add-generic-password -s backup-restic-b2-account-key -a restic_backup -w
Let’s use secrets from Keychain to run the first backup using the dedicated script.
#!/bin/bash

export B2_ACCOUNT_ID=$(security find-generic-password -s backup-restic-b2-accound-id -w)
export B2_ACCOUNT_KEY=$(security find-generic-password -s backup-restic-b2-account-key -w)
export RESTIC_REPOSITORY=$(security find-generic-password -s backup-restic-repository -w)
export RESTIC_PASSWORD_COMMAND='security find-generic-password -s backup-restic-password-repository -w'

/usr/local/bin/restic backup --verbose -o b2.connections=20 ~/Documents
As you can see, I populate the environment variables that restic uses with secrets obtained from Keychain. Only the RESTIC_PASSWORD_COMMAND variable is a string containing the command, because restic runs it when the password is needed.
The -o b2.connections=20 flag is an option specific to the Backblaze B2 backend. You can omit it if you use a different storage provider.
Save the script as backup.sh using your favorite text editor and make it executable.
> chmod +x backup.sh
> ./backup.sh
If you used a GUI to add secrets to the keychain, you’ll need to allow security to read the passwords. Choose Always Allow to avoid this question in the future.
You can list directories or even single files directly in the restic backup command. However, if you have lots of them, the command becomes hard to read and manage. Unless you back up the whole home directory, it’s useful to prepare a file containing the directories to back up.
Create a new file called backup.txt with a list of directories and files you want to include in backups. Here’s an example file. All the paths are relative to the user’s home directory, which means the script has to run with the home directory as its working directory.
> cat ~/backup.txt
Desktop/
Documents/
Movies/
Music/
Photos/
.config/
.ssh/
Now, change the backup command to use this file. I assume that this file is located in the home directory.
/usr/local/bin/restic backup --verbose -o b2.connections=20 --files-from ~/backup.txt
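A typo in the list is easy to miss, so it may be worth verifying the entries before handing the file to restic. The following is only a minimal sketch: check_backup_list is a hypothetical helper (not part of restic), and it assumes one path per line, relative to a base directory, with blank lines and lines starting with # skipped.

```shell
#!/bin/bash
# Sketch: report entries from the list file that do not exist on disk.
# check_backup_list is a hypothetical helper, not a restic feature.
check_backup_list() {
  list_file=$1   # file with one path per line
  base_dir=$2    # directory the paths are relative to
  missing=0
  while IFS= read -r entry || [ -n "$entry" ]; do
    # Skip blank lines and lines starting with # (treated as comments here).
    case "$entry" in
      ''|'#'*) continue ;;
    esac
    if [ ! -e "$base_dir/$entry" ]; then
      echo "Missing: $base_dir/$entry" >&2
      missing=1
    fi
  done < "$list_file"
  # Returns 0 when every entry exists, 1 otherwise.
  return "$missing"
}
```

In the backup script, it could be called as check_backup_list ~/backup.txt "$HOME" right before the restic command, with the script stopping if the call fails.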
Because you’re preparing an automatic script, it’s worth adding some checks to prevent malfunctions, e.g. invoking the script when another backup is in progress.
You can use a simple approach with the PID file in the home directory. The script could check if there is any backup process running. If so, there is no need to perform a backup, so you can finish the execution. Otherwise, the script creates a PID file and removes it once the backup is done.
PID_FILE=~/.restic_backup.pid
if [ -f "$PID_FILE" ]; then
  echo $(date +"%Y-%m-%d %T") "File $PID_FILE exist. Probably backup is already in progress."
  exit 1
fi

echo $$ > $PID_FILE
# restic execution
rm $PID_FILE
However, this was not enough. Sometimes, because of logouts, computer restarts, or other errors, the script didn’t reach the rm command, and the next scheduled backup had no chance to start. You can change the approach a bit and add an extra condition that checks whether the process recorded in the PID file still exists. Let’s also add a date and time to the output to provide more context.
PID_FILE=~/.restic_backup.pid
if [ -f "$PID_FILE" ]; then
  if ps -p $(cat $PID_FILE) > /dev/null; then
    echo $(date +"%Y-%m-%d %T") "File $PID_FILE exist. Probably backup is already in progress."
    exit 1
  else
    echo $(date +"%Y-%m-%d %T") "File $PID_FILE exist but process " $(cat $PID_FILE) " not found. Removing PID file."
    rm $PID_FILE
  fi
fi

echo $$ > $PID_FILE
# restic execution
rm $PID_FILE
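A complementary technique, which the script in this article does not use, is to let the shell remove the PID file whenever the script exits by registering a trap. The sketch below assumes a hypothetical helper called run_with_pid_file; a hard power-off would still leave the file behind, so the ps -p check above remains useful.

```shell
#!/bin/bash
# Assumption: the default path mirrors the article; override PID_FILE if needed.
PID_FILE="${PID_FILE:-$HOME/.restic_backup.pid}"

# Hypothetical helper: run a command while holding the PID file.
run_with_pid_file() {
  (
    # Remove the PID file when this subshell exits,
    # whether the command succeeded or failed.
    trap 'rm -f "$PID_FILE"' EXIT
    # $$ is the PID of the script itself, as in the article.
    echo "$$" > "$PID_FILE"
    "$@"
  )
}
```

It could wrap the backup command, for example run_with_pid_file /usr/local/bin/restic backup --verbose --files-from ~/backup.txt, and it passes the exit status of that command back to the caller.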
For some reason, you may not want to perform backups from public or untrusted networks, or from a personal mobile hotspot with limited bandwidth. In that case, it’s worth checking the active connection and allowing the script to run only if the computer is connected to a whitelisted network.
For backup purposes, you may use your home and work Wi-Fi networks. To check the current Wi-Fi network on macOS, you can use the networksetup -getairportnetwork en0 command.
if [[ $(networksetup -getairportnetwork en0 | grep -E "Home-Network|Work-Network") == "" ]]; then
  echo $(date +"%Y-%m-%d %T") "Unsupported network."
  exit 3
fi
By using grep with the extended regex option (-E), you can check whether the output contains one of the whitelisted Wi-Fi names. If not, the script exits with status code 3. I use different status codes to distinguish why the script ends, which may be useful in future automation.
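One caveat of the grep approach is that it matches substrings, so a network named Home-Network-Guest would also pass. If that matters, an exact comparison is possible. The sketch below assumes that networksetup prints a line of the form "Current Wi-Fi Network: <name>"; is_allowed_network is a hypothetical helper.

```shell
#!/bin/bash
# Assumption: the command output looks like "Current Wi-Fi Network: <name>".
# Hypothetical helper: succeeds only if the name exactly matches an allowed one.
is_allowed_network() {
  output=$1
  shift
  # Strip the assumed prefix to get the bare network name.
  name=${output#Current Wi-Fi Network: }
  for allowed in "$@"; do
    [ "$name" = "$allowed" ] && return 0
  done
  return 1
}
```

In the script, the check could then read: is_allowed_network "$(networksetup -getairportnetwork en0)" "Home-Network" "Work-Network" || exit 3.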
Backup may be a power-demanding operation, especially if restic needs to perform a full scan of files and send lots of them to the cloud. To avoid draining the battery, the script should run backups only if the computer is connected to a stable power source.
You can check if your Mac is running on battery power using pmset -g ps.
if [[ $(pmset -g ps | head -1) =~ "Battery" ]]; then
  echo $(date +"%Y-%m-%d %T") "Computer is not connected to the power source."
  exit 4
fi
The script is almost done, so you can schedule the backup by defining a launchd job. You can also use cron, but launchd is the preferred way to schedule actions on macOS.
Let’s take a look at how we can approach this.
Launchd has an interesting property called StartInterval that lets you create an agent that runs a task every n seconds. Moreover, it also counts time while the computer is asleep.
Nevertheless, the script has some extra conditions that may end the execution early, which is a variation of the return early approach. Launchd only knows that the script was executed, so it schedules the next execution after the full interval, and you end up with no backup.
I’ve also experienced another issue with the StartInterval property, especially when the value is greater than a few hours. The job is executed after an unpredictable time, sometimes much later than the interval value. For short intervals, like 5 minutes, everything works like a charm.
Backing up files every 5 minutes is too aggressive and surely not needed, because it leads to many snapshots of data with small differences. A way out is to let launchd start the script every 5 minutes and let the script itself decide whether enough time has passed since the last backup.
I’m going to show you how to add this kind of logic to the backup script.
First, you need an additional place to store the timestamp. Let’s keep it simple and use a file, as we do with the PID file.
This file will contain a timestamp of a moment in the future, but for now it doesn’t even exist, and that’s fine. In the script, add a condition that checks whether the timestamp file exists and, if so, whether the current timestamp is less than the value from the file. If it is, the minimum threshold hasn’t been reached and there is no need to back up yet. Otherwise, the script can go further.
if [ -f "$TIMESTAMP_FILE" ]; then
  time_run=$(cat "$TIMESTAMP_FILE")
  current_time=$(date +"%s")

  if [ "$current_time" -lt "$time_run" ]; then
    exit 2
  fi
fi
You also have to set this threshold after the backup command.
echo $(date -v +6H +"%s") > $TIMESTAMP_FILE
The above statement adds 6 hours to the current time and stores the resulting timestamp in $TIMESTAMP_FILE. If you want to change the backup frequency, replace 6H with a different value.
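Note that the -v option is an extension of the BSD date shipped with macOS; GNU date on Linux does not support it. If you adapt the script for Linux, the same threshold can be computed with plain shell arithmetic. This is only a sketch: next_run_timestamp and backup_is_due are hypothetical helper names, and the 6-hour default mirrors the value used above.

```shell
#!/bin/bash
# Hypothetical helper: print the earliest time (in epoch seconds) of the next backup.
# $1 = current epoch seconds, $2 = hours to wait (defaults to 6).
next_run_timestamp() {
  now=$1
  hours=${2:-6}
  echo $(( now + hours * 3600 ))
}

# Hypothetical helper: succeeds when a backup is due, i.e. when no threshold
# is stored yet or the stored time has already passed.
# $1 = current epoch seconds, $2 = stored threshold (may be empty).
backup_is_due() {
  now=$1
  stored=$2
  [ -z "$stored" ] || [ "$now" -ge "$stored" ]
}
```

After a backup, the threshold could be written with next_run_timestamp "$(date +%s)" > "$TIMESTAMP_FILE".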
Now, you have to create a new property list file. In my case, the name is pl.skrajewski.restic_backup.plist, which matches the label of the job. This name follows the reverse domain name notation convention, which is a simple way to categorize and sort not only jobs, but also packages, components, and other stuff.
My plist file looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>pl.skrajewski.restic_backup</string>
    <key>Program</key>
    <string>/Users/szymon/scripts/backup.sh</string>
    <key>RunAtLoad</key>
    <true/>
    <key>StartInterval</key>
    <integer>300</integer>
    <key>WorkingDirectory</key>
    <string>/Users/szymon/</string>
    <key>StandardOutPath</key>
    <string>/Users/szymon/Library/Logs/pl.skrajewski.restic_backup.out.log</string>
    <key>StandardErrorPath</key>
    <string>/Users/szymon/Library/Logs/pl.skrajewski.restic_backup.error.log</string>
</dict>
</plist>
Because I want to run this job as a user, I’m going to register it as an agent. All you have to do is copy this file to the ~/Library/LaunchAgents/ directory and load the job using launchctl.
> cp pl.skrajewski.restic_backup.plist ~/Library/LaunchAgents
> launchctl load ~/Library/LaunchAgents/pl.skrajewski.restic_backup.plist
Once the job is loaded, the first backup starts. Launchd also loads jobs when a user logs in. If you want to know how it works, you can read The launchd Startup Process section in the documentation.
Here is the full backup.sh script. It writes its output to the StandardOutPath and StandardErrorPath files. You can preview them in your terminal or in the Console application.
This file is also available as GitHub’s Gist.
#!/bin/bash

PID_FILE=~/.restic_backup.pid
TIMESTAMP_FILE=~/.restic_backup_timestamp

if [ -f "$PID_FILE" ]; then
  if ps -p $(cat $PID_FILE) > /dev/null; then
    echo $(date +"%Y-%m-%d %T") "File $PID_FILE exist. Probably backup is already in progress."
    exit 1
  else
    echo $(date +"%Y-%m-%d %T") "File $PID_FILE exist but process " $(cat $PID_FILE) " not found. Removing PID file."
    rm $PID_FILE
  fi
fi

if [ -f "$TIMESTAMP_FILE" ]; then
  time_run=$(cat "$TIMESTAMP_FILE")
  current_time=$(date +"%s")

  if [ "$current_time" -lt "$time_run" ]; then
    exit 2
  fi
fi

if [[ $(networksetup -getairportnetwork en0 | grep -E "Home-Network|Work-Network") == "" ]]; then
  echo $(date +"%Y-%m-%d %T") "Unsupported network."
  exit 3
fi

if [[ $(pmset -g ps | head -1) =~ "Battery" ]]; then
  echo $(date +"%Y-%m-%d %T") "Computer is not connected to the power source."
  exit 4
fi

echo $$ > $PID_FILE
echo $(date +"%Y-%m-%d %T") "Backup start"

export B2_ACCOUNT_ID=$(security find-generic-password -s backup-restic-b2-accound-id -w)
export B2_ACCOUNT_KEY=$(security find-generic-password -s backup-restic-b2-account-key -w)
export RESTIC_REPOSITORY=$(security find-generic-password -s backup-restic-repository -w)
export RESTIC_PASSWORD_COMMAND='security find-generic-password -s backup-restic-password-repository -w'

/usr/local/bin/restic backup --verbose -o b2.connections=20 --files-from ~/backup.txt

echo $(date +"%Y-%m-%d %T") "Backup finished"
echo $(date -v +6H +"%s") > $TIMESTAMP_FILE

rm $PID_FILE
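The distinct exit codes make it easy to build further automation around the script. As a sketch, a wrapper could translate them into readable messages. describe_backup_status is a hypothetical helper, and the messages are my own wording of the conditions above. Note that status 0 only means the script reached its end; the script does not check the exit status of restic itself.

```shell
#!/bin/bash
# Hypothetical helper: translate the exit status of backup.sh into a message.
describe_backup_status() {
  case "$1" in
    0) echo "backup script finished" ;;
    1) echo "another backup is in progress" ;;
    2) echo "too early for the next backup" ;;
    3) echo "unsupported network" ;;
    4) echo "running on battery" ;;
    *) echo "unknown status: $1" ;;
  esac
}
```

For example, after running ./backup.sh, calling describe_backup_status "$?" prints the reason why the script stopped.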
As you can see, crafting things for your needs doesn’t have to start with something big. Start small and change it over time. Add only the things you need, not everything that comes to mind. Otherwise, you’ll end up with a bloated script with tons of unnecessary features.
There are some extra features I have implemented in my backup workflow but didn’t include in the article:
- Anybar integration to check if the backup is already in progress.
- macOS notifications – I’m considering getting rid of them because the workflow is stable enough and I don’t need feedback that the backup has started.
- Integration with healthchecks.io to get feedback if something is wrong with the workflow.
I recommend investing some time in setting up an automatic backup workflow. It’s like insurance: you may never need it, but if something happens, it’s better to have it. And remember to check from time to time that your backup is consistent and restorable. Or maybe we should automate that as well?