How to Back Up Your Files with rsync, tar, cron, and GPG
I wrote this originally in December 2011. I was a software architect at a domain registrar, enthusiastic about Unix tools, and wanted a backup script that didn't depend on Time Machine or Dropbox. The article got some traction. The script worked.
Fifteen years later, I dug it up, cringed a little, and rewrote it.
The four tools are the same: rsync, tar, cron, and gpg. The strategy is the same. What changed is everything around it — the things you learn only by shipping code for long enough to watch it fail in ways you didn't anticipate.
This is that rewrite. The script is on GitHub and runs on any POSIX system — macOS, Linux, FreeBSD. The original only ran on macOS. That's one of the first things I fixed.
The Strategy
Still the same four-layer approach:
- rsync — Incremental snapshot with hard links (only changed files consume space)
- tar — Compressed point-in-time archives for older snapshots
- gpg — Encryption before anything goes offsite
- cron — None of this runs manually
What's new is a rotation tier: daily archives roll into weekly, weekly into monthly. In 2011 I archived daily and called it done. That's fine until you need something from three months ago and realise you've been overwriting it.
What I Got Wrong in 2011
The script had no error handling
The original script would happily continue if rsync failed. If the snapshot silently broke, you'd run tar on an empty directory and encrypt that. You'd find out when you needed to restore — the worst possible moment.
The fix is two characters:
```shell
set -eu
```
-e exits immediately if any command returns non-zero. -u treats unset variables as errors instead of silently expanding to empty strings. Always use this. I didn't in 2011 because I didn't know better.
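A quick way to see both behaviours, using throwaway child shells (the variable name is made up for the demo):

```shell
#!/bin/sh
# Compare a child shell's behaviour with and without the flags.
unset UNSET_DIR

# Without -u: the unset variable silently expands to an empty string.
sh -c 'echo "dest=[$UNSET_DIR]"'

# With -u: the same reference is a hard error; the child exits non-zero.
sh -uc 'echo "dest=[$UNSET_DIR]"' 2>/dev/null || echo "caught: unset variable"

# With -e: execution stops at the first failing command.
sh -ec 'false; echo "never printed"' || echo "caught: failed command"
```

That empty expansion is exactly how a backup script ends up running rm -rf on the wrong path.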
The date command only worked on macOS
The original used:
```shell
YESTERDAY=$(date -v -1d +%Y%m%d)
```
That's BSD date syntax — macOS. On Linux (GNU date), it's:
```shell
YESTERDAY=$(date -d "1 day ago" +%Y%m%d)
```
Completely different. The script silently broke on any Linux machine. The current version detects which variant is installed at runtime and branches accordingly:
```shell
date_subtract() {
    _fmt="$1" _unit="$2" _n="$3"
    if date -d "now" +%s >/dev/null 2>&1; then
        # GNU date (Linux)
        case "$_unit" in
            d) date -d "$_n day ago" +"$_fmt" ;;
            m) date -d "$_n month ago" +"$_fmt" ;;
        esac
    else
        # BSD date (macOS)
        case "$_unit" in
            d) date -v "-${_n}d" +"$_fmt" ;;
            m) date -v "-${_n}m" +"$_fmt" ;;
        esac
    fi
}
```
The detection trick: GNU date accepts -d; BSD date does not. Test once, branch cleanly.
Hardcoded paths
The original script had paths baked in: my username, my email, my drive path. It was a personal script, so it worked for me. But it made sharing or reusing it on another machine a manual find-and-replace exercise.
The current version uses a config file (backup.conf) with sane defaults:
```shell
BACKUP_SOURCE_DIR="${BACKUP_SOURCE_DIR:-$HOME/Documents}"
BACKUP_HOME="${BACKUP_HOME:-$HOME/backups}"
GPG_RECIPIENT="${GPG_RECIPIENT:-}"    # empty = no encryption
```
${VAR:-default} is a POSIX parameter expansion — if VAR is unset or empty, use the default. No external commands, no conditionals.
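A quick sanity check of the expansion rules, runnable in any POSIX shell:

```shell
#!/bin/sh
# ${VAR:-default} uses $VAR when it is set and non-empty, else the default.

unset BACKUP_HOME
echo "${BACKUP_HOME:-$HOME/backups}"    # unset: falls back to the default

BACKUP_HOME=""
echo "${BACKUP_HOME:-$HOME/backups}"    # empty: ":-" also falls back

BACKUP_HOME="/mnt/backup"
echo "${BACKUP_HOME:-$HOME/backups}"    # set and non-empty: used as-is
```

Note the colon: plain ${VAR-default} would accept the empty string as a deliberate value, which is rarely what you want in a config file.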
The Core: rsync with Hard Links
This part hasn't changed much, because it was already right.
```shell
rsync -aH --link-dest="$CURRENT_LINK" "$BACKUP_SOURCE_DIR" "$SNAPSHOT_DIR/$NOW"
```
--link-dest is the key. For every file that hasn't changed since the last snapshot, rsync creates a hard link instead of a copy. The result: a snapshot that looks like a full copy but consumes almost no additional disk space.
```
$ du -sch backups/snapshots/*
143M    snapshots/202602211400
 16K    snapshots/202602211430
 16K    snapshots/202602211500
178M    total
```
Three snapshots. One day's worth of actual data.
One caveat worth knowing: hard links do not work on NTFS or FAT filesystems (Samba mounts, Windows drives). rsync falls back to full copies silently. Don't let this surprise you at the worst moment.
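One way to avoid the surprise is to probe the target filesystem before relying on it. A minimal sketch (the function name and probe filenames are mine, not from the script):

```shell
#!/bin/sh
# Probe whether a filesystem supports hard links by creating one and
# checking that it succeeded. Point it at the mount you back up to.
probe_hardlinks() {
    _dir="$1"
    _a="$_dir/.hl_probe_a"
    _b="$_dir/.hl_probe_b"
    : > "$_a"
    if ln "$_a" "$_b" 2>/dev/null; then
        rm -f "$_a" "$_b"
        echo "hard links: supported"
    else
        rm -f "$_a"
        echo "hard links: NOT supported - rsync will fall back to full copies"
    fi
}

probe_hardlinks "${1:-/tmp}"
```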
After each snapshot, update the current symlink to point to the latest:
```shell
ln -snf "$LATEST" "$CURRENT_LINK"
```
The -n flag matters here. Without it, ln follows the existing symlink if it points to a directory and nests the new link inside the old snapshot. With -n, it replaces the symlink. One character, surprising behaviour if you miss it.
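A scratch-directory demonstration of the difference (works with either GNU or BSD ln; all paths are temporary):

```shell
#!/bin/sh
set -eu
d=$(mktemp -d)
mkdir "$d/snap1" "$d/snap2"
ln -s "$d/snap1" "$d/current"

# Without -n, ln follows "current" into snap1 and creates the new
# link INSIDE the old snapshot:
ln -sf "$d/snap2" "$d/current"
ls "$d/snap1"                  # a stray "snap2" link now lives here

# With -n, the symlink itself is replaced:
ln -snf "$d/snap2" "$d/current"
readlink "$d/current"          # now points at .../snap2
rm -rf "$d"
```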
Archiving and Rotation
Snapshots are uncompressed and browsable — easy to pull a file from an hour ago. Archives are compressed and encrypted — for longer-term storage.
The current pipeline:
- Snapshots older than today → daily .tar.gz archive
- Daily archives older than one week → weekly archive
- Weekly archives older than one month → monthly archive
```
$BACKUP_HOME/
├── current -> snapshots/202602211430
├── snapshots/
│   ├── 202602211400/
│   └── 202602211430/
└── archives/
    ├── daily/
    │   └── 20260220.tar.gz.gpg
    ├── weekly/
    │   └── 202601.WK_2.tar.gz.gpg
    └── monthly/
        └── 202512.tar.gz.gpg
```
Creating an archive from yesterday's snapshots:
```shell
if [ $(ls -d "$SNAPSHOT_DIR/$YESTERDAY"* 2>/dev/null | wc -l) != "0" ]; then
    tar -czf "$ARCHIVES_DIR/$YESTERDAY.tar.gz" \
        "$SNAPSHOT_DIR/$YESTERDAY"* \
        && rm -rf "$SNAPSHOT_DIR/$YESTERDAY"*
fi
```
The && before rm is not optional. Delete only if the archive succeeded.
Encryption with GPG
Anything leaving the machine gets encrypted. This part is unchanged from 2011, except for one addition: --batch --yes for non-interactive execution in cron.
```shell
gpg --batch --yes -r "$GPG_RECIPIENT" --encrypt-files archive.tar.gz
```
Without --batch --yes, GPG prompts for confirmation in some situations. A cron job can't answer prompts. It silently fails, and you find out later.
If you don't have an asymmetric key set up, symmetric encryption is fine for local/single-user backups:
```shell
gpg --symmetric --cipher-algo AES256 archive.tar.gz
```
To decrypt when you need it:
```shell
gpg --decrypt-files archive.tar.gz.gpg
```
Scheduling with cron
```shell
crontab -e
```
Every 30 minutes during work hours on weekdays:
```
*/30 8-18 * * 1-5 /path/to/backup.sh
```
The % character is special in crontab — it's treated as a newline. If your command contains % (common in date format strings), escape them as \%.
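For example, logging each run to a dated file (the log path is illustrative) needs every % in the date format escaped:

```
# Unescaped, cron would cut the line at the first % and feed the rest to stdin.
*/30 8-18 * * 1-5 /path/to/backup.sh >> /tmp/backup-$(date +\%Y\%m\%d).log 2>&1
```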
Testing
In 2011, I tested the script by running it and checking the output. That's how most scripts get "tested."
The current version has a POSIX shell test suite that covers the portability helpers, configuration loading, and end-to-end pipeline — including hard-link deduplication. It runs on both Ubuntu (GNU) and macOS (BSD) via GitHub Actions, plus ShellCheck static analysis.
This isn't because backup scripts need sophisticated tests. It's because scripts that run unattended at 2 AM, on a machine you're not watching, need to be trusted. Tests are how you build that trust before you need it.
Restoring
Snapshots are plain directories — just copy what you need:
```shell
cp ~/backups/current/Documents/report.txt ~/Documents/
```
From an encrypted archive:
```shell
gpg --decrypt-files ~/backups/archives/daily/20260220.tar.gz.gpg
tar -xzf ~/backups/archives/daily/20260220.tar.gz -C /tmp/restore/
```
Test this before you need it. A backup you've never restored is a backup you haven't verified.
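Verification doesn't even need to write plaintext to disk: decrypt to stdout and let tar list the archive. A self-contained round-trip sketch using symmetric encryption (throwaway passphrase, for the demo only):

```shell
#!/bin/sh
# Archive, encrypt, then verify by streaming the decryption into tar.
# A non-zero exit anywhere means the backup could not be restored.
set -eu
work=$(mktemp -d)
echo "hello" > "$work/file.txt"
tar -czf "$work/a.tar.gz" -C "$work" file.txt

gpg --batch --yes --pinentry-mode loopback --passphrase demo \
    --symmetric --cipher-algo AES256 "$work/a.tar.gz"

gpg --batch --quiet --pinentry-mode loopback --passphrase demo \
    --decrypt "$work/a.tar.gz.gpg" | tar -tzf - >/dev/null \
    && echo "archive OK"
rm -rf "$work"
```

The same pipe works with a recipient key: swap the passphrase options for -r "$GPG_RECIPIENT" on encryption, and drop them on decryption.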
Final Thoughts
The tools — rsync, tar, cron, gpg — are the same ones I used in 2011. They'll still be here in 2036. That's the thing about Unix fundamentals: they don't deprecate.
What changes is the craft around them. Portability. Error handling. Configuration. Testing. The difference between a script that works on your machine and a script you can trust.
The full source is at github.com/aadlani/backup-manager. Clone it, adapt it, make it yours.