Using rdfind To Deduplicate Obsidian Dropbox Backups

EDIT 2022-03-01: The following still works, but I can’t recommend using the aut-o-backups plugin. Its lack of features and configurability makes it unusable for real-world use cases.

This might be of interest to anyone who uses the aut-o-backups plugin to automatically back up their Obsidian vault to Dropbox. The plugin intentionally leaves out much of the complexity normally involved in handling backups, such as pruning (keeping only the last x backups) and deduplication. This post is about the latter.

The tool rdfind can be used to deduplicate the backups and, in doing so, save a lot of storage (especially if your vault includes binary files like images).

Dropbox is able to follow symlinks (or soft links) so it’s possible to deduplicate the created backups by converting duplicate files into symlinks pointing to only one file.

The following has been tested and is used on a Mac.

(Please make a backup of your backups before attempting any of this!)

  1. Open the terminal
  2. Install rdfind with Homebrew if you haven’t done so already: brew install rdfind
  3. Navigate to your backups folder: cd ~/Dropbox/Apps/Obsidian\ Backups
  4. Run rdfind: rdfind -makesymlinks true .
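
Steps 3 and 4 can also be wrapped in a small function that previews the result before changing anything. This is only a minimal sketch: the function name dedupe_backups and its preview/apply argument are made up for illustration, and the folder is the one from step 3. It relies on rdfind’s -dryrun option, which reports what would be done without deleting or linking any files.

```shell
#!/bin/sh
# Hypothetical helper (not part of rdfind or the plugin).
# Usage: dedupe_backups <backup-folder> [preview|apply]
dedupe_backups() {
    dir="$1"
    mode="${2:-preview}"   # default to the safe preview mode

    if [ ! -d "$dir" ]; then
        echo "backup folder not found: $dir" >&2
        return 1
    fi
    if ! command -v rdfind >/dev/null 2>&1; then
        echo "rdfind is not installed (try: brew install rdfind)" >&2
        return 2
    fi

    if [ "$mode" = "apply" ]; then
        # Replace duplicate files with symlinks, as in step 4.
        (cd "$dir" && rdfind -makesymlinks true .)
    else
        # Only report what would be done; no files are changed.
        (cd "$dir" && rdfind -dryrun true -makesymlinks true .)
    fi
}
```

Running dedupe_backups "$HOME/Dropbox/Apps/Obsidian Backups" would show the preview; adding apply as a second argument would create the symlinks.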

You will get output looking similar to this:

rdfind -makesymlinks true .
Now scanning ".", found 2475 files.
Now have 2475 files in total.
Removed 0 files due to nonunique device and inode.
Total size is 467948959 bytes or 446 MiB
Removed 13 files due to unique sizes from list. 2462 files left.
Now eliminating candidates based on first bytes: removed 10 files from list. 2452 files left.
Now eliminating candidates based on last bytes: removed 0 files from list. 2452 files left.
Now eliminating candidates based on sha1 checksum: removed 2 files from list. 2450 files left.
It seems like you have 2450 files that are not unique
Totally, 223 MiB can be reduced.
Now making results file results.txt
Now making symbolic links. creating 
Making 1406 links.
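
To check the result after a run, you can count the symlinks in the backups folder and make sure none of them is broken. The two helpers below are a sketch using plain find; their names (count_symlinks, count_broken_symlinks) are made up for this post and are not part of rdfind.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a folder after rdfind has run.

# Print the number of symbolic links below the given folder.
count_symlinks() {
    find "$1" -type l | wc -l | tr -d ' '
}

# Print the number of symlinks whose target no longer exists.
# "test -e" follows the link, so it fails for dangling links.
count_broken_symlinks() {
    find "$1" -type l ! -exec test -e {} \; -print | wc -l | tr -d ' '
}
```

After a successful run, count_symlinks . inside the backups folder should roughly match the number of links rdfind reported, and count_broken_symlinks . should print 0.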

This can be run at regular intervals (once an hour or so) to keep your backups deduplicated. I use Keyboard Maestro for this.
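
If you don’t have Keyboard Maestro, a scheduler such as cron could do the same job. The following is a rough sketch under assumptions, not my actual setup: the function name run_dedupe, the script path in the crontab comment and the log file location are all hypothetical. Keep in mind that cron usually runs with a minimal PATH, so you may need to call rdfind by its full path.

```shell
#!/bin/sh
# Hypothetical wrapper meant to be called by a scheduler.
# Example crontab entry (an assumption, adjust the path), once an hour:
#   0 * * * * /bin/sh "$HOME/bin/dedupe-obsidian-backups.sh"

# Usage: run_dedupe <backup-folder> <log-file>
run_dedupe() {
    dir="$1"
    log="$2"
    (
        # Timestamp each run so the log shows when it happened.
        echo "$(date '+%Y-%m-%d %H:%M:%S') deduplicating $dir"
        # Stop here if the folder is missing (e.g. Dropbox not set up).
        cd "$dir" || exit 1
        rdfind -makesymlinks true .
    ) >>"$log" 2>&1
}
```

The script would end with a call such as run_dedupe "$HOME/Dropbox/Apps/Obsidian Backups" "$HOME/Library/Logs/rdfind-obsidian.log", so that each run appends rdfind’s output to the log.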
