rdiff-snapshot-fs

I use rdiff-backup for backups. It creates a mirror of your data, just like plain old rsync, but also stores a series of reverse diffs that let you restore historical versions of each file. For the most part it works well, but the interface for browsing and restoring from backups is somewhat clunky:

$ # List all snapshots
$ rdiff-backup -l /backup
Found 3 increments:
    increments.2010-06-01T23:00:00-07:00.dir   Tue Jun  1 23:00:00 2010
    increments.2010-06-02T23:00:00-07:00.dir   Wed Jun  2 23:00:00 2010
    increments.2010-06-03T23:00:00-07:00.dir   Thu Jun  3 23:00:00 2010
Current mirror: Fri Jun  4 23:00:00 2010
$ # List contents of a snapshot
$ # (note: lists the specified dir recursively)
$ rdiff-backup --list-at-time 2010-06-01T23:00:00 /backup/foo/bar
foo/bar/one
foo/bar/two
foo/bar/more/three
foo/bar/more/four
$ # Restore a file from backup
$ rdiff-backup -r 2010-06-01T23:00:00 /backup/foo/bar/one ~/one.restore

So I wrote rdiff-snapshot-fs, an alternative to rdiff-backup's restore interface. It's a virtual filesystem (powered by FUSE) that lets you browse all your snapshots and their contents. Now you can examine each snapshot as if it were backed by a real mirror:

$ ./rdiff-snapshot-fs.py /backup /view
$ ls /view
2010-06-01T23:00:00-07:00
2010-06-02T23:00:00-07:00
2010-06-03T23:00:00-07:00
$ ls /view/2010-06-01T23:00:00-07:00/foo/bar
one
two
more
$ cp /view/2010-06-01T23:00:00-07:00/foo/bar/one ~/one.restore
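
To make the mapping concrete, here is a minimal sketch of how the top-level directory names in /view could be derived from the increment names that rdiff-backup -l prints above. This is an illustration, not the actual code in rdiff-snapshot-fs: the function name snapshot_names and the regular expression are made up for this example, and it assumes all timestamps share one UTC offset so that sorting them as strings is also chronological.

```python
import re

# Hypothetical pattern: matches names like
# "increments.2010-06-01T23:00:00-07:00.dir" and captures the timestamp.
INCREMENT_RE = re.compile(r"^increments\.(?P<stamp>.+)\.dir$")


def snapshot_names(increment_files):
    """Return one directory name per increment, oldest first.

    Names that do not look like top-level increments are ignored.
    Assumes every timestamp uses the same UTC offset, so plain string
    sorting matches chronological order.
    """
    names = []
    for name in increment_files:
        match = INCREMENT_RE.match(name)
        if match:
            names.append(match.group("stamp"))
    return sorted(names)
```

Fed the three increment names from the listing above, this returns the same three timestamps that show up as directories under /view.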

Some background: another tool, archfs, provides similar functionality, but it has performance issues, and I never got it to load my backup repository successfully (granted, my homedir backup repo has 400 snapshots and a current mirror size of 450 GB). I concur with Jon Dowland's assessment that the problem stems from archfs trying to reconstruct historical snapshot data up front, before it's needed. (Many problems in software, as well as in life, are really problems caused by being insufficiently lazy!)
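
Here is a toy sketch of what "lazy" means in this context. It is not how rdiff-snapshot-fs or archfs is actually implemented. The data layout (MIRROR, INCREMENTS), the function read_at, and the calls list are all invented for illustration, and whole-file contents stand in for the binary reverse diffs that rdiff-backup really stores. The point is only that a file's historical contents are computed when somebody reads that file, and then cached, rather than being computed for every file in every snapshot at mount time.

```python
from functools import lru_cache

# Toy repository (hypothetical layout): the current mirror plus reverse
# increments, newest first. Each increment maps a path to the content the
# file had at that snapshot time, recorded only if the file changed later.
# Real rdiff-backup stores binary deltas, not whole files.
MIRROR = {"foo/bar/one": b"v3", "foo/bar/two": b"same"}
INCREMENTS = [
    ("2010-06-03T23:00:00", {"foo/bar/one": b"v2"}),
    ("2010-06-02T23:00:00", {}),
    ("2010-06-01T23:00:00", {"foo/bar/one": b"v1"}),
]

# Records which (snapshot, path) pairs were actually reconstructed.
calls = []


@lru_cache(maxsize=None)
def read_at(snapshot, path):
    """Reconstruct one file as of one snapshot, on first access only."""
    calls.append((snapshot, path))
    content = MIRROR.get(path)
    # Walk back from the mirror, applying every increment that is at or
    # after the requested snapshot time. Timestamps share one format, so
    # comparing them as strings is chronological.
    for stamp, changed in INCREMENTS:
        if stamp < snapshot:
            break
        if path in changed:
            content = changed[path]
    return content
```

Nothing is reconstructed until read_at is called, and a second read of the same file at the same snapshot is served from the cache. With 400 snapshots, an eager approach would do this work for every file in every snapshot before the first ls returns.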

More info about rdiff-snapshot-fs, including how to obtain the source, can be found at http://web.psung.name/rdiff-snapshot-fs.

(By the way, FUSE is really interesting to play with and FUSE-Python makes it a snap to get started. More remarks about this soon, possibly.)

Disclaimer: This is something I cooked up in my spare time. rdiff-snapshot-fs is liable to eat your data and scare your pets. The file sizes and modes that it reports are known to be unreliable under certain circumstances (though the file data itself is generated correctly, as far as I can tell). But if you know no fear, you may find it somewhat more useful than the standard rdiff-backup interface.