Techniques for troubleshooting backupninja/rdiff-backup

This is a writeup of our internal process for troubleshooting rdiff-backup/backupninja errors. Please note that it's a work in progress!

Common reasons for failure: 

  • The destination server is down. File a ticket with MFPL if it's an MFPL server.
  • You're out of space on the destination disk. The backup notification will include "Err 28" and "Out of Disk Space".
  • You're out of space in the /tmp folder. Unfortunately, this gives the same error message as being out of space on the destination disk. Check that /tmp has at least 10% more free space than the size of the biggest file you're backing up with rdiff-backup.
  • A backup takes more than 24 hours to run. Backupninja, so far as I can tell, doesn't check whether the previous rdiff-backup instance is still running, and a second simultaneous rdiff-backup clobbers the original.
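The space and overlap checks above can be sketched as a small pre-flight script. This is only a rough sketch: the function names (free_kb, has_headroom, rdiff_running) and the 1 GiB example size are made up for illustration, and matching the process by the string "rdiff-backup" is an assumption that may need adjusting on your system.

```shell
#!/bin/sh
# Pre-flight sketch; function names and the example size are hypothetical.

# Print the free space, in KiB, on the filesystem holding the given path.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if free space (KiB) is at least the file size (KiB) plus 10%.
has_headroom() {
    free=$1
    biggest=$2
    [ "$free" -ge $(( biggest + biggest / 10 )) ]
}

# Succeed if some process with "rdiff-backup" in its command line is
# already running (assumption: this pattern fits how it appears locally).
rdiff_running() {
    pgrep -f rdiff-backup >/dev/null 2>&1
}

# Example: is there room in /tmp for a 1 GiB (1048576 KiB) file?
if has_headroom "$(free_kb /tmp)" 1048576; then
    echo "/tmp looks big enough"
else
    echo "/tmp is too small"
fi
```

Run the same free_kb check against the backup directory on the destination server to tell the two "Err 28" causes apart, and call rdiff_running before starting a manual backup so you don't collide with one that is still in progress.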

Some steps you can take:

  • Google the error message you get in the backupninja e-mail, of course.
  • Try using the "force" option.  Sometimes violence is the only language rdiff-backup understands. Try adding the following line as line 1 to /etc/backup.d/<whatever>.rdiff: options = --force 
  • Do a manual rdiff-backup, without backupninja. Rdiff-backup can give you some incredibly useful info that it doesn't normally pass to backupninja. To easily construct the correct rdiff-backup command, run "backupninja -tnd", meaning "do a test run, now, and show debug output". Backupninja will print the correct rdiff-backup command to run. If your version of backupninja doesn't accept the combined flags, pass them separately: "backupninja -t -n -d".
  • Remove the rdiff-backup-data folder from the destination server. This isn't a great plan, because it removes all backups except the latest, but it is sometimes necessary. A later version of this document will hopefully correlate specific error messages to when this is a good idea.
  • Remove the rdiff-backup-data folder and do an rsync. This is a pretty good alternative to simply removing the rdiff-backup-data folder - rsync is a good deal more robust than rdiff-backup, and can often fix problems faster than rdiff-backup alone. Given the similarities in their command line options, you can make the change without too much trouble. Below is an example: first the original rdiff-backup command, then the rsync command I changed it to. Most of the changes are at the beginning. The exceptions are the final --exclude '/*' in the rsync version and the destination path: rsync expects one colon, whereas rdiff-backup expects two!
  • /usr/bin/rdiff-backup --force --terminal-verbosity 5 --print-statistics --exclude '/home/share/On the Money/*' --exclude '/home/share/IT/common/*' --exclude '/home/share/Video Projects - Film Series/*' --exclude='/home/share/Mapping Data/*' --include '/var/spool/cron/crontabs' --include '/var/log' --include '/var/www' --include '/etc' --include '/root' --include '/home' / <username_redacted>@c.backup.mayfirst.org::backups/server2
  • rsync -e ssh -av --progress --exclude '/home/share/On the Money/*' --exclude '/home/share/Video Projects - Film Series/*' --exclude '/home/share/Mapping Data/*' --exclude '/etc/dnscache/*' --exclude '/home/share/IT/common/*' --include '/var/spool/cron/crontabs' --include '/var/log' --include '/var/www' --include '/etc' --include '/root' --include '/home' --exclude '/*' / <username_redacted>@c.backup.mayfirst.org:backups/server2
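When converting a command by hand, the destination path is the easiest part to get wrong, and it is worth seeing what rsync would copy before letting it loose. The sketch below is only an illustration: to_rsync_dest is a made-up helper, and the user, host and the single --include '/etc' are placeholders rather than our real settings. It rewrites the rdiff-backup style destination and prints a dry-run rsync command for you to review.

```shell
#!/bin/sh
# Sketch only: the host, user and paths below are hypothetical placeholders.

# Convert an rdiff-backup remote path (user@host::path) into the
# rsync-over-ssh form (user@host:path) by collapsing the first "::".
to_rsync_dest() {
    printf '%s\n' "$1" | sed 's/::/:/'
}

dest=$(to_rsync_dest 'backupuser@backup.example.org::backups/server2')

# Print (rather than run) a dry-run command for review. With -n
# (--dry-run) rsync lists what it would transfer without copying anything.
echo "rsync -e ssh -avn --include '/etc' --exclude '/*' / $dest"
```

The dry run is also a cheap way to check the include and exclude rules. Rsync does not descend into a directory that has been excluded, so an include such as '/var/log' followed by a final --exclude '/*' may never be reached unless '/var' is included as well. If the dry-run listing is missing something you expected, look there first.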
