Incremental madness

Back in the good old days (!) when backups implied some form of magnetic tape, we used to do a full backup once a week, then an incremental backup on a daily basis of all changes since the last backup, be it full or incremental. Hence in the event a restore was required, a full backup would be restored, then each incremental backup since the full backup would be restored on top to reach the most recent restore point.

These days hard drives are cheap, so we tend either to blat out a complete disk image or to back up known valuable data, which again generally means an entire database at a time.

Here's what I consider to be a relatively neat trick, something that could be thought of as a reverse incremental backup procedure. The idea is that we maintain a folder called current, which is the most recent full copy of a target system, alongside date/time stamped folders that contain all the files found to have changed at each point in time.

HOST="(remote host name)"
APATH="/(local backup path)/${HOST}"
mkdir -p "${APATH}/current"
FQDN="${HOST}.(domain name)"
BACK="${APATH}/`date "+%A_%d_%B_%Y_%H"`"
EXCL="--exclude=/proc --exclude=/sys --exclude=/tmp"
O1="--force --ignore-errors --delete-excluded ${EXCL}"
O2="--delete --backup --backup-dir=${BACK}"
O3=""
export PATH=$PATH:/bin:/usr/bin:/usr/local/bin
nice -n19 rsync -a ${O1} ${O2} ${O3} root@${FQDN}:/* "${APATH}/current"

So the requirements here are an SSH connection to the remote system you want to back up, and a copy of rsync on both that system and your local one.

The net result is a local folder called current, plus a collection of stamped folders containing older copies of files. A full restore would come from current, while lost or historical files you may need to recover would be in the stamped folders. For example:

$ sudo du -sh `ls --sort=time`
58M   Thursday_24_September_2015_03
131M  Wednesday_23_September_2015_03
84M   Tuesday_22_September_2015_03
147M  Monday_21_September_2015_03
211M  Sunday_20_September_2015_03
4.0K  Saturday_19_September_2015_12
126M  Thursday_17_September_2015_03
35M   Wednesday_16_September_2015_09
82M   Tuesday_15_September_2015_15
2.8G  current
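Restoring an individual historical file is then just a plain copy out of the relevant stamped folder, since each one mirrors the remote filesystem layout. A hypothetical sketch, using an invented backup tree built under /tmp purely to demonstrate the shape of things:

```shell
# Fake a tiny backup tree (all paths invented for illustration).
mkdir -p /tmp/demo/server/Monday_21_September_2015_03/etc
echo "10.0.0.1 server" > /tmp/demo/server/Monday_21_September_2015_03/etc/hosts

# Recovering the old /etc/hosts is just a copy from the stamped folder.
cp /tmp/demo/server/Monday_21_September_2015_03/etc/hosts /tmp/demo/hosts.recovered
cat /tmp/demo/hosts.recovered
```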

As you can see, on a real system this is far more space-efficient than repeated full images; it also takes far less time, and it can be run remotely over a modest broadband connection.

Note ..

(1) In the above code, you would need to set remote host name to a reference to the server to back up (say "server"), local backup path to the local path where you want to store the backups (say /vols/backups), and domain name to the domain on which your host sits (say ""). So in this example, "" would need to resolve to the IP address of the server you want to back up.

(2) You may like to experiment with O3. For example, add --sparse --stats if you're going to run this from CRON, or, if you're going to run it in the foreground, try adding --info=progress2 to get an interactive read on the overall progress of the backup.
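The two suggestions above would look like this when set before the rsync line (the values are the examples from the note, not the only options):

```shell
# Pick one before running the rsync command:
O3="--sparse --stats"       # unattended CRON runs: handle sparse files, print a summary
#O3="--info=progress2"      # foreground runs: live progress across the whole transfer
echo "O3 is: ${O3}"
```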

(3) You can easily add minutes to the timestamp if you want to do backups on an hourly (or more frequent) basis.
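For instance, appending %M to the date format gives per-minute folder names, so two runs in the same hour land in separate stamped folders:

```shell
# Same stamp as before, with minutes appended for hourly-or-better runs.
BACK=`date "+%A_%d_%B_%Y_%H%M"`
echo "${BACK}"
```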

If you want to clean up older incremental backup folders (which, unless your name is Google, you probably will), all you need to do is add something like this to CRON to fire every day:

find (path) -maxdepth 2 -ctime +(days) -name "*_*" -exec rm -rfv {} \;

Just set days to the number of days of retention you need, and remove the v from -rfv if you don't want a verbose listing of all the files that get expired. Don't forget to set MAILTO in CRON if you want an email record of both backups and expiries, and note that path needs to be the same as local backup path above.
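Putting the pieces together, a crontab might look something like this; the script path, email address, backup path, and 30-day retention are all invented for illustration:

```shell
# Hypothetical crontab: mail all output, back up nightly, expire daily.
MAILTO=you@example.com
30 3 * * * /usr/local/bin/pull-backup.sh
0  6 * * * find /vols/backups -maxdepth 2 -ctime +30 -name "*_*" -exec rm -rf {} \;
```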