On Tue, 9 Dec 2008, Keith Lofstrom wrote:
> One of my offsite VPN client machines was moved to a slower internet link. Meanwhile, in October it went through a major automated upgrade. As a result, the nightly rsync job took longer than 24 hours. The next night's backup started before the first completed, which attempted to move all those files again, and slowed the link down further. Cascading failures. I was incredibly busy, so I turned off dirvish to that machine rather than fix the problem.
I use Dirvish for >24h backups, too. I use Dirvish from the Debian packages, which include a cron job. I'm attaching the original Debian cron job and my modified one. My modifications center on creating a PID file in /var/lock/dirvish-cronjob; this lets me be sure that I match not just any rsync, but the actual dirvish cron job.
A problem with Keith's suggestion is that if any user at all is running rsync, then the dirvish cron job will fail to start.
I hereby permit distribution of this modification under the same terms as Dirvish itself.
My way also has the cron job output something indicating why it did not run overnight, so that you get a nightly email knowing what happened.
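One caveat with a check-then-write PID file is the small window between testing the file and writing it, in which two jobs could both decide to run. Here is a minimal sketch of one way to narrow that window using the shell's noclobber option. It is not part of Dirvish or the Debian package; LOCKFILE, try_lock and unlock are hypothetical names chosen for illustration, and the lock path is a placeholder rather than /var/lock/dirvish-cronjob.

```shell
#!/bin/sh
# Sketch only: LOCKFILE, try_lock and unlock are hypothetical names,
# not part of Dirvish or the Debian cron job.
LOCKFILE="${LOCKFILE:-/tmp/dirvish-cronjob-sketch.lock}"

# Return 0 if we took the lock, non-zero if a live process holds it.
try_lock() {
    # With noclobber (set -C), ">" refuses to overwrite an existing
    # file, so testing for the lock and taking it are a single step.
    if ( set -C; echo "$$" > "$LOCKFILE" ) 2>/dev/null; then
        return 0
    fi
    other_pid=$(cat "$LOCKFILE" 2>/dev/null)
    # kill -0 sends no signal; it only checks that the PID exists.
    # (Assumes we may signal that PID, e.g. the job runs as root.)
    if [ -n "$other_pid" ] && kill -0 "$other_pid" 2>/dev/null; then
        return 1
    fi
    # Stale lock: the recorded process is gone. Remove it and retry
    # once. This cleanup step is itself not race-free.
    rm -f "$LOCKFILE"
    ( set -C; echo "$$" > "$LOCKFILE" ) 2>/dev/null
}

unlock() {
    rm -f "$LOCKFILE"
}

# Intended use inside a cron job (left commented out in this sketch):
#   try_lock || { echo "Cron job currently running; not starting."; exit 1; }
#   ...run the backup...
#   unlock
```

Like the PID-file approach in the attached job, this only checks that some process with the recorded PID is alive, so a recycled PID could still look like a running backup.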
-- Asheesh.

-- 
Truthful, adj.:
	Dumb and illiterate.
		-- Ambrose Bierce, "The Devil's Dictionary"
#! /bin/sh
#
# daily cron job for the dirvish package
#
if [ ! -x /usr/sbin/dirvish-expire ]; then exit 0; fi
if [ ! -s /etc/dirvish/master.conf ]; then exit 0; fi

mount_check() {
	mntout=`tempfile -p mount`
	mount $1 >$mntout 2>&1
	if [ ! -d $1/lost+found ]; then # only works for "real" filesystems :-)
					# (Yes, I know about reiserfs.)
		echo "'mount $1' failed?! Stopping."
		echo "mount output:"
		cat $mntout
		rm -f $mntout
		exit 2
	fi
	if stat $1 | grep 'Inode: 2[^0-9]' >/dev/null; then # ditto
		rm -f $mntout
		return 0 # ok
	fi
	echo "$1 isn't inode 2 ?! Mount must have failed; stopping."
	echo ''
	stat $1
	echo "mount output:"
	cat $mntout
	rm -f $mntout
	umount $1
	exit 2
}

## Example of how to mount and umount a backup partition...
# mount_check /backup

/usr/sbin/dirvish-expire --quiet && /usr/sbin/dirvish-runall --quiet
rc=$?

# umount /backup || rc=$?

exit $rc
#! /bin/sh
#
# daily cron job for the dirvish package
#
if [ ! -x /usr/sbin/dirvish-expire ]; then exit 0; fi
if [ ! -s /etc/dirvish/master.conf ]; then exit 0; fi

mount_check() {
	mntout=`tempfile -p mount`
	mount $1 >$mntout 2>&1
	if [ ! -d $1/lost+found ]; then # only works for "real" filesystems :-)
					# (Yes, I know about reiserfs.)
		echo "'mount $1' failed?! Stopping."
		echo "mount output:"
		cat $mntout
		rm -f $mntout
		exit 2
	fi
	if stat $1 | grep 'Inode: 2[^0-9]' >/dev/null; then # ditto
		rm -f $mntout
		return 0 # ok
	fi
	echo "$1 isn't inode 2 ?! Mount must have failed; stopping."
	echo ''
	stat $1
	echo "mount output:"
	cat $mntout
	rm -f $mntout
	umount $1
	exit 2
}

## Example of how to mount and umount a backup partition...
# mount_check /backup

## Asheesh's locking addition
fail() {
	echo "Cron job currently running; I'm outta here."
	exit 1
}

die_if_dirvish_locked() {
	OTHER_PID=$(cat /var/lock/dirvish-cronjob 2>/dev/null)
	# if the PID file exists and that PID is still alive, bail out.
	# Redirect stdout first, then stderr, so ps prints nothing at all.
	[ -f /var/lock/dirvish-cronjob ] && ps "$OTHER_PID" >/dev/null 2>&1 && fail
}

lock_dirvish() {
	MY_PID=$$
	echo "$MY_PID" > /var/lock/dirvish-cronjob
}

unlock_dirvish() {
	rm -f /var/lock/dirvish-cronjob
}

die_if_dirvish_locked
lock_dirvish

/usr/sbin/dirvish-expire --quiet && /usr/sbin/dirvish-runall --quiet
rc=$?

# umount /backup || rc=$?

unlock_dirvish

exit $rc