Vlad Fedorkov

Performance consulting for MySQL and Sphinx

How to avoid two backups running at the same time

When your backup script runs for too long, the next scheduled backup may start while the previous one is still running. This increases pressure on the database, slows the server down, can start a chain of overlapping backup processes, and in some cases may break backup integrity.

The simplest solution is to avoid this undesired situation by adding locking to your backup script, so the script refuses to start a second time while it is already running.

Here is a working sample. You will need to replace the "sleep 10" line with your actual backup call:

#!/bin/bash

LOCK_NAME="/tmp/my.lock"
if [[ -e "$LOCK_NAME" ]] ; then
        echo "re-entry, exiting"
        exit 1
fi

### Placing lock file
touch "$LOCK_NAME"
echo -n "Started..."

### Performing required work
sleep 10

### Removing lock
rm -f "$LOCK_NAME"

echo "Done."

It works most of the time. The problem is that two scripts started at nearly the same moment could still, in theory, both pass the lock file check before either one creates the lock, and end up running together. To avoid that, you need to place a unique lock file first, and only then check that no other process has done the same.

Here is the improved version:

#!/bin/bash

UNIQSTR=$$
LOCK_PREFIX="/tmp/my.lock."
LOCK_NAME="$LOCK_PREFIX$UNIQSTR"

### Placing our own unique lock file first
touch "$LOCK_NAME"

### Proceeding only if ours is the single lock file present
if [[ -e "$LOCK_NAME" && $(ls "$LOCK_PREFIX"* 2>/dev/null | wc -l) -eq 1 ]] ; then
        echo -n "Started..."
        ### Performing required work
        sleep 10
        ### Removing lock
        rm -f "$LOCK_NAME"
        echo "Done."
else
        ### Another process is running, removing our lock
        echo "re-entry, exiting"
        rm -f "$LOCK_NAME"
        exit 1
fi

Now even if you manage to run two scripts at the same time, only one of them can actually start the backup. In a very rare situation both scripts will refuse to start (because two lock files exist at the same time), but you can catch this by simply monitoring the script's exit code. In any case, as soon as you receive a backup exit code other than zero, it is time to review your backup setup and make sure it works as desired.
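For example, a crontab entry along these lines can alert you on a non-zero exit code (the script path and email address below are placeholders, not part of the original script):

### Run the wrapper nightly; send a mail if it exits with a non-zero code
30 2 * * * /usr/local/bin/backup-wrapper.sh || echo "backup wrapper failed" | mail -s "backup alert" admin@example.com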

Please note: if you terminate this script manually, you will also need to remove the lock file so the script can pass the check on the next startup. You could also use this wrapper for any periodic task you have, like Sphinx indexing, merging, or index consistency checks.
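If you want the lock removed automatically even when the script is terminated, a trap can handle the cleanup. Here is a minimal sketch based on the first script ("sleep 10" again stands in for the real backup call; note that kill -9 cannot be trapped, so the lock would still be left behind in that case):

#!/bin/bash

LOCK_NAME="/tmp/my.lock"
if [[ -e "$LOCK_NAME" ]] ; then
        echo "re-entry, exiting"
        exit 1
fi

touch "$LOCK_NAME"

### Remove the lock on exit; turn INT/TERM into a normal exit
### so the EXIT trap fires
trap 'rm -f "$LOCK_NAME"' EXIT
trap 'exit 1' INT TERM

### Performing required work
sleep 10
echo "Done."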

For your convenience, this script is available for download directly or via wget:

wget http://astellar.com/downloads/backup-wrapper.sh

You can also find more about MySQL backup solutions here.

Keep your data safe and have a nice day!

Category: Guide, Operations
  • Sergei says:

    A simpler solution is to use mkdir instead of touch in the first script. Or ln -s /dev/null $LOCK_NAME. Or any other command that fails if the destination exists.

    October 16, 2012 at 12:23 pm
    • vlad says:

Indeed, directory-based locking seems more reliable, thanks for the advice!

      October 18, 2012 at 7:31 am
  • ketan patel says:

Why can’t you just use flock?

    http://linux.die.net/man/2/flock

    October 16, 2012 at 1:48 pm
    • vlad says:

Using flock, or even a mutex inside C/C++ code, is generally a better idea. A bash script is just a more convenient way to handle periodic tasks like backups, MySQL maintenance, log rotation, Sphinx indexing, etc. run by the cron daemon.
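
      For the record, the flock(1) utility also combines nicely with a plain bash wrapper. A minimal sketch ("sleep 10" is the usual backup placeholder):

      (
              flock -n 9 || { echo "re-entry, exiting"; exit 1; }
              sleep 10   ### backup call placeholder
      ) 9>/tmp/my.lock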

      October 18, 2012 at 7:30 am
  • Uli Stärk says:

    I think this is not a good solution, because a touch is not atomic and can lead to errors.

    You'd better use a perl/php/python/… script calling flock with LOCK_EX to get an exclusive lock on a file. It's even better to get a MySQL lock (GET_LOCK), because you could theoretically run the job from two distinct hosts :)

    October 16, 2012 at 1:53 pm
  • Rob Smith says:

    You really should be using the lock style that can be found at http://www.davidpashley.com/articles/writing-robust-shell-scripts.html under Race conditions:

    “It’s worth pointing out that there is a slight race condition in the above lock example between the time we test for the lockfile and the time we create it. A possible solution to this is to use IO redirection and bash’s noclobber mode, which won’t redirect to an existing file.”

    It also shows how to use traps to catch signals and remove the lock file after the script gets killed/termed/etc., which is important so backup scripts can clean up after themselves when possible.

    October 16, 2012 at 5:19 pm
    • vlad says:

      Rob, thanks for the link, it's a great guide!
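
      For readers, here is a minimal sketch of the noclobber pattern that article describes ("sleep 10" is again just a placeholder for the real work):

      LOCK_NAME="/tmp/my.lock"
      if ( set -o noclobber; echo $$ > "$LOCK_NAME" ) 2>/dev/null ; then
              trap 'rm -f "$LOCK_NAME"' EXIT
              sleep 10   ### backup call placeholder
      else
              echo "re-entry, exiting"
              exit 1
      fi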

      October 18, 2012 at 7:20 am
  • mike says:

    Seems like this could lead to a race condition; you might want to use set -o noclobber, or instead use mktemp -d, since mkdir is atomic. Another common approach is to 'kill -0' the pid to verify the other job did not fail and neglect to clean up its lock file with a trap. (kill -9 is still a potential pitfall with traps.)

    Look here for some ideas:

    http://lists.baseurl.org/pipermail/yum-devel/2011-August/008547.html
    http://wiki.bash-hackers.org/howto/mutex

    October 16, 2012 at 6:23 pm
    • vlad says:

      Indeed, directory-based locking seems a better idea! Thank you for the guides! I've also replied about race conditions below.

      October 18, 2012 at 7:24 am
  • vlad says:

    You are all absolutely right about the possible race conditions and drawbacks. The file system is an additional, relatively slow layer; locking behavior varies depending on the FS type and may not be atomic or thread safe. So if we are talking about preventing race conditions in a parallel execution environment, I would consider a much faster and more reliable in-memory mutex inside C/C++/Java/Python/etc. code (as mentioned by Ketan and Uli) instead of file-based locking.

    At the same time, backup scripts and other periodic tasks are mostly started by a cron job once in a while and are unlikely to cause a race condition in the first place. In this case, having unique lock names with the process id attached is a convenient way to implement external process monitoring.
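
    As a follow-up, here is a minimal sketch of the directory-based lock Sergei suggested (mkdir is atomic, so the check and the lock creation happen in one step; "sleep 10" is the usual placeholder):

    LOCK_DIR="/tmp/my.lock"
    if mkdir "$LOCK_DIR" 2>/dev/null ; then
            sleep 10   ### backup call placeholder
            rmdir "$LOCK_DIR"
    else
            echo "re-entry, exiting"
            exit 1
    fi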

    October 18, 2012 at 7:17 am
