Thursday, May 3, 2007

Upgrading from "etch" to "lenny": Upgrading Bacula

As I said previously, the upgrade process to Debian "lenny" was painless enough, except for some problems with Bacula. I read the Bacula 2.0 release notes before upgrading, and I suspected problems in three areas:
  1. Storage device configuration: I use an external hard disk, connected via USB - and version 2.0 includes some improvements regarding such devices.
  2. Database: the database format has changed (I use the sqlite3 package), and it's necessary to convert the database to the new format, using a migration script.
  3. Scripts: I've configured several scripts to be run by the Bacula Director Daemon and the Bacula File Daemon, which perform some chores before and after the backup process. The scripting facility has been significantly overhauled in Bacula 2.0. The changes include modifications to the configuration file syntax, but the previous syntax is still available (e.g. the RunBeforeJob directive is now implemented as a shortcut for a predefined RunScript block).
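
To illustrate the shortcut mentioned in item 3, here is a rough sketch of the two spellings as I understand them from the Bacula 2.0 documentation. The job name and script path are made up for the example, and the other required Job directives are left out:

```
# Old-style syntax, still accepted in 2.0
Job {
  Name = "ExampleJob"                              # hypothetical name
  RunBeforeJob = "/etc/bacula/scripts/pre-backup"  # hypothetical path
  # ... other Job directives omitted
}

# Roughly equivalent new-style syntax
Job {
  Name = "ExampleJob"
  RunScript {
    RunsWhen = Before
    RunsOnClient = No     # run by the Director, not by the File Daemon
    Command = "/etc/bacula/scripts/pre-backup"
  }
  # ... other Job directives omitted
}
```

ClientRunBeforeJob should map to the same kind of block, only with RunsOnClient = Yes.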
I half hoped that everything would just work, but I did not kid myself.
And sure enough - sh*t happened:
  1. I had no problem with the storage device - its mounting is handled by an external script.
  2. The database conversion script was run automatically during the upgrade process, and did something really bad to my database - it contained no data after the upgrade!
    I intended to clear it anyway, because I wanted to split the backup pool in two - one pool for full backups and one pool for incremental/differential backups. But if this hadn't been my intent, I would've been left with a real problem.
    I did not investigate this any further. YMMV.
  3. The improved scripting facility caused an interesting problem:
    Background: the Windows backup job inherited its properties from a default job common to all backup jobs. One of these is a ClientRunBeforeJob directive which cannot be used on a Windows machine, so the Windows backup job overrode it with its own ClientRunBeforeJob directive.
    Problem: the new RunScript facility allows several scripts to be specified, each with its own set of properties, so the ClientRunBeforeJob directive in the Windows backup job specification no longer replaced the one inherited from the default job, but rather added a second script next to it. It so happens that the Windows script was run first (on the Windows machine), and then the File Daemon tried to run the default job's client script - this caused an error and the backup process died.
    Solution: I split the default job - one default job for Linux and one for Windows.
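
For the record, the split looks something like the sketch below. This is not my actual configuration: all resource names, client names and script paths are invented for the example, and the directives that have nothing to do with the problem are omitted. The point is only that each JobDefs resource carries a single client-side script, so the Windows job no longer inherits the Linux one:

```
JobDefs {
  Name = "DefaultLinuxJob"                                   # hypothetical name
  Type = Backup
  ClientRunBeforeJob = "/etc/bacula/scripts/client-before"   # hypothetical path
  # ... FileSet, Storage, Pool, Messages etc. omitted
}

JobDefs {
  Name = "DefaultWindowsJob"                                 # hypothetical name
  Type = Backup
  ClientRunBeforeJob = "C:/bacula/scripts/client-before.bat" # hypothetical path
  # ... FileSet, Storage, Pool, Messages etc. omitted
}

Job {
  Name = "linux-box-backup"     # hypothetical name
  JobDefs = "DefaultLinuxJob"
  Client = linux-box-fd         # hypothetical client
}

Job {
  Name = "windows-box-backup"   # hypothetical name
  JobDefs = "DefaultWindowsJob"
  Client = windows-box-fd       # hypothetical client
}
```

The settings common to both platforms end up duplicated in the two JobDefs resources, but neither job has to work around an inherited script any more.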
I kept the old backup for a week or so before I decided that the new setup works. It's not as if I had any option (I had no intention of going back to version 1.38), but it felt somehow more appropriate to wait.
