Friday, December 31, 2010

BaculaFS v0.1.7: Batch Mode

Last night I released BaculaFS v0.1.7. I've added two main features:
  1. batch mode: in this mode BaculaFS acts as a frontend to the bextract utility, launching it to extract the files specified by BaculaFS' cache prefetch options to the path pointed to by the mount point parameter
  2. cache prefetch from a list of files
These two features were designed to allow operations like the following:
  1. incremental update of a snapshot on a mounted storage device, in a single command:
    baculafs -o batch_extract,prefetch_diff=/path/to/snapshot,cleanup -o client=client-fd,fileset=client-fileset /path/to/snapshot/
  2. if the destination snapshot is on a remote file system, you can either mount it as a local filesystem (e.g. with sshfs) and then use -o prefetch_diff to prefetch the modified files before copying them with rsync, or generate the file prefetch list with rsync like this:
    # mount a view of the current bacula backup
    baculafs -o client=client-fd,fileset=client-fileset,prefetch_symlinks /path/to/first/mount/point
    # mount another view, this time with prefetch list generated by rsync
    rsync -in --out-format='/%n' -a /path/to/first/mount/point/ /url/or/path/of/remote/snapshot/ | baculafs -o prefetch_list=-,client=client-fd,fileset=client-fileset /path/to/second/mount/point
    # and now copy files for real
    rsync -a /path/to/second/mount/point/ /url/or/path/of/remote/snapshot/
    fusermount -u /path/to/first/mount/point
    fusermount -u /path/to/second/mount/point
Happy New Year!

Friday, December 10, 2010

SSH and DBUS Sessions

Applications that need access to the current D-BUS session bus require special attention when launched from within an SSH session.

Sometimes it's enough to just set the DISPLAY environment variable to the appropriate X display number (e.g. localhost:0.0, or localhost:10.0 - the default when forwarding X connections over SSH), but usually that alone won't do.

These applications need to know the so-called D-BUS session bus address. This may be scraped from the environment of applications that are already running, and do have access to the session bus, as the value of DBUS_SESSION_BUS_ADDRESS, like this:
export $(strings /proc/*/environ| grep DBUS_SESSION | tail -1)
But there may be several possible values when more than one X session is used, and you'll need to select the right one, maybe by also matching the value of DISPLAY.
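Here's a rough sketch of that selection, matching on the current DISPLAY (it assumes the environ files of the relevant processes are readable):
for e in /proc/*/environ ; do
    if strings "$e" 2>/dev/null | grep -q "^DISPLAY=$DISPLAY$" ; then
        export $(strings "$e" | grep ^DBUS_SESSION_BUS_ADDRESS=)
        break
    fi
done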

There is, however, a somewhat cleaner way to do this. The D-BUS environment variables may be set by running one of the machine generated files under ~/.dbus/session-bus - the files there all have names like 479864458729b195d5497c4bb663c100-10, where the string of hexadecimal digits before the dash is the machine UUID and the number after the dash is the X display number.

Here's how I do it in my ~/.bashrc:
session="$HOME/.dbus/session-bus/$(dbus-uuidgen --get)-$(echo $DISPLAY | sed -e 's/\([^:]*:\)//g' -e 's/\..*$//g')"
if [ -e "$session" ] ; then
    source "$session"
fi

Friday, November 5, 2010

Update No-IP.com and DynDNS.com Dynamic DNS with ddclient

I have registered several host names with two dynamic DNS service providers: DynDNS.com and No-IP.com, and I've been using ddclient and noip2 to update my dynamic IP address with these services, respectively.

I've recently installed a wireless router, so I poked at ddclient, trying to figure out how to use it when the computer running it sits behind a router, so that its IP address is an internal address (10.x.x.x) rather than the external IP address that needs to be sent over to the dynamic DNS service.

This turns out to be pretty easy: just add use=web to /etc/ddclient.conf to have ddclient discover the external IP via the DynDNS.com IP checking service.

Along the way I've discovered that ddclient also supports No-IP.com - a fact that's only mentioned in the usage message that's displayed when running
ddclient --help
and after some futzing around I've come up with the following configuration file, in order to update both services with the same client, using their respective IP checking services:
use=web
web=http://ip1.dynupdate.no-ip.com/
protocol=noip, login=username, password='password' group_or_comma_separated_host_list
protocol=dyndns2, use=web, web=dyndns, server=members.dyndns.org, login=username, password='password' hostname
(according to the help message, it should've been possible to specify the first two lines inside the third line, but this doesn't work for some reason - I guess it's a bug).

Friday, October 29, 2010

Windows x64 Cross Compiling with Microsoft Visual Studio 2010 Express

Here's what works for me:
  1. Download, install and register Microsoft Visual Studio 2010 Express - it's free of charge
  2. Download and install the Windows SDK - be sure to install the x64 toolchain (and/or the Itanium toolchain)
  3. Create a shortcut on the desktop for each hardware platform you want to compile for:
    C:\WINDOWS\system32\cmd.exe /E:ON /V:ON /T:0E /C ""C:\Program Files\Microsoft SDKs\Windows\v7.1\Bin\SetEnv.cmd" /x86 & "C:\Program Files\Microsoft Visual Studio 10.0\Common7\IDE\VCExpress" /useenv"
    (replace /x86 with /x64 and /ia64 for x64 and Itanium, respectively).
    The "Start in" folder should be set to
    "C:\Program Files\Microsoft SDKs\Windows\v7.1\"
  4. Launch VCExpress via one of these shortcuts, and open a project that you want to cross compile. If it already contains the needed platform, then just switch to it and build. Otherwise, you should first use the configuration manager to add a new solution platform to the project - it's pretty straight forward
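As an aside, a command prompt prepared with SetEnv.cmd can presumably also drive unattended builds via MSBuild; a rough sketch (solution name, configuration and platform are just placeholders):
msbuild MySolution.sln /p:Configuration=Release /p:Platform=x64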

Friday, October 15, 2010

Cygwin/X Keyboard Layout Switching

The current Cygwin X server supports keyboard layout switching! What a nice surprise! And it can all be configured at the command line!

Here's how I launch X on my wife's new laptop, with two keyboard layouts (us/il) and SHIFT-CAPS used to switch between layouts:
C:\cygwin\bin\run.exe /usr/bin/bash.exe -l -c '/usr/bin/startxwin -- -xkblayout us,il -xkbmodel pc105 -xkbvariant ,lyx -xkboptions grp:shift_caps_toggle'
Ain't it cool?
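For the record, the same layout setup should also be applicable to an already running X server with setxkbmap:
setxkbmap -model pc105 -layout us,il -variant ,lyx -option grp:shift_caps_toggle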

(you've probably guessed by now: I suffer from an extreme case of sleep deprivation)

Saturday, September 18, 2010

Automatically Launching Cygwin's X Server

We have a new Dell Inspiron 13R laptop. One of the first things I do on any Windows machine I get is install Cygwin on it. One of the main reasons for this is to be able to run an X Server.

I wanted the X Server to start automatically at login, so I copied the shortcut named "XWin Server" that was installed to the Start->Programs->Cygwin-X menu and pasted it into the Start->Programs->Startup menu. This shortcut launches Cygwin's startxwin, which, in turn, launches the XWin server itself.

This works nicely, except for one annoyance: startxwin, by default, launches an X terminal. This makes sense if you manually click the "XWin Server" shortcut, but not (in my opinion) when it's launched automatically. I wanted it to start silently.

I followed a few false trails before I bothered to Read The Fine Man-page. The solution is pretty simple: create an empty .startxwinrc in my home directory:
$ touch ~/.startxwinrc

Friday, September 3, 2010

Changes

A few weeks ago, following an overdose of bad luck, we bought a new laptop for my wife - a Dell Inspiron 13R (marketed locally as N3010).

Along with the new machine we bought an EDIMAX BR-6424n V2 wireless router, in order to network all of our three laptops.

About the same time, we finally replaced both of our failing cellular phones with a new pair of Nokia 2730 phones.

And two weeks later my wife started a new part time job, where they issued her an ASUS Eee PC 1015P Netbook - when it rains it pours.

My plan is to set up the new box for my wife, take over her old laptop, and convert it to Debian (I'll probably replace its 80GB hard disk drive with a larger one).

So, I'm up to my neck with system administration chores. I intend to post some updates as soon as the dust settles down.

Wednesday, August 18, 2010

Cron Daemon Complains About PHP Warnings

A recent upgrade of php5-common brought with it an annoying side effect - the root account is being spammed with email messages from the Cron Daemon, every 30 minutes:
Date: Wed, 18 Aug 2010 20:09:01 +0300
From: Cron Daemon
To: root@machine-cycle.home
Subject: Cron    [ -x /usr/lib/php5/maxlifetime ] && [ -d /var/lib/php5 ] && find /var/lib/php5/ -type f -cmin +$(/usr/lib/php5/maxlifetime) -print0 | xargs -n 200 -r -0 rm

PHP Warning:  Directive 'register_long_arrays' is deprecated in PHP 5.3 and greater in Unknown on line 0
PHP Warning:  Directive 'magic_quotes_gpc' is deprecated in PHP 5.3 and greater in Unknown on line 0
First, I had to determine the package causing the trouble:
# dpkg -S /usr/lib/php5/maxlifetime
php5-common: /usr/lib/php5/maxlifetime
While some of the bugs listed at the Debian BTS seemed relevant, none seemed to match my exact situation.

I searched for the exact error messages I got, and found a few references on the PHP BTS. Some of these revolved around a problem with disabling warning reports from the PHP interpreter.

I decided it was time to follow the code instead of hyperlinks.

The file /usr/lib/php5/maxlifetime is a short shell script which calls the PHP command line interpreter, like this:
php5 -c /etc/php5/apache2/php.ini -r 'print ini_get("session.gc_maxlifetime");'
I tried it at the command line, and I got the same warnings. Got it!

But now what? After all, I know next to nothing about PHP...

Well, it did seem plausible that the problem had to do with the configuration file /etc/php5/apache2/php.ini, so I stared at it for a while until I found that warning messages can be disabled like this:
error_reporting  =  E_ALL & ~E_NOTICE & ~E_WARNING
I got rid of the annoying email messages alright, but I didn't really solve the problem - it's just a workaround.

Life's full of workarounds, and I'm slowly getting used to it.

Friday, July 30, 2010

Enlarge VirtualBox NTFS Disk Image

I ran out of disk space on my virtual Windows XP computer. I deleted some files and uninstalled a few unnecessary applications, but this didn't quite cut it, so I decided to enlarge the disk image from 10GB to 30GB - a sizeable upgrade, that should be enough for quite a while:
  1. backup and then shutdown the Windows VM
  2. create a new virtual disk with the appropriate size, via the File->Virtual Media Manager tool
  3. attach this disk image to the Windows VM via the storage settings dialog
  4. download the GParted Live ISO image (I used the "testing" image)
  5. mount the ISO image on the Windows' VM virtual optical drive (again, via the storage settings dialog)
  6. boot the VM from the ISO image (should happen automatically)
  7. use GParted to copy the NTFS partition from the small disk image, and then paste it to the new large disk image
  8. GParted should now ask you for the size of the new pasted partition: modify its size so that it takes up all available space on the new disk image
  9. apply the changes, and wait for the process to finish
  10. set the boot flag on the new partition
  11. close GParted and shutdown the VM
  12. remove the CD image from the virtual optical disk drive, and set the new disk image as the only disk image attached to the VM
  13. start the Windows VM - it should boot (unless you missed the boot flag part above) and then Windows will automatically run a disk check and start normally afterwards
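As an aside, steps 2 and 3 can also be done from the command line with VBoxManage; a rough sketch (the VM name, controller name and disk file name are made up, and the size is in MB):
VBoxManage createhd --filename big-disk.vdi --size 30720
VBoxManage storageattach "WinXP" --storagectl "IDE Controller" --port 1 --device 0 --type hdd --medium big-disk.vdi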

Friday, July 23, 2010

MinGW 64bit Cross Compilation: rpl_malloc/rpl_realloc Missing

Are you cross compiling an autoconf-iscated project for a 64bit Windows target, using MinGW, and you're getting linker errors complaining about missing rpl_malloc and/or rpl_realloc?

e.g.
undbx.o:undbx.c:(.text+0xe2a): undefined reference to `_rpl_realloc'

The reason is explained in this autoconf mailing list thread.

The workaround is to disable the compatibility testing of malloc/realloc with the GNU C library, like this:
export ac_cv_func_realloc_0_nonnull=yes
export ac_cv_func_malloc_0_nonnull=yes
./configure --host=amd64-mingw32msvc
make

Tuesday, July 6, 2010

BaculaFS v0.1.5

Just released!

I updated the README and fixed a few bugs - most notably Issue #1, so that BaculaFS is now compatible with Bacula 5.0.

Please note that while I use BaculaFS on a daily basis, my Bacula database backend is SQLite 3, and I backup to disk, not to tape.

I do test BaculaFS before each release on a virtual machine, against both MySQL and PostgreSQL, but these tests aren't nearly as exhaustive as I'd like them to be.

So, please be patient if it dies on you, and please submit bug reports.

Enjoy!

Friday, June 25, 2010

New Antivirus

The past two months have been pretty hectic for me, compounded by a large dose of bad luck. I think it started with the plumbing problems we had, and then my computer got hit twice, after which my cellular phone decided there's really no need for me to send or receive SMS messages, several kitchen appliances stopped working at the least convenient timing, our leased car started warning about an imminent brake fault, several light bulbs burned out, our water cooler/heater/purifier/dispenser combo stopped working, our kids caught a nasty stomach flu - one after the other, my wife was hit by a toothache and an even more painful filling treatment, and I suspect we haven't seen the end of it, yet.

So, when NOD32, the anti-virus software that's running on my wife's box, started complaining that its update subscription was about to expire, I wasn't pissed off, I was just too tired.

I've decided to ditch NOD32 in favor of a free alternative - Microsoft's Security Essentials. While it seems that NOD32 is ranked better by AV-Comparatives, I've developed my doubts about the ranking methodology in general and NOD32's rank in particular.

At least in one case, NOD32 missed an obvious worm (a file with an .exe extension that's auto-run from an autorun.inf file), that found its way to my wife's USB flash drive (and which was detected by ClamAV on my Debian box - one product that is not even considered by AV-Comparatives).

Furthermore, Microsoft's Security Essentials detected another worm, right there on the laptop's C drive root folder, during the first quick scan that's run as part of the installation process.

MSSE also flagged UltraVNC as a potential (medium-risk) malware, but it was easy to convince MSSE to permanently ignore it.

It seems that file scanning in MSSE is much slower than in NOD32. Other than that it seems to be doing a decent job - it updates regularly, doesn't seem to slow Windows more than NOD32, and it has already managed to protect the laptop from catching one of the Conficker strains that resided on an infected USB drive, that my wife got from a work colleague.

Well, that's all for now - I'm off to hang a Hamsa on our door.

Friday, June 11, 2010

Icedove 3 Trouble

I've avoided upgrading Icedove to version 3 for a while due to trouble with enigmail (see Debian bug #562714). But after this was sorted out I decided to upgrade.

The upgrade brought with it a host of problems, which made the switch pretty annoying.

For starters, during the first run after the upgrade, Icedove transferred all its files to a new ~/.icedove directory. Good thing that I was alert enough to notice this, so I modified the list of directories I cherry pick for backup. But that was my only lucky break.

The new Icedove insists on asking me for a master password whenever it starts. It never used to do this. At first I thought this problem was related to the issue described in this Mozilla KB article, where Thunderbird prompts the user for a master password even if none was set. But I do have a master password set, and it seems that this is simply a new feature. I could of course reset the master password, but then I'd have to retype all my other passwords. Tough choice.

The new Icedove also decided that it should sync all the folders in all of my IMAP accounts. I think I was asked about this during the upgrade, but I don't remember what option I selected at the time. I had to go to File->Offline->Download/Sync Now->Select... and un-mark every folder except Inbox and Sent (which I do want to sync) in each of the 5 IMAP accounts that I have.

And then there's this new message indexing feature. It's on by default, and it brought my poor headless laptop to its knees. The machine was busy indexing for a very long while, generating huge files on disk, and then re-indexing on every new email message that arrived. The huge index files also slow down the nightly backups and take up a lot of space, since they get modified every time Icedove is launched.

I decided to disable the indexing feature. Go to Edit->Preferences->Advanced->General->Advanced Configuration, de-select "Enable Global Search and Indexer", click the close button and restart Icedove. I'll reconsider it when and if I replace my current box with a faster one.

Friday, May 14, 2010

UnDBX v0.20: Recovery Mode

I've released a new version of UnDBX - a tool I wrote for extracting, recovering and undeleting e-mail messages from Outlook Express DBX files.

This is likely to be the last major release of UnDBX, since Outlook Express is defunct. It's been replaced by Windows Live Mail, which stores e-mail messages on disk as plain .eml files.

The main new feature in this release, and the reason for the version bump to v0.2x, is a Recovery Mode for extracting messages from corrupted DBX files. In Recovery Mode UnDBX can also (partially) undelete deleted messages from DBX files.

Other features and enhancements include:
  1. a new GUI launcher, that should be easier to use than the previous launcher script
  2. file names of extracted .eml files are constructed from the contents of the To:, From: and Subject: message headers
  3. the modification date of the extracted files is set to match the contents of the Date: header
  4. fixed crash bugs exposed by zzuf and valgrind

Enjoy.

Saturday, May 8, 2010

Busted External Disk

When it rains it pours.

Shortly after my laptop went headless, I started getting weird behavior when accessing files on one of my three external USB hard disk drives.

It's the largest disk (300GB) - the one that holds a backup of most of our DVDs. The one whose contents are not backed up by Bacula, simply because of its size.

I found that if I repeatedly use sha1sum or md5sum to compute the digest of some files, I get a different digest on every run.

Arghh!!

So I purchased a WD Elements 1TB External Disk, hooked it up, launched gparted, removed the NTFS partition, and reformatted it to Ext4. And then copied the contents of the bad disk to the new disk using rsync.

Most of the contents on the bad drive is multimedia, and multimedia players are designed to cope with errors in their input, so I guess I can live with the damage.

I wonder what's next to fail.

Friday, April 30, 2010

Busted Laptop

My 8 year old laptop's LCD backlight is busted - it just suddenly turned off on Monday morning. It stays off during a reboot. If I shut down the box and turn it on again, the backlight does turn on, but after a short (and random) period of time it turns off again.

I'm pretty sure that it's the backlight that's busted because, under the right lighting conditions, I can still see the stuff displayed on the screen, but it's very faint.

I've tried to manually control the screen power saving mode by blindly typing
vbetool dpms on
at the console, but this has no visible effect.

The box is otherwise just fine, and it still functions as a backup server, web server, firewall, printer server etc. It's just that I can't use it as a desktop.

What a bummer.

I'm still considering my options. I've been thinking of getting a new laptop for my wife for quite a while, since the (extended) warranty on her current, 4 year old, laptop expired, and she does need a lighter machine (2.7kg is pretty heavy...). But a new laptop would cost more than fixing my box.

Decisions, decisions.

To be continued.

Friday, April 23, 2010

Concatenating AVI Files with MEncoder

Suppose you have a bunch of AVI files, named part_01.avi, part_02.avi, etc. Suppose, furthermore, that you want to concatenate them into a single AVI file named concatenated.avi. Here's how to do it with MEncoder:
mencoder part_*.avi -o concatenated.avi -ovc copy -oac copy
This should work nicely as long as all parts have been encoded with the same video and audio encoders, using similar encoding parameters.

Now, suppose that you've downloaded a bunch of MP4 video files from, say, youtube.com using a command line tool like get-flash-videos (highly recommended!), and you want to concatenate them into a single AVI file. But some of these files have different encoding parameters than the rest, so we need to re-encode the files such that their parameters match:
ls -1 part_*.mp4 | \
while read f ; do \
 mencoder $f -o ${f/.mp4/.avi} -oac mp3lame -lameopts mode=2:cbr:br=128 \
 -ovc lavc -lavcopts vcodec=mpeg4:vbitrate=1200 \
   -vf scale=720:480 -af volnorm=1 -ffourcc XVID -ofps 29.917; \
done
where the parameters can be determined by examining the console output of MPlayer during playback of each file and selecting the parameters that match most files, in order to reduce quality loss as much as possible (you may find this script helpful).
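To collect those parameters without sitting through playback, MPlayer's -identify output comes in handy, e.g.:
mplayer -identify -frames 0 -vo null -ao null part_01.mp4 2>/dev/null | grep ^ID_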

Re-encoding is a time consuming process, so unless you absolutely need these files concatenated (e.g. for playback using your DivX DVD player), you may just want to playback the files in order...
mplayer -fs part_*.mp4

Friday, April 9, 2010

Installing Perl Modules on Debian

Disclaimer: I know next to nothing about Perl - I'd appreciate any comments.

Many CPAN hosted Perl modules are already packaged in Debian, so all one needs to do is the following:
  1. install dh-make-perl:
    aptitude install dh-make-perl
  2. find the Debian package that contains the wanted Perl module:
    dh-make-perl locate Chocolate::Belgian
  3. if the module has a matching Debian package, you should install it the usual way:
    aptitude install libchocolate-belgian-perl
  4. otherwise, you have two options - read on
One option is to do it the Perl way, i.e. install the package using the CPAN core module:
perl -MCPAN -e 'install Chocolate::Belgian'
Another approach is to use dh-make-perl to automatically Debianize the module, and then install the resulting .deb package, like this:
cd /tmp
dh-make-perl make --build --cpan Chocolate::Belgian # or: cpan2deb Chocolate::Belgian
dpkg -i libchocolate-belgian-perl_0.01-1_all.deb
You may run into errors due to missing Perl modules that the original missing module depends on. If this happens, you'll need to repeat the installation process for each missing module (again, either from an existing Debian package or using dh-make-perl).

I suppose the latter method makes sense if, like me, you only need a small number of non-Debianized Perl modules installed - I like being able to manage these modules with the usual Debian tools such as aptitude, instead of having to learn to effectively use yet another package maintenance tool.

Saturday, March 27, 2010

No Sound (or: What the Plumber Taught Me About Linux)

Last week ended with our downstairs neighbor knocking at our door, to complain about a water leak from our apartment. He's a soft spoken fella, always calm, and always right. He annoys me to no end.

He was right, of course. It took a minute to find the leak, close to the apartment's master water tap. It was a slow but steady drip that caused the trouble. The pipe itself looked all rusted and about to blow.

Quite a bit of water - and a hefty sum of money - went down the drain during the next few days, as a pair of nitwit plumbers worked on fixing the leak. I got living proof that if anything gets rewarded, it's initiative, not intelligence.

With a looming deadline at work, this was pretty much too much for my taste. Imagine my annoyance when I found, after a casual reboot of my Debian box, that I had no sound.

I noticed this one evening after that reboot, while trying to relax - I was attempting to view animated films over at NFB.ca. I tried turning up the volume with the knob on the speakers, verified that they were turned on and connected properly, and then used the volume multimedia keys on the keyboard to turn the volume up, and then used alsamixer at the command line to turn the volume up, and then... Nothing worked. There was no sound.

There were no obvious errors in the log files at /var/log. lspci listed the device as
00:08.0 Multimedia audio controller: ALi Corporation M5451 PCI AC-Link Controller Audio Device (rev 02)
lsmod told me that the relevant Kernel module, snd_ali5451, was loaded.

All is well, but still no sound.

I rebooted the box. Nothing.

I shut down the box and then turned it on again. Nada.

What now? I was stumped. I was pissed. I was tired. Can't a guy enjoy a cartoon when he feels like it?

Well, as I said, it's initiative that gets rewarded, so I continued futzing around until I found alsactl:
root@machine-cycle:~# alsactl init
Unknown hardware: "ALI5451" "Analog Devices AD1886" "AC97a:41445361" "0x0e11" "0x00b0"
Hardware is initialized using a guess method
and voilà! sound was working again.

I have no idea what went wrong, and I guess it's likely to happen on the next reboot, but till then I have sound and I really don't care.

Friday, March 19, 2010

Storing PuTTY's Configuration to a File

In the previous post I described how I partitioned my SanDisk Cruzer Micro USB flash drive, and installed grml Linux on it. The first partition on that flash drive is formatted as FAT32, so that I can use the flash drive on Window$ machines, for the normal tasks of moving files around between computers.

I also mentioned that I've placed a copy of PuTTY on that partition, in order to allow me to connect to my home PC, via secure shell (SSH) from any available Window$ box with an Internet connection. The problem here is that PuTTY's configuration is saved to the Registry, and it's lost whenever I switch to a different computer. It's rather easy to configure PuTTY, but it's tedious nonetheless, and I'd like to avoid it if I can.

After going through the PuTTY documentation I found section 4.26 - "Storing configuration in a file", which states that "PuTTY does not currently support storing its configuration in a file instead of the Registry", and then goes on to describe how to do it anyway, including a method for saving to file any modifications made to the configuration during the session.

I've opted for a less complicated setup, since I don't intend to modify the configuration. The first step is to place the following batch file, putty.bat, alongside the PuTTY executables
@ECHO OFF
regedit /s putty.reg
start /w putty.exe
regedit /s puttydel.reg
together with two .reg files - putty.reg and puttydel.reg. The first contains PuTTY's configuration, as saved (once) with the following command:
regedit /ea putty.reg HKEY_CURRENT_USER\Software\SimonTatham\PuTTY
and the second file, puttydel.reg is used to remove this configuration from the Registry of the host machine, once the shell session is over:
REGEDIT4

[-HKEY_CURRENT_USER\Software\SimonTatham\PuTTY]
This should work fine for password based authentication, but not for public-key authentication. The problem is that the full path to the key file is stored in the Registry, which includes the drive letter, and this drive letter is liable to be different when switching computers.

The solution is easy: edit putty.reg with a text editor you like, and modify the value of the key PublicKeyFile to be relative to the PuTTY folder on the flash drive. So that, for example, if you've placed a key file named putty.ppk in a sub-folder named keys under the folder where the rest of the PuTTY files are, then you should set the value of PublicKeyFile to ".\\keys\\putty.ppk"
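The relevant bit of putty.reg should then look something like this (the session name is just an example):
[HKEY_CURRENT_USER\Software\SimonTatham\PuTTY\Sessions\home-pc]
"PublicKeyFile"=".\\keys\\putty.ppk"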

One last tip: save the PuTTY Registry after you connect at least once to the target PC, thus not only verifying that the configuration actually works, but also letting PuTTY save the target host key to the Registry. This will avoid the need to confirm the host key whenever you connect to it from another PC.

Friday, March 12, 2010

Installing grml Linux on a USB Flash Drive

Last time I mentioned that I needed a USB flash drive for storing the GnuPG encryption keys and Amazon S3 access keys, which are used for backing up my files to the cloud.

The reason should be obvious: if I ever need to restore files from the remote backup, it would probably be because I either lost both my Debian box and Bacula backup disk, or that I can't access them for some reason. So I better have those keys available elsewhere.

So I purchased a 4GB SanDisk Cruzer Micro USB flash drive. But then I decided to install Linux on it, instead of just using it to store those key files.

Sick? I don't think so. After all, if I ever get that desperate, it's likely that I'll need to setup a Linux workstation with, at a minimum, GnuPG and duplicity installed on it. It would be much simpler if I could just boot any available PC with my USB key and be able to restore my files to some local storage device.

The only remaining question was which distro should I install? I wanted a distro that can be installed to a USB flash drive, which can be tailored ("remastered") to include up-to-date versions of s3cmd, duplicity and GnuPG, and support persistency of some sort (for storing the access keys and other configuration files).

DistroWatch.com and Google provided some leads. There are many Live CD distros available, and quite a few minimal distros meant for USB flash drives. I further narrowed down my search by going for a Debian or Ubuntu based distro, which I'm familiar with, and figured I should first try the official releases: Debian Live and Ubuntu.

I've spent a few evenings trying to figure out Debian Live, and failed miserably. Debian Live is extremely configurable - it attempts to be the universal Live CD build system - and as such is pretty complex, at least for my taste and limited free time. The documentation seems to be out of date (e.g. it mentions commands such as lh_config and lh_build, where the actual command is lh) and the Debian Live wiki is a rather messy collection of apparently out-of-date howtos. Bottom line is that I just couldn't get past some error messages and gave up on it.

I ditched Debian Live in favor of the Ubuntu Live CD, which seemed less complicated to customize, and can also be installed to a USB flash drive.

I actually managed to generate a Live CD image with extra packages installed (s3cmd, duplicity and a few others) and some packages removed (such as the Ubuntu documentation). The only problem was that this image failed to boot under QEMU. Well, actually, it just hung during the startup process, showing a throbbing Ubuntu logo. It could very well be that I simply didn't wait long enough, and/or that my trouble had to do with a package that I should not have removed/installed - but I had neither the patience nor the time to investigate.

This was rather frustrating. I didn't expect this to be so complicated. As I was about to settle for my original plan of placing the few key files I needed on the USB flash drive, I searched again for a Debian based live distro and found grml - "Linux Live system for sysadmins / texttool-users / geeks".

Grml is developed by several Debian developers, it's based on Debian/Testing, and it's currently actively maintained. Grml already includes s3cmd and duplicity, and a lot more - it even has awesome - ain't that awesome?!

This all looked very nice, and all that was left for me to do was to install it to my USB flash drive:
  1. download the grml iso image and burn it to a CD
  2. partition your USB flash drive (e.g. with gparted) such that it contains the following partitions, assuming its block device is /dev/sdb (adapted from the grml persistency howto):
    1. /dev/sdb1 - filesystem: FAT32, label: datastore, type: primary, size: 2GB
    2. /dev/sdb2 - filesystem: FAT16, label: grml, type: primary, bootable, size: 800MB (enough for one CD image)
    3. /dev/sdb5 - filesystem: EXT3, label: live-rw, type: logical, size: 1GB
    4. /dev/sdb6 - filesystem: EXT3, label: GRMLCFG, type: logical, size: 200MB
    and tune the EXT3 filesystems:
    tune2fs -i 0 -c 0 -m 1 /dev/sdb5
    tune2fs -i 0 -c 0 -m 1 /dev/sdb6
  3. boot a PC with this CD, press "q" at the prompt to enter the shell
  4. insert the USB flash drive
  5. run
    grml2usb --bootoptions="persistent" /live/image /dev/sdb2
    live-snapshot -d /dev/sdb5 -t ext3
    umount /live/live-snap*
  6. shutdown the PC, remove the CD and reboot it from the USB flash drive - if all goes well, grml should come up, with all system modifications loaded from, and subsequently saved to, the persistent live-rw partition
The larger FAT32 partition is used for storing files that should be accessible on a Window$ PC. At the moment I have PuTTY and TightVNC (viewer only) "installed" on it, allowing me to connect to my Debian box from any Windows PC with access to the Internet - but I'll leave this to a future post.

Friday, March 5, 2010

Backup to The Cloud

I've mentioned before that I was looking into off-site backup options. I also reported that I got to the conclusion that the upload bandwidth provided by my ISP is too low to be useful for such a backup plan. My napkin calculations indicated that I'll need 7 days for uploading a full backup snapshot, and 4 to 5 hours to upload an incremental nightly backup.

Well, I'm glad to report that I now have a working off-site backup scheme, based on duplicity, where the nightly backup (to Amazon's S3) takes, typically, just a few minutes to upload. The data transfer is secured with SSL, and the data itself is compressed and encrypted with GnuPG. Furthermore, the off-site backup is an accurate mirror of the contents of my Bacula backup storage, which contains a backup of both my wife's WinXP laptop, and my own Debian box.

The initial full backup was a bit of a pain to setup, though. I first did a full restore from Bacula's backup to an external disk, and used duplicity to generate the initial full backup set on my other external disk. This took a few hours to complete, and then I uploaded the full backup set to S3 using s3cmd's sync command. It took 8 days to complete the upload, with several interruptions, but mostly keeping a pretty constant upload rate of about 60KB/s, which is close to the nominal 64KB/s that it's supposed to be.

Once the initial upload was done, I added a new job to my Bacula configuration that triggers a nightly backup script to run after the last nightly backup job. In the script (shown below) I use BaculaFS to mount Bacula's storage as a filesystem (for each client and fileset) and then I use duplicity to backup its contents. I also backup duplicity's cache directory to an external disk, just in case.

Note the use of duplicity's --archive-dir and --name command line options. These options allow the user to control the location of duplicity's cache directory, instead of the default, which is a path name based on the MD5 signature of the target URL. I needed to do this because I moved the backup set from one URL (local disk) to another (S3), and I didn't want duplicity to rebuild the cache by downloading files from S3, but rather use the cache that was already on my box.

In the few days I've been using this scheme, I haven't hit any communication failure with S3 during backup. I expect I'll have to modify the script in order to handle such a failure, but I'm not sure how - at the moment I just abort the snapshot if the point-to-point network adapter is down.

The next step is to set up a disk-on-key with the relevant GnuPG encryption keys and S3 access keys, in order to allow me to restore files from S3, in case both my box and my Bacula backup go kaput.
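The restore itself should then boil down to a single duplicity invocation - roughly this, with the bucket and fileset names taken from the script below, the target directory made up, and the AWS keys and GnuPG passphrase exported as in the script:
duplicity restore s3+http://my-bucket/machine-cycle /mnt/restore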

To be continued.


#! /bin/bash

export HOME=/root
SCRIPTDIR=$(dirname $0)
ARCHIVE=$HOME/.cache/duplicity
ARCHIVE_BACKUP=/mnt/elements/backup/duplicity
DEST=s3+http://my-bucket
SRC=/mnt/baculafs
DUPLICITY_OPTS="--encrypt-key=XXXXXXXX --sign-key=YYYYYYYY --archive-dir=$ARCHIVE"
DUPLICITY_LOGLEVEL="-v4"

# make sure we're connected to the internet
PPPIFN=$(grep -c ppp /proc/net/dev)
if [ "$PPPIFN" == "0" ]
then
    exit 1
fi

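# snapshot <client> <fileset>: mount the Bacula backup of <client>'s <fileset>
# with BaculaFS and mirror it to $DEST/<fileset> with duplicity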
snapshot()
{
    umount $SRC/$2 2>/dev/null

    export AWS_ACCESS_KEY_ID=$(grep access_key $HOME/.s3cfg | cut -d\  -f3)
    export AWS_SECRET_ACCESS_KEY=$(grep secret_key $HOME/.s3cfg | cut -d\  -f3)
    export PASSPHRASE=$(cat $SCRIPTDIR/duplicity-gpgpass)

    echo "Generating list of current files in backup ..."
    duplicity list-current-files -v0 $DUPLICITY_OPTS --name=$2 $DEST/$2 > /tmp/$2.list || exit $?

    echo "Mounting BaculaFS ..."
    baculafs -o client=$1-fd,fileset=$2-fileset,cleanup,prefetch_difflist=/tmp/$2.list $SRC/$2 || exit $?

    echo "Prunning remote snaphsot ..."
    duplicity remove-older-than 1W $DUPLICITY_LOGLEVEL $DUPLICITY_OPTS --name=$2 --force $DEST/$2
    
    echo "Updating remote snaphsot ..."
    duplicity $DUPLICITY_LOGLEVEL $DUPLICITY_OPTS --name=$2 $SRC/$2/ $DEST/$2

    unset PASSPHRASE
    unset AWS_SECRET_ACCESS_KEY
    unset AWS_ACCESS_KEY_ID

    fusermount -u $SRC/$2

    echo "Backup local duplicity archive ..." 
    rsync -av --delete $ARCHIVE/$2/ $ARCHIVE_BACKUP/$2/
}

snapshot winxp winxp
snapshot machine-cycle machine-cycle
snapshot machine-cycle catalog

Friday, February 19, 2010

Backup MySQL Databases (revisited)

I've once described how I backup MySQL databases on my box. A few days ago I noticed an error message in the backup job summary emails that Bacula sends me every day:
mysqldump: Got error: 1044: Access denied for user 'debian-sys-maint'@'localhost' to database 'information_schema' when using LOCK TABLES
A quick search got me to Debian bug #550037, which seemed rather relevant.

Here's the core of my revised MySQL backup script (note the special handling of the information_schema database):
backup_mysql ()
{
    echo "Backing up MySQL databases..."
    rm -rf "$1"
    mkdir -p "$1"
    mysql --defaults-file=/etc/mysql/debian.cnf --batch --skip-column-names -e "show databases" |
    while read DB ; do
        echo Dumping "${DB}" ...
        if [ "${DB}" == "information_schema" -o "${DB}" == "performance_schema" ]; then
            OPT="--skip-lock-tables"
        else
            OPT="--opt"
        fi
        mysqldump --defaults-file=/etc/mysql/debian.cnf "${OPT}" "${DB}" > "$1/${DB}.sql"
    done
}
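The function takes the dump destination directory as its single argument, so it gets called from the rest of the backup script with something like this (the path is just an example):
backup_mysql /var/backups/mysql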
[04 Jun 2012] UPDATE: fixed handling of another special database: performance_schema

Friday, February 12, 2010

Dynamic Bacula Filesets

Bacula provides a rather flexible method for specifying the files and directories to include in and/or exclude from backup jobs - the Fileset resource.

It is, in fact, so flexible that you can use an external script or program to generate the list of files to backup, on the fly. That program is expected to dump a list of paths to backup to standard output, which is piped to Bacula, either at the server side (director):
File="|/path/to/script <args>"
or at the client side (file daemon):
File="\\|/path/to/script <args>"

Running the fileset generation program at the client side has the advantage of running with super-user privileges.

Here's an example, adapted from the Bacula manual, for backing up all local Ext3 disk partitions
FileSet {
    Name = "All local partitions"
    Include {
        Options { signature=SHA1; onefs=yes; }
        File = "\\|bash -c \"df -klF ext3 | tail -n +2 | awk '{print \$6}'\""
    }
}
Note that, if there are other include/exclude criteria in the Fileset, the file daemon still has to determine which files and directories it has to back up, under each parent directory that is specified by the external program.

A similar method can be used to completely work around Bacula's file selection logic. One reason to do this would be to select files according to criteria that cannot be expressed using the normal Fileset resource definition syntax (e.g. file selection by date).

I became interested in this after I learned that if I specify /grandparent/parent/child/file as a backup target, Bacula does not backup the permissions and ownership info of any of the parent directories. This happens because none of the parent directories is a backup target itself, or a sub-directory of a directory which is a backup target.

This isn't a bug, but rather just the way Bacula is designed. It actually makes sense when you think about it. But the end result is that if you cherry pick directories to backup (like I do), you may end up with some non-obvious permissions/ownership problems upon a full restore, due to the fact that some parent directories were not specified as backup targets.

Turns out it's so tricky and cumbersome to get the behavior I want using the usual Fileset definition constructs, that using an external script for selecting files is actually the easy solution to my problem.

There are, however, a few gotchas that I had to address before I could deploy this scheme.

Say we have a program (more specifically: a Python script called fileset.py) that, when run, dumps a list of files and directories to standard output, which we wish to backup. The Fileset resource we would use in this case looks like this (for a Linux client):
FileSet {
    Name = linux-client-fileset
    Include {
        Options {
            signature = SHA1
            compression = GZIP
            wild = "*"
            Exclude = yes
        }
        File = "\\|/usr/bin/env LANG=en_US.UTF-8 /usr/bin/python /path/to/fileset.py <args>"
    }
}
If you ponder this for a bit you'll note that this definition excludes any file/directory that does not appear in the list dumped by our script - which is exactly what we want. The tricky part here is that the list has to be reverse-sorted, such that any sub-directory path appears in it before its parent directory; otherwise Bacula will filter it out.
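To make the ordering requirement concrete, here's a minimal shell sketch of such a generator - it emits each cherry-picked target plus all of its parent directories, reverse-sorted so that children precede their parents (the list file is hypothetical, and the real fileset.py obviously does more than this):
#!/bin/sh
# emit each target and all of its parent directories,
# reverse-sorted so that children appear before their parents
# (/etc/bacula/targets.list is a hypothetical one-path-per-line file)
while read p ; do
    echo "$p"
    while [ "$p" != "/" ] ; do
        p=$(dirname "$p")
        echo "$p"
    done
done < /etc/bacula/targets.list | sort -u -r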

Another issue, which I was not aware of initially, is that the locale information isn't propagated to the sub-process running the script. The tricky bit here is that locale information is propagated after manually restarting the file daemon process - the restarted process seems to inherit the environment settings of the shell that was used to restart it. A simple solution is to explicitly specify the value of the LANG environment variable, as I've done above.

The next issue I had to tackle was that when the file list is generated by a script, it's apparently generated before the client executes any of the ClientRunBeforeJob scripts that are configured in the backup job definition. This means that, if you create new files as part of the operation of the pre-backup scripts, these files will not be included in the current backup job. This is different than the normal state of affairs.

I had to split the backup job for each of the client machines that I back up into two jobs: one job runs the ClientRunBeforeJob scripts but uses an empty fileset (i.e. one that doesn't have any File directive), and a second job that runs afterwards and uses the dynamic fileset.
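In Bacula configuration terms the split looks roughly like this (a sketch - names are made up, most Job directives are omitted, and the lower priority job runs first):
Job {
    Name = "linux-client-prepare"
    Client = linux-client-fd
    FileSet = "empty-fileset"
    ClientRunBeforeJob = "/path/to/pre-backup.sh"
    Priority = 10
    # Type, Schedule, Storage, Pool, Messages omitted
}
Job {
    Name = "linux-client-backup"
    Client = linux-client-fd
    FileSet = "linux-client-fileset"
    Priority = 11
    # Type, Schedule, Storage, Pool, Messages omitted
}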

The last problem, but not the least, was getting this scheme to work both on Window$ and Linux, with filenames that happen to contain illegal characters, using the same Python script. This was an interesting exercise in its own right, but I'll leave that to a future post.

OK, so it's complicated, and the benefits are dubious, but you've read so far. You're too kind. Thanks.

Friday, February 5, 2010

Building a Custom Kernel from Source

I've managed to avoid this for quite a while, but there was no escape this time.

In my last blog post I mentioned that I've hit a problem with the new Debian/testing Kernel (2.6.32).

After a quick search I found Kernel bug #14791, which seemed to be my exact problem. The good news was that there was a patch available, the bad news was that it was "[dropped] from the list of recent regressions due to the lack of testers".

So I stepped forward and decided to test the patch, hoping, first, that it does indeed fix my problem, and second, that it would be incorporated in the next Kernel release, if and when I verified that it worked.

But this meant that I needed to compile an upstream Kernel and install it. Twice (once to verify that the most current Kernel has this problem, and another to verify that the patch fixes it).

Brrr.

Well, doing this, on a Debian box, turns out to be actually rather easy, as long as the upstream Kernel isn't too far removed from a Kernel that's already installed on your box. It's even easier than compiling the official Debian Kernel source package - trust me, I tried.

The official guide is at the Debian Linux Kernel Handbook, where chapter 4 describes common Kernel related tasks such as Building a custom kernel from the "pristine" kernel source (section 4.5).

So here's how I did it, using Kernel source code from the mainline Git repository:
  1. get the source code:
    git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
  2. copy the current Kernel configuration to the source tree:
    cd linux-2.6
    cp /boot/config-2.6.32-trunk-686 ./.config
  3. configure the Kernel (based on the current configuration):
    make oldconfig
    you'll be presented with a series of configuration questions (mostly, I just selected the default options by hitting ENTER repeatedly) - this is likely to be a short process, as long as you're compiling a Kernel that's similar enough to the one from which the base configuration was taken
  4. run:
    make-kpkg clean
    fakeroot make-kpkg --initrd --revision=foo.1.0 kernel_image
    the string foo.1.0 will be the version of the resulting Debian package
  5. after a rather long while the build will, hopefully, finish successfully, and you'll be left with a Kernel Debian package in the parent directory ../linux-image-2.6.33-rc5_foo.1.0_i386.deb
  6. install the new Kernel (as root):
    dpkg -i linux-image-2.6.33-rc5_foo.1.0_i386.deb
  7. for some reason the above step did not generate an initramfs image, so you may need to create one:
    update-initramfs -k 2.6.33-rc5 -c
    update-grub
The end of the story is that I verified that the patch does fix my problem, and after I reported my findings, the patch was submitted by its author, and was soon accepted.

It ain't much of a contribution, I know, but it's definitely more than I had expected to be making...

Friday, January 29, 2010

Selecting a Default Kernel to Boot (GRUB2)

A recent Kernel upgrade (2.6.32) brought with it a nasty regression: the Kernel does not detect that my USB cable modem link is up.

But I still have the previous Kernel (2.6.30) installed, so the workaround is rather simple - make this Kernel the default one in GRUB2.

I didn't bother looking this up anywhere, but the following procedure works for me (as root):
  1. open /etc/default/grub for editing
  2. modify GRUB_DEFAULT:
    GRUB_DEFAULT=2
    the number (2 in this example) is the number of the GRUB2 menu entry you wish to make the default (the first menu entry is numbered 0)
  3. run
    update-grub
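To find the right entry number, you can list the generated menu entries, numbered from 0:
grep "^menuentry" /boot/grub/grub.cfg | nl -v0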

Friday, January 22, 2010

Creating Floppy Disk Images

Here's a recipe for creating a floppy disk image from scratch:
  1. install Mtools:
    aptitude install mtools
  2. make sure you have the following in /etc/mtools.conf:
    # # dosemu floppy image
    drive n: file="/var/lib/dosemu/fdimage"
    
    (it's the default configuration on Debian)
  3. create the directory:
    mkdir -p /var/lib/dosemu/ # as root
  4. create a 1.44MB floppy disk image with mformat:
    mformat -f 1440 -C n:
    or, if you happen to have your own bootsector binary file available:
    mformat -B bootsector -f 1440 -C n:
  5. loop mount the floppy image:
    mount -o loop /var/lib/dosemu/fdimage /mnt/image
  6. populate the image with files
  7. unmount the image:
    umount /mnt/image
  8. Rinse and repeat
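As an aside, steps 5 to 7 can be skipped altogether by copying the files with Mtools, directly against the n: drive defined above (the file names are just examples):
mcopy autoexec.bat config.sys n:
mdir n: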

Friday, January 15, 2010

BaculaFS: Bacula Filesystem in USErspace

I've recently released BaculaFS. BaculaFS is a tool that exposes the Bacula catalog and storage as a read-only Filesystem in USErspace (FUSE).

If you have setuptools installed you can get and install BaculaFS like this:
easy_install BaculaFS
and then run it like this (as root):
baculafs [options] [mount point]
Note that, depending on your setup, it can take quite a while for BaculaFS to start up, as it runs the necessary queries against the Bacula catalog database (if you're impatient you can try -o logging=debug - it won't accelerate anything, but the feedback may ease your pain).

I wrote BaculaFS from the bottom up, and I guess it shows. Nevertheless, I think it's pretty cool. I use it with rsync, Baobab, and other file oriented tools (I even managed to clone a Git repository directly from backup).
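For example, restoring a directory from the latest backup to local disk boils down to something like this (client, fileset and paths are made up):
baculafs -o client=machine-cycle-fd,fileset=machine-cycle-fileset /mnt/baculafs
rsync -av /mnt/baculafs/home/user/docs/ /home/user/docs/
fusermount -u /mnt/baculafs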

Unfortunately, extracting files from the Bacula backup storage can be rather slow. This can make BaculaFS barely usable with Nautilus (and probably with other file managers), because Nautilus is simply too aggressive, attempting, by default, to read each and every file it encounters.

I hope some of you will find it useful. If you happen to hit any problem please submit a bug report.

Enjoy.

Friday, January 8, 2010

Can't Delete Print Jobs on Windows

My brother-in-law called me up the other night. He's taking a C programming course, and I half expected him to ask me for some help with his homework. I was wrong - he seems to be handling it quite well, for a guy who never programmed before.

I was disappointed to hear that he called me up to fix a printing problem. He complained that he can't print on his Window$ XP PC, because a print job that he cancelled has not been removed from the printer queue (its status is "Deleting - Printing" - WTF?), so all the subsequent jobs were stuck, waiting for the first one to finish.

Well, I have a reputation to keep. I sat down at my wife's laptop, so that I could both follow his description of what he already did, and guide him through any suggestion of mine. Turns out he already tried most of my suggestions, including power cycling both his printer and his PC.

I had one last idea: Google. I came up with several links (including a Micro$oft support article and this blog entry), from which the following procedure was derived:
  1. click the Start button and then select "Run..."
  2. stop the spooler service by typing
    net stop spooler
    and then hit Enter or press OK
  3. launch the file explorer (e.g. by hitting the Windows Key and "e" together)
  4. navigate to the folder
    %SystemRoot%\System32\Spool\Printers
    the %SystemRoot% bit is supposed to be automatically replaced with the path to the Window$ system folder (e.g. C:\WINDOWS)
  5. delete the *.spl and *.shd files that show the approximate time and date of the print job causing the problem (or delete everything there, if you don't mind losing other pending print jobs)
  6. click Start, select "Run..."
  7. restart the spooler service by typing:
    net start spooler
    and hit Enter or press OK

Friday, January 1, 2010

Can't Connect to VNC Server

A few days ago I found that I could not connect to a standalone VNC server session that I've started on my home PC with
vnc4server
I checked the log file at ~/.vnc/machine-cycle:1.log. The log file contained a few error messages about a missing /etc/X11/xserver/SecurityPolicy and missing font directories under /usr/X11R6/lib/X11/fonts/, but I was rather sure that I've seen these messages before, so it wasn't as helpful as I hoped.

I checked the Debian BTS page for vnc4server and my problem was right there as bugs #561619 and #560137. The comments for the latter provide both an explanation (it's related to IPv6 - way over my pretty head), and a workaround that fixed the problem at my end - run the following as root:
sed -i 's/net.ipv6.bindv6only\ =\ 1/net.ipv6.bindv6only\ =\ 0/' /etc/sysctl.d/bindv6only.conf && invoke-rc.d procps restart
That's good enough for me.