Friday, March 2, 2012

Scripting a Simple Boot Time State Machine in GRUB2

There's a hardware compatibility issue between my laptop's antiquated and buggy on-board nVidia nForce4 SATA interface and the modern Western Digital hard disk that I've installed in it.

The hard disk seems to disappear in mid-boot. This can happen any time it is re-detected: when the BIOS transfers control to the boot loader (either GRUB2 or the Windows Boot Loader), and when the initial kernel image (loaded by the boot loader) re-initializes the SATA controller. Even if the system comes up, any power management event may cause the disk to disappear, and sometimes it disappears for no obvious reason. The kernel's libata error handling routines may manage to re-establish the connection to the disk, but sometimes the disk does not come back and I have to power cycle the laptop.

None of the solutions and workarounds I tried did the trick. Apart from the stuff I already mentioned, I added rootdelay=10 to the kernel command line, hoping that the extra delay would give libata a better chance of re-detecting the disk. I even installed the watchdog daemon to get my laptop to reboot when it freezes, but either it doesn't really work or (most likely) I don't understand how to configure it.

The last-ditch workaround I came up with was to boot the laptop into GRUB2 from a USB disk-on-key, thus avoiding the internal hard disk altogether, and have GRUB2 perform the following steps:
  1. check if boot related files on the internal hard disk are accessible
  2. if any file is not accessible then reboot
  3. if all files are accessible and this is the first success, then reboot
  4. if this is the second success then boot GRUB2 on the internal hard disk
This procedure captures the sequence of steps that seems to get my box to boot. It's voodoo, I know, but it does work (most of the time).

The script (boot/grub/grub.cfg) below implements this procedure. It uses load_env and save_env to keep persistent state (the variable need_reboot) between reboots. The first few lines make sure that the state is properly initialized even if no environment variables have been saved yet (i.e. if this is the first ever boot from this device).

The fact that one can control the boot process this way is pretty neat, in my opinion. The only trouble is that even with this, my system still fails to boot - albeit less frequently than before.

I must admit defeat. The laptop is usable now, but, in my eyes, just barely. The whole experience made me itch for a new PC, but with my luck being what it is, I fear that it won't be a much easier ride. So, for now, I've resigned myself to my fate, until I gather up the courage (and cash) to tackle a new box.

set need_reboot="yes"
if [ -s $prefix/grubenv ]; then
  load_env
fi

insmod part_msdos
insmod ext2
set root='(hd0,msdos1)'
set reboot_delay=5

insmod ntfs
insmod vbe
insmod vga
insmod video_bochs
insmod video_cirrus

# Announce the reboot, wait, then reboot unless the wait is interrupted with ESC
function wait_and_reboot {
  echo "Reboot in ${reboot_delay} seconds ..."
  if sleep --interruptible ${reboot_delay} ; then
    reboot
  fi
}

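# Check that a file is readable; if it is not, the disk has vanished, so reboot
function find_or_die {
  if [ -e "$1" ]; then
    echo "$1 found"
  else
    echo "$1 NOT found!"
    wait_and_reboot
  fi
}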

set menu_color_normal=cyan/blue
set menu_color_highlight=white/blue

find_or_die (hd1,msdos1)/Boot/BCD
find_or_die (hd1,msdos2)/config.sys
find_or_die (hd1,msdos5)/boot/grub/grub.cfg

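# need_reboot is empty only after a previous successful pass:
# re-arm it and fall through to the menu. Otherwise this is the
# first success, so clear it, save it, and reboot once more.
if [ -z "${need_reboot}" ]; then
     set need_reboot="yes"
     save_env need_reboot
else
     set need_reboot=
     save_env need_reboot
     wait_and_reboot
fi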

set timeout=30
menuentry "GRUB on internal hard disk" {
        set root='(hd1)'
        drivemap -s (hd0) (hd1)
        chainloader +1
}


  1. Interesting that one can run 'scriptlets' inside GRUB2.

    One question, based on the above scripts, I'm starting to think that it is possible to boot a certain menuentry based on time-of-day; am I in dreamland and someone, anyone, needs to wake me up or is this actually possible? I'll start testing in a virtual machine to see what I can come up with. However, if you already have a script for this and are willing to share, then by all means, I'm all for not reinventing the wheel!

    Eagerly awaiting your response as I'm quite tired of waiting for the boot list to appear when I need to boot into Windows on my laptop when I get to the office! I could be doing other important stuff like getting my cuppa and/or making small talk with my colleagues - :D!


    1. From looking at the grub2 source code, it seems that if you add

      insmod datehook

      to your grub.cfg, the following environment variables will be accessible: YEAR, MONTH, DAY, HOUR, MINUTE, SECOND, WEEKDAY

      I haven't tried it myself, but it looks promising...
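
      As an untested sketch of how that might look: the snippet below picks the default menu entry from the HOUR variable. The entry numbers (0 for Linux, 1 for Windows) and the 08:00 to 18:00 "office hours" window are assumptions made up for illustration, and HOUR comes from the hardware clock, so it may be UTC rather than local time depending on how the machine is set up.

      ```
      # Untested sketch: choose the default menu entry by hour of day.
      # datehook exposes YEAR, MONTH, DAY, HOUR, MINUTE, SECOND, WEEKDAY.
      insmod datehook

      # Assumption: entry 0 is Linux and entry 1 is Windows; adjust to your menu.
      set default=0

      # Assumed office hours: 08:00 up to (not including) 18:00 by the hardware clock.
      if [ "$HOUR" -ge 8 ]; then
        if [ "$HOUR" -lt 18 ]; then
          set default=1
        fi
      fi

      # Short timeout so the chosen entry boots without a long wait at the menu.
      set timeout=3
      ```

      The nested if statements avoid depending on compound test expressions; the menu still appears briefly, so a different entry can be picked by hand.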

  2. Thank you Zung. Your suggestion worked like a charm. See my solution to the problem at Linux Questions via the link below:


    1. Thank you TonyK!
      Your solution worked for me for unattended booting of PCs in a classroom.