Friday, February 24, 2012

Forcing SATA Speed Limit to 1.5Gbps

Recap: I've upgraded the hard disk on my laptop, installed Win7 Ultimate on it from upgrade media, installed Debian GNU/Linux to another partition on the same disk, and configured the Windows Boot Loader for dual boot. I was very pleased. It all seemed to work well, except for a small power management issue.

I got busy installing all the stuff I needed on both operating systems, setting up printers and other hardware, configuring mount points, network shares, backup, ssh, etc. All in all, this took a few days, during which the new disk withstood a lot of read/write operations (gigabytes at a time). Most of this work was rather boring.

And then, following one of many reboots, the Windows Boot Loader failed to start:
Windows failed to start. A recent hardware or software change might be the cause. To fix the problem:

    1. Insert your Windows installation disc and restart your computer.
    2. Choose your language settings, then click "Next."
    3. Click "Repair your computer."

If you do not have this disc, contact your system administrator or computer manufacturer for assistance.

    File: \Boot\BCD
    Status: 0xc000000e
    Info: An error occurred while attempting to read the boot configuration data.


ENTER=Continue
My first reaction was panic. My next reaction was to power cycle the laptop. It came up just fine: Windows seemed to work OK, and another reboot confirmed that the Linux partition was alive too.

I wasn't pleased anymore.

I searched for that error code, and found a lot of complaints, and a lot of "solutions" - but in most cases the error was permanent, and a reboot did not make it go away. This was a bad sign, but I had very little to work on, so I let it go.

Two days later Debian suddenly hard-locked on me. After power-cycling the box I was not able to find any error message in any of the log files.

And a few days after that, Debian failed to boot, dropping me into a limited shell, claiming that the boot device was missing.

After yet another Windows Boot Loader failure, I used grub-install to replace it with GRUB2, in the slim hope that it would fix the problem. Even if it didn't, at least I'd be able to inspect the system from within the GRUB2 shell when it failed.

GRUB2 did not fix anything: the laptop would still occasionally fail to boot - either GRUB2 would fail to find its own files, or the kernel would start but later on fail to find the root device.

During all that time, when my laptop was up and running, it would occasionally freeze for brief, but noticeable, periods of time. It took me a while to realize that this was not a case of a slow-to-load website, or icedove just taking its sweet time starting up - these hiccups were correlated with messages like the following being logged to /var/log/kern.log:
ata3: ATA_REG 0x40 ERR_REG 0x0
ata3: tag : dhfis dmafis sdbfis sactive 
ata3: tag 0x0: 1 1 0 1  
ata3.00: exception Emask 0x0 SAct 0x1ff SErr 0x0 action 0x6 frozen
ata3.00: failed command: WRITE FPDMA QUEUED
ata3.00: cmd 61/08:00:00:f8:4d/00:00:0c:00:00/40 tag 0 ncq 4096 out
         res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata3.00: status: { DRDY }
ata3.00: failed command: WRITE FPDMA QUEUED
ata3.00: cmd 61/08:08:18:f8:4d/00:00:0c:00:00/40 tag 1 ncq 4096 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)

...

ata3.00: status: { DRDY }
ata3: hard resetting link
ata3: nv: skipping hardreset on occupied port
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata3.00: configured for UDMA/133
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0 
ata3.00: device reported invalid CHS sector 0 
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
ata3.00: device reported invalid CHS sector 0
ata3: EH complete
At first glance it seemed to me that the hard disk was failing. I subjected it to a slew of tests (fsck, smartmontools, MHDD), and in all of them the disk was found to be in excellent shape.
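To get a feel for how often these events occur, the kernel log can be summarized per ATA port. The following is a rough sketch, not something from the original troubleshooting session: ata_summary is a hypothetical helper name, and the patterns are simply the "failed command:" and "hard resetting link" messages quoted above.

```shell
#!/bin/sh
# Hypothetical helper: count libata "failed command" and "hard resetting link"
# messages per port, reading kernel log text from stdin.
ata_summary() {
    awk '
        match($0, /ata[0-9]+/) {
            # Port name without the device suffix, e.g. "ata3" from "ata3.00"
            port = substr($0, RSTART, RLENGTH)
            seen[port] = 1
            if ($0 ~ /failed command:/)     failed[port]++
            if ($0 ~ /hard resetting link/) resets[port]++
        }
        END {
            for (p in seen)
                printf "%s failed=%d resets=%d\n", p, failed[p], resets[p]
        }
    '
}

# Example usage (reading the log usually requires root or the adm group):
#   ata_summary < /var/log/kern.log
```

A port that shows resets again and again while SMART and surface tests come back clean points away from the platters and toward the link or the controller.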

The next suspect in line was the on-board SATA controller. The laptop's mobo is based on nVidia's nForce4 430 (MCP51) chipset. Wikipedia has this to say about it:
There have also been data corruption issues associated with certain SATA 3 Gbit/s hard drives.
The suggestion offered by many online is to somehow force the SATA speed to 1.5Gbps. However, as far as I can tell, there are no BIOS settings on this laptop to do that, and the disk itself cannot be forced either (no jumpers).

After some more research I found that the SATA speed can be forced on the kernel command-line:
  1. modify /etc/default/grub:
    GRUB_CMDLINE_LINUX="libata.force=1.5Gbps"
  2. run update-grub as root
  3. reboot
This cannot solve early boot problems, but I was hoping that it would at least prevent the lockups and hiccups.
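After the reboot it is worth confirming that the option actually reached the kernel. A minimal sketch, assuming a Linux system with /proc mounted; libata_force is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print the value of libata.force found on a kernel
# command line, or nothing if the option is absent.
# $1 (optional): a command line string; defaults to the running kernel's.
libata_force() {
    cmdline=${1-$(cat /proc/cmdline)}
    for word in $cmdline; do
        case $word in
            libata.force=*) printf '%s\n' "${word#libata.force=}" ;;
        esac
    done
}

# Example usage on the running system:
#   libata_force
```

The kernel's parameter documentation also describes a per-port form, for example libata.force=3:1.5Gbps to target only ata3; whether that behaves any differently on this particular controller is not something this sketch can tell.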

Unfortunately, forcing the speed this way did not work as expected - here's what I get in the logs:
ata3: FORCE: PHY spd limit set to 1.5Gbps
ata3: SATA max UDMA/133 cmd 0x30c0 ctl 0x30b4 bmdma 0x3090 irq 23

...

ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
The kernel reports its intention to force the SATA controller to 1.5Gbps, but the link comes up at 3.0Gbps regardless. Sh*t.
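The two hex numbers in the "SATA link up" line tell the same story. In the SATA register layout, bits 7:4 of SStatus hold the negotiated speed, and bits 7:4 of SControl hold the speed limit requested by the host. The sketch below decodes those fields; the function names are made up for illustration, and the input is the hex value exactly as libata prints it (without the 0x prefix).

```shell
#!/bin/sh
# Extract the SPD field (bits 7:4) from a hex register value such as "123".
spd_field() {
    echo $(( (0x$1 >> 4) & 0xf ))
}

# Negotiated link speed, from SStatus.
sstatus_speed() {
    case $(spd_field "$1") in
        0) echo "no link" ;;
        1) echo "1.5 Gbps" ;;
        2) echo "3.0 Gbps" ;;
        3) echo "6.0 Gbps" ;;
        *) echo "unknown" ;;
    esac
}

# Host-requested speed limit, from SControl.
scontrol_limit() {
    case $(spd_field "$1") in
        0) echo "no limit" ;;
        1) echo "limit 1.5 Gbps" ;;
        2) echo "limit 3.0 Gbps" ;;
        3) echo "limit 6.0 Gbps" ;;
        *) echo "unknown" ;;
    esac
}

sstatus_speed 123     # SStatus from the log line above
scontrol_limit 300    # SControl from the log line above
```

For the values in the log, SStatus 123 decodes to a 3.0 Gbps link and SControl 300 to "no limit". If this reading is right, the 1.5Gbps limit was never programmed into the controller's register, which would be consistent with the link coming up at full speed.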

I still had another trick up my sleeve, but I've run out of steam here, so you'll have to wait for my next post.

Friday, February 17, 2012

Hard Disk Problems: Disabling Spindown

So I managed to upgrade my laptop's hard disk, and set up a dual-boot system with Windows 7 Ultimate and Debian/testing. It all seemed to be working just fine. I was pleased, and mildly surprised. It didn't last long, though.

The first sign of trouble was that my laptop would not come out of sleep mode ("suspend") - and this happened on both Windows and Linux. My theory, at the time, was that the antiquated SATA interface on my machine, and the new hard-disk's aggressive power management features, do not play well together. The common wisdom on the Net is that one has to prevent the OS from spinning down the hard disk when it goes to sleep.

Doing that on Windows 7 is left as an exercise to the reader. Linux, on the other hand, earns its reputation fair and square as a time-waster. You should add the following incantation to /etc/hdparm.conf:
/dev/sda {
        apm = 254
        spindown_time = 0
}
(where /dev/sda is the block device pointing to the hard-disk in question), and then restart hdparm (as root):
invoke-rc.d hdparm restart
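For a quick experiment before touching the config file, the same two settings can be applied directly with hdparm: the apm key corresponds to the -B option and spindown_time to -S. The sketch below only prints the command so that it can be reviewed first; hdparm_cmd is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the one-off hdparm command that matches an
# hdparm.conf stanza (apm -> -B, spindown_time -> -S).
# $1: block device, $2: apm value, $3: spindown_time value
hdparm_cmd() {
    dev=$1 apm=$2 spindown=$3
    printf 'hdparm -B %s -S %s %s\n' "$apm" "$spindown" "$dev"
}

# Prints the command only; run the printed line as root to apply it.
hdparm_cmd /dev/sda 254 0
```

Running hdparm -B /dev/sda with no value (as root) reports the drive's current APM level, which is a handy way to check whether a setting survived.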
The only trouble with the hdparm.conf approach is that it only works the first time the system goes to sleep; the second time, the system hangs again. This is caused by a long-standing wishlist bug in hdparm (see Ubuntu bug #199094 and Debian bug #510676), with a published fix that has not been applied for some reason. Basically, most hdparm settings are lost upon resuming from suspend.

The fix is to create a script, /etc/pm/sleep.d/20hdparm, with the following contents:
#!/bin/sh
# This script reinitializes the hard disk settings on resume.

# pm-utils passes the power event (suspend, resume, hibernate, thaw) as $1
case "$1" in
  resume|thaw)
    /usr/sbin/invoke-rc.d hdparm start >/dev/null
    ;;
  suspend|hibernate)
    # Not needed
    ;;
esac
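One detail that is easy to miss: pm-utils only runs hooks that are executable. A small sketch of installing the script with the right permissions follows; install_sleep_hook is a hypothetical helper, and the default target directory is the one named above.

```shell
#!/bin/sh
# Hypothetical helper: syntax-check a hook script and install it, executable,
# as 20hdparm in a pm-utils sleep.d directory.
# $1: path to the script, $2 (optional): target directory
install_sleep_hook() {
    src=$1
    dir=${2:-/etc/pm/sleep.d}
    sh -n "$src" || return 1              # parse only; nothing is executed
    install -m 755 "$src" "$dir/20hdparm"
}

# Example usage (as root):
#   install_sleep_hook ./20hdparm
```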
But even with this script in place, I ended up completely disabling sleep/hibernation, because I couldn't get the box to come out of hibernation, and I had some more hard disk trouble.

Stay tuned...

Friday, February 10, 2012

Dual Boot Windows 7 and Debian GNU/Linux

Previously on Machine-Cycle: your thrill-seeking host has decided to upgrade his laptop and set up a dual boot system, and has managed to install a vanilla Windows 7 Ultimate on a new clean hard disk, from upgrade media.

Onward and forward: time to install Debian and make this box useful.

I researched this step a bit and decided to use Windows' own boot manager to manage the selection of operating systems at boot time, so as to minimize the risk of somehow trashing the Windows 7 installation. I did, eventually, replace Windows boot manager with GRUB2, and it worked out just fine. I switched to GRUB2 because it allowed me to debug some hardware issues that cropped up later, but I'll leave that for a future post.
  1. download the Debian/testing netinst ISO image and burn it to a CD
  2. boot Debian Installer from CD
  3. select graphical install
  4. select manual partitioning: you should now see the list of existing disk partitions - the first, smaller, NTFS partition belongs to the Windows 7 boot manager, and should not be touched; the second, large, NTFS partition is the one we need to shrink, in order to fit in Debian
  5. resize the second NTFS partition listed to, say, 33 percent of the hard-drive's total size (this will take a while)
  6. select the remaining empty partition and install Debian into it (this will take a while)
  7. when prompted to install GRUB: you can let the Debian Installer perform its magic for you, and it should just work; otherwise, in order to keep the Windows boot manager: DO NOT install GRUB on the first partition (named, most likely, /dev/sda1), install grub on the new partition (/dev/sda3)
  8. complete the installation - note that you will not be able to boot into Debian just yet
  9. Windows will run a disk check upon reboot, and it should then start normally
  10. boot the system into a live CD/USB (Grml is a good choice) and copy the contents of the /dev/sda3 boot sector to a file on the second partition (mounted, for example, on /mnt/sda2):
    dd if=/dev/sda3 of=/mnt/sda2/debian.bin bs=512 count=1
  11. reboot into Windows
  12. open Command Prompt as administrator
  13. add a new entry to the boot manager's menu:
    bcdedit /create /d "Debian GNU/Linux" /application BOOTSECTOR
    this command prints the identifier of the new menu entry (a GUID in curly braces), which replaces {ID} in the following steps
  14. configure the new menu entry to boot C:\debian.bin, make it the last entry in the menu, and configure the menu to timeout after 30 seconds:
    bcdedit /set {ID} device partition=C:
    bcdedit /set {ID} path \debian.bin
    bcdedit /displayorder {ID} /addlast
    bcdedit /timeout 30
  15. (read more about bcdedit in this Microsoft TechNet article)
  16. you should now be able to select, upon reboot, to boot into either Windows 7 or Debian GNU/Linux
Going both ways ain't that simple.
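Since the Windows boot manager will blindly execute whatever is in debian.bin, it may be worth sanity-checking the file produced by dd in step 10 before rebooting. The sketch below assumes that a valid boot sector is exactly 512 bytes long and ends with the bytes 55 AA, the usual PC boot signature; check_bootsector is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file is exactly 512 bytes long
# and ends with the 0x55 0xAA boot signature.
check_bootsector() {
    f=$1
    size=$(wc -c < "$f" | tr -d ' ')
    sig=$(tail -c 2 "$f" | od -An -tx1 | tr -d ' \n')
    [ "$size" -eq 512 ] && [ "$sig" = "55aa" ]
}

# Example usage from the live system:
#   check_bootsector /mnt/sda2/debian.bin && echo "looks like a boot sector"
```

A passing check does not prove that the sector will boot Debian, only that dd copied one whole sector and that something resembling a boot record was installed on the partition.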

Friday, February 3, 2012

Clean Install of Windows 7 with Upgrade Media

A few weeks ago I made a decision. I decided to upgrade my crappy HP Pavilion dv6000 laptop. Yup, you're quite right - that was a mistake.

My plan was to replace the laptop's internal 60GB hard-drive with a shiny new 320GB Western Digital 2.5'' hard-drive, and then set up a dual boot system with Windows 7 and Debian GNU/Linux.

After some research, I concluded that the safest way to set up a dual boot system would be to first install a vanilla Windows 7 system, check that it works right, and only then complicate matters by installing Debian.

Now, about a year and a half ago, my wife purchased MS Office 2010 through her workplace, at a considerable discount. It came bundled with Windows 7 Ultimate upgrade installation media, which had been lying in some drawer ever since, gathering dust. I was really itching to use it.

But scratching that itch required me to solve two problems: a legal one (am I allowed to do this?) and a technical one: according to the installation instructions, and unlike previous versions, the Windows 7 upgrade installer seems to require a working OS to already be installed on the target system. The prospect of first installing Windows XP and then upgrading to Windows 7, did not appeal to me one bit. A quick search provided me with a way out of this - I'd still need to perform a "double-install", but of Windows 7 only:
  1. when prompted by the Windows 7 installer, select a "custom" install - do not, at any time, enter any code, activate, or update the OS
  2. reboot when instructed to do so
  3. re-install from the same installation media, but this time "upgrade" the existing installation, and again, do not enter any code, activate, or update the OS
  4. reboot when instructed to do so
  5. activate the OS using the activation key provided with the installation media
  6. update the OS
and it seems that it's perfectly legal, as long as you do own a full version of an older Microsoft OS.

To be continued...