zargony.com

#![desc = "Random thoughts of a software engineer"]

Assembling a partitionable software-RAID with mdadm: Device or resource busy?

This weekend, the error message "device or resource busy" almost drove me crazy when I was trying to assemble (activate) a software RAID array.

I was experimenting with a software RAID-1 setup (two mirrored disks) using VirtualBox before planning to deploy it to a live server. Because I never liked the idea of creating 4 partitions on each disk and mirroring them with individual arrays (/dev/md0 to /dev/md3), I planned to create one partitionable array that spans both whole disks (/dev/md_d0). That is, the one and only array /dev/md_d0 consists of the two whole disks /dev/sda and /dev/sdb. The advantage: the second drive is an exact copy of the first one -- even the partition table and master boot record are identical -- just like with a hardware RAID. So in an emergency, you can even use one of the disks directly, without activating RAID drivers.

So I created a virtual machine with two hard drives to test the setup. Inside the VM, I booted using a LiveCD. Setting up the software RAID was straightforward (mdadm --create /dev/md_d0 --auto=mdp --level=1 --raid-devices=2 /dev/sda /dev/sdb) and it worked out of the box... So I partitioned it and created filesystems on it. However, when I rebooted the VM and tried to activate (assemble) the array (mdadm --assemble --scan), it didn't work. "Device or resource busy" was the error message I always got. This error message usually means that the device cannot be used because it is already in use, most often because a filesystem on it is currently mounted.

However, I was absolutely sure that no filesystem was mounted from that disk, since neither mount nor lsof showed anything suspicious.

Well, after quite some time, I found out that after more than 15 years of experience with Linux administration, I fell prey to a beginner's mistake: mounted filesystems are not the only thing that keeps a disk busy... there's also swap space on disks...

In my case, the LiveCD detected a usable swap space on each disk and automatically activated it. The swap partition was actually meant to be used through the array as /dev/md_d0p2 (second partition of the array), but since the array uses the whole disks sda and sdb, each disk also contains the partition table, and therefore the LiveCD found the swap partitions on the raw disks and used them.

So as a note to myself and everybody who also encounters inexplicable "device or resource busy" errors: Don't forget to check the swap space :)
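For that swap check, here is a minimal sketch of how it could be scripted. It assumes the usual Linux /proc/swaps layout (one header line, then one line per active swap area with the device path in the first column and the type in the second) and sd-style device names. The function name swaps_on_disks is made up for this example and is not part of mdadm or any other tool.

```shell
#!/bin/sh
# Hypothetical helper: list active swap partitions that live on the given
# whole disks.
# Usage: swaps_on_disks SWAPS_FILE DISK...
# SWAPS_FILE is normally /proc/swaps; it is a parameter so that the function
# can also be tried against a saved copy of that file.
swaps_on_disks() {
    swaps_file=$1
    shift
    # Skip the header line, keep only entries of type "partition",
    # and print the device path (first column).
    awk 'NR > 1 && $2 == "partition" { print $1 }' "$swaps_file" |
    while read -r dev; do
        for disk in "$@"; do
            case $dev in
                # Match the disk itself or a numbered partition on it
                # (sd-style names such as /dev/sda2 are assumed).
                "$disk"|"$disk"[0-9]*) echo "$dev" ;;
            esac
        done
    done
}
```

On the LiveCD, something like swaps_on_disks /proc/swaps /dev/sda /dev/sdb should then print the offending partitions. Running swapoff on each of them releases the disks, after which mdadm --assemble --scan can be tried again.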

In the end, this problem led me to consider whether it would be better to create partitions for RAID data (type FD) and use them instead of the whole disks, so that a booting system wouldn't detect usable partitions without assembling the array. However, I had trouble installing a boot loader in such a configuration. Unfortunately, I could get neither LILO nor GRUB to boot the kernel.