So the secondary hard drive in my kids’ Linux game computer died, with the click of death. Sigh. Fortunately, I had built it with software RAID1 and could easily get it fixed, right?
Well, it was my first time. So no, it wasn’t that easy. Actually, it was way easier than I had any right to expect, but I had to read some stuff.
The most annoying thing? The Linux Documentation Project Software RAID HOWTO does not tell you how to add a device.
OK, on with the story. So I found out which drive was failing (I hadn’t labeled which was the master and which the slave, so I used the very technical method of turning the machine off and unplugging each drive until the clicking stopped) and then removed it. I found another drive lying around that was about as big… 6448 MB instead of 6449 MB. Oh, for want of one megabyte… but little did I know.
So I got it hooked up and then booted. I made my way into ‘cfdisk’ and couldn’t see the second disk. I had a hissy until I realized I needed ‘cfdisk /dev/hdb’ and then it was there (boy I can be dumb)…
I partitioned the disk as closely as I could to the same sizes, but I couldn’t quite make them match… the sectors were counted differently across manufacturers. Had I been smart in the original build, I would have put in mismatched disks so I wouldn’t then depend on an exact sector match. Oh well.
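In hindsight, a quick size check would have shown which partitions were going to be a problem. Here is a minimal sketch, assuming both sizes are given in 512-byte sectors as printed by `blockdev --getsz`. The function name `partition_fits` is made up for illustration, and the check is deliberately conservative: md really compares against the array's per-device size, which can be a little smaller than the surviving partition.

```shell
#!/bin/sh
# partition_fits EXISTING_SECTORS CANDIDATE_SECTORS
# Hypothetical helper: succeeds when the candidate partition is at least
# as large as the surviving half of the mirror. Both arguments are sizes
# in 512-byte sectors (the unit `blockdev --getsz` reports).
partition_fits() {
    existing=$1
    candidate=$2
    [ "$candidate" -ge "$existing" ]
}

# Example use on the real machine (needs root; device names taken from
# this post, so adjust them for your own setup):
#   if partition_fits "$(blockdev --getsz /dev/hda2)" \
#                     "$(blockdev --getsz /dev/hdb2)"; then
#       echo "hdb2 is big enough to mirror hda2"
#   else
#       echo "hdb2 is too small to mirror hda2"
#   fi
```

Making every new partition a hair larger than its partner, where the disk allows it, avoids having to chase an exact sector match.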
Then I used the information in an md RAID reconstruction thread and a Debian Sarge RAID1 article to get the mirrors re-added…
mdadm --add /dev/md0 /dev/hdb1
mdadm --add /dev/md2 /dev/hdb3
but mdadm --add /dev/md1 /dev/hdb2 didn’t work, since the partition was off by just a bit (md wants the new partition to be at least as large as the one it is replacing). Thankfully that is my swap partition (ok so I’m not always dumb) and it will just have to run without a mirror. If the machine goes down and loses the first hard drive, I’ll reconstruct the swap partition in a new md device and then we’ll be fine even with a new, different-size drive. The handy tip offered was watching the reconstruction with:
watch -n 6 cat /proc/mdstat
which made me feel better because I could see it being fixed.
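For checking the arrays after the fact instead of watching them, here is a rough sketch that lists any md array still missing a member. It assumes the usual /proc/mdstat layout, where each array has a status like `[UU]` and an underscore marks a missing or still-rebuilding device. The function name `degraded_arrays` is my own invention, not an mdadm feature.

```shell
#!/bin/sh
# degraded_arrays [FILE]
# Hypothetical helper: prints the name of every md array whose member
# status (e.g. "[U_]") contains an underscore. Reads /proc/mdstat unless
# another file in the same format is given.
degraded_arrays() {
    awk '
        # Remember the array name from lines like "md1 : active raid1 ..."
        /^md[0-9]+ :/ { name = $1 }
        # A status such as [U_] or [_U] means one mirror half is absent
        /\[[U_]*_[U_]*\]/ { print name }
    ' "${1:-/proc/mdstat}"
}
```

Running `degraded_arrays` with no arguments reads the live /proc/mdstat. An array that is still rebuilding will also show up, since its status keeps the underscore until the sync finishes.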
Thanks to the folks who wrote up their experiences… I now have a working computer again (and did I mention I really am glad the first machine to fail was the kids’ computer?) and will not be as concerned the next time I lose a drive.