Bootable Software RAID-1 on running system

This was tested on Debian Squeeze, but it should work on most Debian-derived distributions, and on other Linux distributions with small changes. I wrote almost the same guide some years ago, but I wanted to add an extra disk to my QNAP NAS and decided to write it down again. Requirements: two identical disks.

Installation

Partition and install Debian GNU/Linux onto the first disk (sda in my case) if you don’t already have a running system. Configure the system and make sure everything works. Install the software RAID administration tool ‘mdadm’:

# apt-get install mdadm

Partition the second, unused disk (sdb) with the same layout as the first disk (sda):

# sfdisk --dump /dev/sda | sfdisk --force /dev/sdb

Reboot into single-user mode to make sure the partition table is re-read and to avoid having running services while copying data.
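To confirm the clone succeeded, the two partition tables can be diffed after normalizing away the device names. A minimal sketch; the dump content below is illustrative sample data in sfdisk --dump format, not output from real disks:

```shell
# On the real system these would come from the disks themselves:
#   sfdisk --dump /dev/sda > /tmp/sda.dump
#   sfdisk --dump /dev/sdb > /tmp/sdb.dump
# Illustrative sample dump, not real data:
cat > /tmp/sda.dump <<'EOF'
/dev/sda1 : start=       63, size=   681984, Id=fd, bootable
/dev/sda2 : start=   682047, size=2929590274, Id= 5
EOF
sed 's/sda/sdb/g' /tmp/sda.dump > /tmp/sdb.dump

# Normalize the device names away; identical tables diff clean.
sed 's/sda/sdX/g' /tmp/sda.dump > /tmp/sda.norm
sed 's/sdb/sdX/g' /tmp/sdb.dump > /tmp/sdb.norm
diff /tmp/sda.norm /tmp/sdb.norm && echo "partition tables match"
```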

Plan your RAID devices

We will create a RAID device for each partition. I have the following layout on my first disk:

Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          43      340992   83  Linux
/dev/sda2              43      182402  1464795137    5  Extended
/dev/sda5              43         773     5858304   83  Linux
/dev/sda6             773        1137     2928640   83  Linux
/dev/sda7            1137        1514     3022848   82  Linux swap / Solaris
/dev/sda8            1514        1562      389120   83  Linux
/dev/sda9            1562      182402  1452592128   83  Linux

This means that I will end up with 6 different RAID devices:

/dev/md0  (sda1 + sdb1)  mounted on /
/dev/md1  (sda5 + sdb5)  mounted on /usr
/dev/md2  (sda6 + sdb6)  mounted on /var
/dev/md3  (sda7 + sdb7)  used for swap
/dev/md4  (sda8 + sdb8)  mounted on /tmp
/dev/md5  (sda9 + sdb9)  mounted on /home

Create RAID devices

Before we create any RAID devices, we should change the partition types from 83 (Linux) and 82 (Linux swap) to fd (Linux raid autodetect). Do this with the fdisk command. My /dev/sdb partition table now looks like this:

Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          43      340992   fd  Linux raid autodetect
/dev/sdb2              43      182402  1464795137    5  Extended
/dev/sdb5              43         773     5858304   fd  Linux raid autodetect
/dev/sdb6             773        1137     2928640   fd  Linux raid autodetect
/dev/sdb7            1137        1514     3022848   fd  Linux raid autodetect
/dev/sdb8            1514        1562      389120   fd  Linux raid autodetect
/dev/sdb9            1562      182402  1452592128   fd  Linux raid autodetect
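Retyping each partition by hand in fdisk is tedious; on Squeeze-era util-linux the non-interactive sfdisk --change-id can script it (newer releases renamed the option to --part-type). A dry-run sketch that only prints the commands it would run:

```shell
# Partitions to retype; sdb2 is the extended container and stays type 5.
disk=/dev/sdb
for part in 1 5 6 7 8 9; do
    # Drop the 'echo' to actually change the type to fd (Linux raid autodetect).
    echo sfdisk --change-id "$disk" "$part" fd
done
```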

We cannot create the RAID devices with both disks at once, since the partitions on sda still hold the running system. Instead, mdadm lets us create each array in degraded mode using only the partition from the new, unused disk: we put the word ‘missing’ in place of the sda partition. We create the first RAID device with:

mdadm --create /dev/md0 --chunk=64 --level=1 --raid-devices=2 missing /dev/sdb1

And we create the rest of the RAID devices:

mdadm --create /dev/md1 --chunk=64 --level=1 --raid-devices=2 missing /dev/sdb5
mdadm --create /dev/md2 --chunk=64 --level=1 --raid-devices=2 missing /dev/sdb6
mdadm --create /dev/md3 --chunk=64 --level=1 --raid-devices=2 missing /dev/sdb7
mdadm --create /dev/md4 --chunk=64 --level=1 --raid-devices=2 missing /dev/sdb8
mdadm --create /dev/md5 --chunk=64 --level=1 --raid-devices=2 missing /dev/sdb9

Type ‘cat /proc/mdstat‘ to see the 6 degraded RAID devices:

# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md5 : active (auto-read-only) raid1 sdb9[1]
      1452590968 blocks super 1.2 [2/1] [_U]

md4 : active (auto-read-only) raid1 sdb8[1]
      389108 blocks super 1.2 [2/1] [_U]

md3 : active (auto-read-only) raid1 sdb7[1]
      3021812 blocks super 1.2 [2/1] [_U]

md2 : active (auto-read-only) raid1 sdb6[1]
      2927604 blocks super 1.2 [2/1] [_U]

md1 : active (auto-read-only) raid1 sdb5[1]
      5857268 blocks super 1.2 [2/1] [_U]

md0 : active (auto-read-only) raid1 sdb1[1]
      340980 blocks super 1.2 [2/1] [_U]

unused devices: <none>
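The [2/1] [_U] pairs are what mark an array as degraded: two member slots, one active, with the underscore showing the missing member. A small sketch that counts degraded mirrors, run here against a saved sample rather than the live /proc/mdstat (on a real system, set mdstat=/proc/mdstat):

```shell
# Sample mdstat content; on a real system use mdstat=/proc/mdstat.
mdstat=/tmp/mdstat.sample
cat > "$mdstat" <<'EOF'
md1 : active (auto-read-only) raid1 sdb5[1]
      5857268 blocks super 1.2 [2/1] [_U]

md0 : active (auto-read-only) raid1 sdb1[1]
      340980 blocks super 1.2 [2/1] [_U]
EOF

# Each line with an underscore inside [...] is one degraded mirror.
degraded=$(grep -c '\[[U_]*_[U_]*\]' "$mdstat")
echo "$degraded degraded array(s)"
```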

Format, mount and copy RAID devices

Prepare and copy the root partition (md0 mounted on /mnt)

mkfs.ext3 /dev/md0
mount /dev/md0 /mnt
cd / && find . -xdev | cpio -pm /mnt

Prepare and copy /usr partition (md1 mounted on /mnt/usr)

mkfs.ext3 /dev/md1
mount /dev/md1 /mnt/usr
cd /usr && find . -xdev | cpio -pm /mnt/usr

Prepare and copy /var partition (md2 mounted on /mnt/var)

mkfs.ext3 /dev/md2
mount /dev/md2 /mnt/var
cd /var && find . -xdev | cpio -pm /mnt/var

Prepare swap partition (md3)

mkswap /dev/md3

Prepare /tmp partition (md4 mounted on /mnt/tmp)

mkfs.ext3 /dev/md4
mount /dev/md4 /mnt/tmp
chmod 1777 /mnt/tmp

Prepare /home partition (md5 mounted on /mnt/home)

mkfs.ext4 /dev/md5
mount /dev/md5 /mnt/home
cd /home && find . -xdev | cpio -pm /mnt/home

Change fstab

Edit /mnt/etc/fstab and change all devices to the new RAID devices. My /mnt/etc/fstab now looks like this:

# <file system> <mount point> <type> <options> <dump> <pass>
proc /proc proc defaults 0 0

# / was on /dev/sda1 during installation
/dev/md0 / ext3 errors=remount-ro 0 1

# /home was on /dev/sda9 during installation
/dev/md5 /home ext4 defaults 0 2

# /tmp was on /dev/sda8 during installation
/dev/md4 /tmp ext3 defaults 0 2

# /usr was on /dev/sda5 during installation
/dev/md1 /usr ext3 defaults 0 2

# /var was on /dev/sda6 during installation
/dev/md2 /var ext3 defaults 0 2

# swap was on /dev/sda7 during installation
/dev/md3 none swap sw 0 0

/dev/scd0 /media/cdrom0 udf,iso9660 user,noauto 0 0
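The edit can also be scripted with sed, mapping each old device to its new array. A sketch run against a sample copy rather than the real /mnt/etc/fstab; the mapping mirrors the layout above, so adjust it to yours:

```shell
# Work on a sample copy; the real target would be /mnt/etc/fstab.
fstab=/tmp/fstab.new
cat > "$fstab" <<'EOF'
/dev/sda1 / ext3 errors=remount-ro 0 1
/dev/sda9 /home ext4 defaults 0 2
/dev/sda7 none swap sw 0 0
EOF

# Old partition -> new RAID device, per the table in this guide.
sed -i -e 's|/dev/sda1 |/dev/md0 |' \
       -e 's|/dev/sda9 |/dev/md5 |' \
       -e 's|/dev/sda7 |/dev/md3 |' "$fstab"
cat "$fstab"
```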

Edit /boot/grub/grub.cfg and change root= so it points to /dev/md0.
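This too is a one-line sed; a sketch against a sample copy of the file (on the real system the target is /boot/grub/grub.cfg, and the kernel line shown is illustrative):

```shell
# Sample grub.cfg fragment; the real target is /boot/grub/grub.cfg.
cfg=/tmp/grub.cfg.sample
cat > "$cfg" <<'EOF'
linux /boot/vmlinuz-2.6.32-5-686 root=/dev/sda1 ro quiet
EOF

# Point the kernel's root= at the first RAID device.
sed -i 's|root=/dev/sda1|root=/dev/md0|' "$cfg"
grep root= "$cfg"
```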

Reboot onto the RAID devices

This worked the first time for me, without any console access 🙂 Run update-grub when you are logged in again, and also install the boot loader onto the second disk with grub-install /dev/sdb, so the system can still boot if sda ever fails.

Fix partition types on sda

Now that we no longer use the partitions on our original system disk (sda), we can edit the partition table and set the partitions to type ‘fd Linux raid autodetect‘. My sda partition table now looks like this:

Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          43      340992   fd  Linux raid autodetect
/dev/sda2              43      182402  1464795137    5  Extended
/dev/sda5              43         773     5858304   fd  Linux raid autodetect
/dev/sda6             773        1137     2928640   fd  Linux raid autodetect
/dev/sda7            1137        1514     3022848   fd  Linux raid autodetect
/dev/sda8            1514        1562      389120   fd  Linux raid autodetect
/dev/sda9            1562      182402  1452592128   fd  Linux raid autodetect

Add sda partitions to RAID devices

Now it’s time to add the unused partitions from sda to our degraded RAID devices.

mdadm --add /dev/md0 /dev/sda1

We can see the RAID device (md0) rebuilding onto the newly added partition with ‘cat /proc/mdstat‘. Add the other partitions:

mdadm --add /dev/md1 /dev/sda5
mdadm --add /dev/md2 /dev/sda6
mdadm --add /dev/md3 /dev/sda7
mdadm --add /dev/md4 /dev/sda8
mdadm --add /dev/md5 /dev/sda9

To watch the rebuild, run ‘watch cat /proc/mdstat‘:

Every 2,0s: cat /proc/mdstat                                Mon Dec  6 19:55:04 2010

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sda1[2] sdb1[1]
      340980 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sda5[2] sdb5[1]
      5857268 blocks super 1.2 [2/2] [UU]

md2 : active raid1 sda6[2] sdb6[1]
      2927604 blocks super 1.2 [2/1] [_U]
        resync=DELAYED

md3 : active raid1 sda7[2] sdb7[1]
      3021812 blocks super 1.2 [2/1] [_U]
        resync=DELAYED

md4 : active raid1 sda8[2] sdb8[1]
      389108 blocks super 1.2 [2/1] [_U]
        resync=DELAYED

md5 : active raid1 sda9[2] sdb9[1]
      1452590968 blocks super 1.2 [2/1] [_U]
      [>....................]  recovery =  0.6% (9617600/1452590968) finish=268.3min speed=89629K/sec

unused devices: <none>

Don’t reboot until all devices have finished rebuilding.
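Instead of watching by hand, a small loop can block until the resync is done. A sketch parameterized on the status file so it can be tried against a saved sample; on the live system, set mdstat=/proc/mdstat:

```shell
# Sample of a finished rebuild; with mdstat=/proc/mdstat this polls the kernel.
mdstat=/tmp/mdstat.done
cat > "$mdstat" <<'EOF'
md0 : active raid1 sda1[2] sdb1[1]
      340980 blocks super 1.2 [2/2] [UU]
EOF

# Loop while any array still reports a resync or recovery in progress.
while grep -Eq 'resync|recovery' "$mdstat"; do
    sleep 60
done
echo "all arrays in sync, safe to reboot"
```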