troy's unix space: Breaking and Syncing an MD Root Mirror

Here's one of those times where I thought, "I wonder", as in I wonder if
I can break an MD root mirror and sanely recreate it. You can do so with
other software RAID solutions, so why not here? Well, you can. The
procedure is not short, though: it takes 80+ commands to accomplish.
Then again, if you are getting ready to do something to a host that
is potentially going to wreck the root disk, breaking the mirror first
can quickly get you back up and running if things go awry. Given the
amount of detail that follows, I've broken things up into two sections,
COMMANDS and DETAILED, with NOTES at the end. COMMANDS only provides the
commands necessary without any details or discussion, whereas DETAILED
shows everything involved. Our host details for this are:

        HOST:                   tux
        PROMPTS:                [tux [0] |sh-3.2# ]
        OS:                     CentOS 5.4 Linux
        MIRRORS:                md1 (rootfs (/)), md2 (varfs (/var)), md3 (SWAP-md3 (swap))
        MD1 COMPONENTS:         sda1, sdb1
        MD2 COMPONENTS:         sda2, sdb2
        MD3 COMPONENTS:         sda3, sdb3
        EXISTING FSTAB:
            LABEL=rootfs            /                       ext3    defaults        1 1
            LABEL=varfs             /var                    ext3    defaults        1 2
            tmpfs                   /dev/shm                tmpfs   defaults        0 0
            devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
            sysfs                   /sys                    sysfs   defaults        0 0
            proc                    /proc                   proc    defaults        0 0
            LABEL=SWAP-md3          swap                    swap    defaults        0 0
        EXISTING GRUB.CONF:
            default=2
            timeout=5
            title CentOS d1 (2.6.18-164.el5)
                    root (hd0,0)
                    kernel /boot/vmlinuz-2.6.18-164.el5 ro root=LABEL=rootfs
                    initrd /boot/initrd-2.6.18-164.el5.img
            title CentOS d2 (2.6.18-164.el5)
                    root (hd1,0)
                    kernel /boot/vmlinuz-2.6.18-164.el5 ro root=LABEL=rootfs
                    initrd /boot/initrd-2.6.18-164.el5.img
            title CentOS RAID1 d1 (2.6.18-164.el5)
                    root (hd0,0)
                    kernel /boot/vmlinuz-2.6.18-164.el5 ro root=LABEL=rootfs
                    initrd /boot/initrd-2.6.18-164.el5-raid.img
            title CentOS RAID1 d2 (2.6.18-164.el5)
                    root (hd1,0)
                    kernel /boot/vmlinuz-2.6.18-164.el5 ro root=LABEL=rootfs
                    initrd /boot/initrd-2.6.18-164.el5-raid.img

COMMANDS

        tux [0] /sbin/grub-install /dev/sda
        tux [0] /sbin/grub-install /dev/sdb
        tux [0] /bin/sync
        tux [0] /sbin/mdadm /dev/md1 --fail /dev/sda1 --remove /dev/sda1
        tux [0] /sbin/mdadm /dev/md2 --fail /dev/sda2 --remove /dev/sda2
        tux [0] /sbin/mdadm /dev/md3 --fail /dev/sda3 --remove /dev/sda3
        tux [0] /bin/cat /proc/mdstat
        tux [0] /sbin/mdadm --zero-superblock /dev/sda1
        tux [0] /sbin/mdadm --zero-superblock /dev/sda2
        tux [0] /sbin/mdadm --zero-superblock /dev/sda3
        tux [0] for i in sda1 sda2 ; do echo "** ${i} **" ; /sbin/fsck -y /dev/${i} ; done
        tux [0] [ ! -d /a ] && /bin/mkdir /a
        tux [1] /bin/mount /dev/sda1 /a
        tux [0] /bin/mount /dev/sda2 /a/var
        tux [0] /bin/cp /a/etc/fstab /a/etc/fstab.md
        ## update /a/etc/fstab with sda values
        tux [0] /bin/cp /a/boot/grub/grub.conf /a/boot/grub/md.grub.conf
        tux [0] /bin/sed -e 's;default=2;default=0;' -e '1,/CentOS d2/ s;root=LABEL=rootfs;root=/dev/sda1;' /a/boot/grub/md.grub.conf > /a/boot/grub/grub.conf
        tux [0] /bin/sed -e 's;default=2;default=3;' /a/boot/grub/md.grub.conf > /boot/grub/grub.conf
        ## if necessary for your updates, mount pseudo FS and chroot to /a
        ## tux [0] /bin/mount -o bind /proc /a/proc
        ## tux [0] /bin/mount -o bind /dev /a/dev
        ## tux [0] /bin/mount -o bind /sys /a/sys
        ## tux [0] /usr/sbin/chroot /a
        ## willitstay is simply to illustrate updates
        tux [0] echo "I wonder if this will stay" >> /a/willitstay
        tux [0] /bin/cat /a/willitstay
        tux [0] /bin/umount /a/var
        tux [0] /bin/umount /a
        tux [0] for i in sda1 sda2 ; do echo "** ${i} **" ; /sbin/fsck -y /dev/${i} ; done
        tux [0] /sbin/reboot
        ## boot to disk 1 (sda, hd0), verify your updates, remove MD mirror (see note 1)
        tux [0] /bin/cat /proc/cmdline
        tux [0] /bin/df -h
        tux [0] /sbin/swapon -s
        tux [0] /bin/cat /willitstay
        tux [0] /bin/cat /proc/mdstat
        tux [0] /sbin/mdadm --stop /dev/md1
        tux [0] /sbin/mdadm --stop /dev/md2
        tux [0] /sbin/mdadm --stop /dev/md3
        tux [0] /sbin/mdadm --remove /dev/md1
        tux [0] /sbin/mdadm --remove /dev/md2
        tux [0] /sbin/mdadm --remove /dev/md3
        tux [0] /sbin/mdadm --zero-superblock /dev/sdb1
        tux [0] /sbin/mdadm --zero-superblock /dev/sdb2
        tux [0] /sbin/mdadm --zero-superblock /dev/sdb3
        tux [0] /bin/cat /proc/mdstat
        tux [0] for i in sdb1 sdb2 ; do echo "** ${i} **" ; /sbin/fsck -y /dev/${i} ; done
        tux [0] [ ! -d /a ] && /bin/mkdir /a
        tux [1] /bin/mount /dev/sdb1 /a
        tux [0] /bin/cp /a/boot/grub/grub.conf /a/boot/grub/md.grub.conf
        tux [0] /bin/sed -e 's;default=3;default=1;' -e '/CentOS d2/,/RAID1 d1/ s;root=LABEL=rootfs;root=/dev/sdb1;' /a/boot/grub/md.grub.conf > /a/boot/grub/grub.conf
        tux [0] /bin/sed -e 's;sda;sdb;g' /etc/fstab > /a/etc/fstab
        tux [0] /bin/umount /a
        tux [0] /sbin/fsck -y /dev/sdb1
        tux [0] /sbin/reboot
        ## boot to disk 2 (sdb, hd1), reconstruct MD mirror on sda
        sh-3.2# /bin/df -h
        sh-3.2# /sbin/swapon -s
        sh-3.2# /sbin/mdadm -C /dev/md1 --level=raid1 --raid-device=2 /dev/sda1 missing
        sh-3.2# /sbin/mdadm -C /dev/md2 --level=raid1 --raid-device=2 /dev/sda2 missing
        sh-3.2# /sbin/mdadm -C /dev/md3 --level=raid1 --raid-device=2 /dev/sda3 missing
        sh-3.2# for i in md1 md2 ; do echo "** ${i} **" ; /sbin/fsck -y /dev/${i} ; done
        sh-3.2# [ ! -d /a ] && /bin/mkdir /a
        sh-3.2# /bin/mount /dev/md1 /a
        ## update /a/etc/fstab using md values instead of sda values
        ## update /a/boot/grub/grub.conf "RAID1 d1" entry with "root=/dev/md1"
        sh-3.2# echo "DEVICE /dev/sda* /dev/sdb*" > /a/etc/mdadm.conf
        sh-3.2# /sbin/mdadm --detail --scan >> /a/etc/mdadm.conf
        sh-3.2# /bin/umount /dev/md1
        sh-3.2# /sbin/fsck -y /dev/md1
        sh-3.2# /usr/bin/reboot
        ## boot to disk 1 (sda, hd0), add sdb into recreated mirrors
        tux [0] /bin/df -h
        tux [0] /sbin/swapon -s
        tux [0] /bin/cat /proc/mdstat
        tux [0] /bin/cat /willitstay 
        tux [0] /sbin/mdadm --add /dev/md1 /dev/sdb1
        tux [0] /sbin/mdadm --add /dev/md2 /dev/sdb2
        tux [0] /sbin/mdadm --add /dev/md3 /dev/sdb3 
        tux [0] /bin/cat /proc/mdstat
        tux [0] /sbin/mdadm --detail /dev/md1
        tux [0] /bin/cat /etc/fstab
        tux [0] grep -v ^# /boot/grub/grub.conf
        tux [0] > /etc/blkid/blkid.tab
        tux [0] /sbin/blkid
        tux [0] /bin/cat /proc/mdstat

DETAILED

Before breaking our root mirror, ensure that both sides of the array
are bootable:

        tux [0] /sbin/grub-install /dev/sda
        Installation finished. No error reported. 
        This is the contents of the device map /boot/grub/device.map.
        Check if this is correct or not. If any of the lines is incorrect,
        fix it and re-run the script `grub-install'.

        # this device map was generated by anaconda
        (hd0)     /dev/sda
        (hd1)     /dev/sdb
        tux [0] /sbin/grub-install /dev/sdb
        Installation finished. No error reported.
        This is the contents of the device map /boot/grub/device.map.
        Check if this is correct or not. If any of the lines is incorrect,
        fix it and re-run the script `grub-install'.

        # this device map was generated by anaconda
        (hd0)     /dev/sda
        (hd1)     /dev/sdb

Now, using 'sync', flush all data to disk and subsequently break the
root mirror to free 'sda' from MD control:

        tux [0] /bin/sync
        tux [0] /sbin/mdadm /dev/md1 --fail /dev/sda1 --remove /dev/sda1
        mdadm: set /dev/sda1 faulty in /dev/md1
        mdadm: hot removed /dev/sda1
        tux [0] /sbin/mdadm /dev/md2 --fail /dev/sda2 --remove /dev/sda2
        mdadm: set /dev/sda2 faulty in /dev/md2
        mdadm: hot removed /dev/sda2
        tux [0] /sbin/mdadm /dev/md3 --fail /dev/sda3 --remove /dev/sda3
        mdadm: set /dev/sda3 faulty in /dev/md3
        mdadm: hot removed /dev/sda3
        tux [0] /bin/cat /proc/mdstat
        Personalities : [raid1]
        md2 : active raid1 sdb2[1]
              2096384 blocks [2/1] [_U]

        md3 : active raid1 sdb3[1]
              1052160 blocks [2/1] [_U]

        md1 : active raid1 sdb1[1]
              7590592 blocks [2/1] [_U]

        unused devices: <none>
        tux [0] /sbin/mdadm --zero-superblock /dev/sda1
        tux [0] /sbin/mdadm --zero-superblock /dev/sda2
        tux [0] /sbin/mdadm --zero-superblock /dev/sda3
        tux [0] /sbin/mdadm --examine /dev/sda1 /dev/sda2 /dev/sda3
        mdadm: No md superblock detected on /dev/sda1.
        mdadm: No md superblock detected on /dev/sda2.
        mdadm: No md superblock detected on /dev/sda3.
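
The fail, remove, and zero-superblock steps above repeat the same pattern
for each array. As a rough sketch (not part of the original procedure),
a small helper can print those commands for review before anything is
run. The function name 'break_side_cmds' is hypothetical, and the pairing
of mdN with partition N of each disk is an assumption taken from this
host's layout. It only echoes commands and executes nothing:

```shell
# Dry-run helper: print the commands that detach one disk from md1..md3.
# Assumes the layout used here: mdN is built from partition N of each disk.
break_side_cmds() {
    disk=$1                      # e.g. sda
    for n in 1 2 3 ; do
        echo "/sbin/mdadm /dev/md${n} --fail /dev/${disk}${n} --remove /dev/${disk}${n}"
    done
    for n in 1 2 3 ; do
        echo "/sbin/mdadm --zero-superblock /dev/${disk}${n}"
    done
}

# Print the six commands for the 'sda' side; nothing is executed.
break_side_cmds sda
```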

Yes, even zero out the MD superblock. Otherwise, MD will try to recover
the 'sda' devices later. Check the filesystems on 'sda' for any issues,
and mount them under '/a':

        tux [1] for i in sda1 sda2 ; do echo "** ${i} **" ; /sbin/fsck -y /dev/${i} ; done
        ** sda1 **
        fsck 1.39 (29-May-2006)
        e2fsck 1.39 (29-May-2006)
        rootfs: recovering journal
        rootfs: clean, 71374/950272 files, 562967/1897648 blocks
        ** sda2 **
        fsck 1.39 (29-May-2006)
        e2fsck 1.39 (29-May-2006)
        varfs: recovering journal
        varfs: clean, 1008/262144 files, 26510/524096 blocks
        tux [0] [ ! -d /a ] && /bin/mkdir /a
        tux [1] /bin/mount /dev/sda1 /a
        tux [0] /bin/mount /dev/sda2 /a/var

Since we will be booting to 'sda' in a little bit, back up '/a/etc/fstab'
and update it with the 'sda' devices. (Volume labels will not work as
intended at the moment, since the 'sda' partitions carry the same labels
as the 'md' devices.) While we are in the backup mood, back up
'/a/boot/grub/grub.conf', and update it, setting a new "default" value
and changing the first "kernel" line to mount root from '/dev/sda1':

        tux [0] /bin/cp /a/etc/fstab /a/etc/fstab.md
        tux [0] /bin/cat << EOF > /a/etc/fstab
        /dev/sda1               /                       ext3    defaults        1 1
        /dev/sda2               /var                    ext3    defaults        1 2
        tmpfs                   /dev/shm                tmpfs   defaults        0 0
        devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
        sysfs                   /sys                    sysfs   defaults        0 0
        proc                    /proc                   proc    defaults        0 0
        /dev/sda3               swap                    swap    defaults        0 0
        EOF
        tux [0] /bin/cp /a/boot/grub/grub.conf /a/boot/grub/md.grub.conf
        tux [0] /bin/sed -e 's;default=2;default=0;' -e '1,/CentOS d2/ s;root=LABEL=rootfs;root=/dev/sda1;' /a/boot/grub/md.grub.conf > /a/boot/grub/grub.conf
        tux [0] /bin/sed -e 's;default=2;default=3;' /a/boot/grub/md.grub.conf > /boot/grub/grub.conf
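
The first 'sed' above relies on an address range: '1,/CentOS d2/' limits
the 'root=' substitution to the lines from the top of the file through
the first line matching "CentOS d2", so only the first "kernel" line
changes. A minimal sketch to preview that behaviour on a scratch copy,
assuming a trimmed two-entry sample (the sample content and the temporary
file from 'mktemp' are illustrative only, not the real 'grub.conf'):

```shell
# Preview the address-range substitution on a scratch copy of grub.conf.
sample=$(mktemp)
cat > "$sample" << 'EOF'
default=2
title CentOS d1 (2.6.18-164.el5)
        kernel /boot/vmlinuz-2.6.18-164.el5 ro root=LABEL=rootfs
title CentOS d2 (2.6.18-164.el5)
        kernel /boot/vmlinuz-2.6.18-164.el5 ro root=LABEL=rootfs
EOF

# Same expressions as in the procedure; the result is only printed.
preview=$(sed -e 's;default=2;default=0;' \
              -e '1,/CentOS d2/ s;root=LABEL=rootfs;root=/dev/sda1;' "$sample")
printf '%s\n' "$preview"
rm -f "$sample"
```

Only the "CentOS d1" kernel line should show 'root=/dev/sda1'; the
"CentOS d2" line falls outside the range and keeps its label.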

Above, we also updated the 'grub.conf' on '/dev/md1' to default to booting
from disk 2 (sdb1, hd1) with RAID availability, just in case our 'sda'
devices are unusable. At this point, 'sda' is completely free from MD
control, mounted to '/a' and available for us to perform any intended
updates or changes. If we need to work with 'sda' as the root device,
mount up our pseudo FS and 'chroot' to '/a':

        tux [0] /bin/mount -o bind /proc /a/proc
        tux [0] /bin/mount -o bind /dev /a/dev
        tux [0] /bin/mount -o bind /sys /a/sys
        tux [0] /usr/sbin/chroot /a

The following 'echo' to '/a/willitstay' is simply to illustrate an update
on 'sda' that doesn't exist on 'md' (thus 'sdb'):

        tux [0] echo "I wonder if this will stay" >> /a/willitstay
        tux [0] /bin/cat /a/willitstay
        I wonder if this will stay

Once the updates are complete, unmount the 'sda' FS and run 'fsck'
to ensure FS stability. After that, reboot the host:

        tux [0] /bin/umount /a/var
        tux [0] /bin/umount /a
        tux [0] for i in sda1 sda2 ; do echo "** ${i} **" ; /sbin/fsck -y /dev/${i} ; done
        ** sda1 ** 
        fsck 1.39 (29-May-2006) 
        e2fsck 1.39 (29-May-2006)
        rootfs: clean, 71377/950272 files, 562970/1897648 blocks
        ** sda2 **
        fsck 1.39 (29-May-2006)
        e2fsck 1.39 (29-May-2006)
        varfs: clean, 1008/262144 files, 26510/524096 blocks
        tux [0] /sbin/reboot

Once the BIOS is past POST, boot from disk 1. If GRUB doesn't default
to the non-RAID disk 1 (sda, hd0) entry, navigate to it and boot it:

         GNU GRUB  version 0.97  (639K lower / 1047488K upper memory)

        CentOS d1 (2.6.18-164.el5)                <==================
        CentOS d2 (2.6.18-164.el5)
        CentOS RAID1 d1 (2.6.18-164.el5)
        CentOS RAID1 d2 (2.6.18-164.el5)


           Use the <up> and <down> keys to select which entry is highlighted.
           Press enter to boot the selected OS, 'e' to edit the 
           commands before booting, 'a' to modify the kernel arguments
           before booting, or 'c' for a command-line.

        The highlighted entry will be booted automatically in 5 seconds.

        <snip...>

          Booting 'CentOS d1 (2.6.18-164.el5)'

        root (hd0,0)
         Filesystem type is ext2fs, partition type 0xfd
        kernel /boot/vmlinuz-2.6.18-164.el5 ro root=/dev/sda1
           [Linux-bzImage, setup=0x1e00, size=0x1c31b4] 
        initrd /boot/initrd-2.6.18-164.el5.img
           [Linux-initrd @ 0x37d73000, 0x27c402 bytes]
        <snip...>

With the system back online in runlevel 3, we verify that we are booted
from 'sda1' and that we are only using the 'sda' devices. We can also
see that our update took hold (see note 1):

        tux [0] /bin/cat /proc/cmdline 
        ro root=/dev/sda1
        tux [0] /bin/df -h 
        Filesystem            Size  Used Avail Use% Mounted on
        /dev/sda1             7.2G  2.1G  4.8G  31% /
        /dev/sda2             2.0G   72M  1.8G   4% /var
        tmpfs                 506M     0  506M   0% /dev/shm
        tux [0] /sbin/swapon -s
        Filename                                Type            Size    Used    Priority
        /dev/sda3                               partition       1052152 0       -1
        tux [0] /bin/cat /willitstay
        I wonder if this will stay
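
If you would rather script the "which root did we boot from" check than
eyeball it, something like the following could work. This is only a
sketch; the function name 'root_from_cmdline' is made up for
illustration, and it takes the command line as a string so it can be
tried without rebooting anything:

```shell
# Print the value of the root= argument from a kernel command line string.
# Returns non-zero if no root= argument is present.
root_from_cmdline() {
    for arg in $1 ; do
        case $arg in
            root=*) echo "${arg#root=}" ; return 0 ;;
        esac
    done
    return 1
}

# On a live system: root_from_cmdline "$(cat /proc/cmdline)"
root_from_cmdline "ro root=/dev/sda1"
```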

If we intend to keep our updates and all things are functioning as
expected, we can proceed with destroying the existing MD devices and
freeing 'sdb' from MD control:

        tux [0] /bin/cat /proc/mdstat
        Personalities : [raid1]
        md1 : active raid1 sdb1[1]
              7590592 blocks [2/1] [_U]

        md2 : active raid1 sdb2[1]
              2096384 blocks [2/1] [_U]

        md3 : active raid1 sdb3[1]
              1052160 blocks [2/1] [_U]

        unused devices: <none>
        tux [0] /sbin/mdadm --stop /dev/md1
        mdadm: stopped /dev/md1
        tux [0] /sbin/mdadm --stop /dev/md2
        mdadm: stopped /dev/md2
        tux [0] /sbin/mdadm --stop /dev/md3
        mdadm: stopped /dev/md3
        tux [0] /sbin/mdadm --remove /dev/md1
        tux [0] /sbin/mdadm --remove /dev/md2
        tux [0] /sbin/mdadm --remove /dev/md3
        tux [0] /sbin/mdadm --zero-superblock /dev/sdb1
        tux [0] /sbin/mdadm --zero-superblock /dev/sdb2
        tux [0] /sbin/mdadm --zero-superblock /dev/sdb3

Again, we need to remove the MD superblock so MD doesn't try to recover
the mirrors. After checking that there are no more root MD devices or
mirrors, use 'fsck' to check the 'sdb' FS:

        tux [0] /bin/cat /proc/mdstat
        Personalities : [raid1]
        unused devices: <none>
        tux [0] for i in sdb1 sdb2 ; do echo "** ${i} **" ; /sbin/fsck -y /dev/${i} ; done
        ** sdb1 **
        fsck 1.39 (29-May-2006)
        e2fsck 1.39 (29-May-2006)
        rootfs: clean, 71374/950272 files, 562968/1897648 blocks
        ** sdb2 **
        fsck 1.39 (29-May-2006)
        e2fsck 1.39 (29-May-2006)
        varfs: clean, 976/262144 files, 26514/524096 blocks

At this point, we need to make some temporary updates to 'sdb1'
before we reboot. This is to allow us to recreate the MD mirrors
using 'sda', thus retaining our updates. Mount 'sdb1' to '/a', back up
'/a/boot/grub/grub.conf' and update it, setting the "default" to our
non-RAID disk 2 (sdb, hd1) entry. Also update the "kernel" line of that
entry (title index 1, "CentOS d2") to mount the root FS from '/dev/sdb1'.
After this, update '/a/etc/fstab' to mount the 'sdb' partitions:

        tux [0] [ ! -d /a ] && /bin/mkdir /a
        tux [1] /bin/mount /dev/sdb1 /a
        tux [0] /bin/cp /a/boot/grub/grub.conf /a/boot/grub/md.grub.conf
        tux [0] /bin/sed -e 's;default=3;default=1;' -e '/CentOS d2/,/RAID1 d1/ s;root=LABEL=rootfs;root=/dev/sdb1;' /a/boot/grub/md.grub.conf > /a/boot/grub/grub.conf
        tux [0] /bin/sed -e 's;sda;sdb;g' /etc/fstab > /a/etc/fstab
        tux [0] /bin/ls /a/willitstay
        /bin/ls: /a/willitstay: No such file or directory
        tux [2] /bin/umount /a
        tux [0] /sbin/fsck -y /dev/sdb1
        fsck 1.39 (29-May-2006)
        e2fsck 1.39 (29-May-2006)
        rootfs: clean, 71375/950272 files, 562969/1897648 blocks
        tux [0] /sbin/reboot

After our file updates and verification that 'sdb1' doesn't contain the
updates on 'sda1', we unmounted 'sdb1' from '/a' and ran another 'fsck'
against it, followed by a reboot. Once the host resets, after the BIOS
POST, boot from disk 2 (sdb, hd1), and edit the boot option to boot into
single user:

         GNU GRUB  version 0.97  (639K lower / 1047488K upper memory)

        CentOS d1 (2.6.18-164.el5) 
        CentOS d2 (2.6.18-164.el5)                <==================
        CentOS RAID1 d1 (2.6.18-164.el5)
        CentOS RAID1 d2 (2.6.18-164.el5)


           Use the <up> and <down> keys to select which entry is highlighted.
           Press enter to boot the selected OS, 'e' to edit the
           commands before booting, 'a' to modify the kernel arguments
           before booting, or 'c' for a command-line.

        The highlighted entry will be booted automatically in 5 seconds.

        <snip...>

         GNU GRUB  version 0.97  (639K lower / 1047488K upper memory)

        root (hd1,0)
        kernel /boot/vmlinuz-2.6.18-164.el5 ro root=/dev/sdb1
        initrd /boot/initrd-2.6.18-164.el5.img


           Use the <up> and <down> keys to select which entry is highlighted.
           Press 'b' to boot, 'e' to edit the selected command in the
           boot sequence, 'c' for a command-line, 'o' to open a new line
           after ('O' for before) the selected line, 'd' to remove the
           selected line, or escape to go back to the main menu.

    update from this:

        kernel /boot/vmlinuz-2.6.18-164.el5 ro root=/dev/sdb1

    to this:

        kernel /boot/vmlinuz-2.6.18-164.el5 ro root=/dev/sdb1 -s

    If the kernel shows:

        kernel /boot/vmlinuz-2.6.18-164.el5 ro root=LABEL=rootfs

    update it to:

        kernel /boot/vmlinuz-2.6.18-164.el5 ro root=/dev/sdb1 -s

With our "kernel" entry updated, press enter, followed by 'b' to boot:

          Booting command-list

        root (hd1,0)
         Filesystem type is ext2fs, partition type 0xfd
        kernel /boot/vmlinuz-2.6.18-164.el5 ro root=/dev/sdb1 -s
           [Linux-bzImage, setup=0x1d00, size=0x1c31b4]
        initrd /boot/initrd-2.6.18-164.el5.img
           [Linux-initrd @ 0x37d73000, 0x27c402 bytes]

        <snip...>
        Setting hostname tux:
        mdadm: no devices found for /dev/md3
        mdadm: no devices found for /dev/md2
        mdadm: no devices found for /dev/md1
        No devices found
        <snip...>
        Enabling /etc/fstab swaps:

We get the mdadm messages above because we left /etc/mdadm.conf in
place, containing the original MD configuration. It's not a problem.
After we are in "single user", we do a quick check to ensure that we
are only using the 'sdb' devices:

        sh-3.2# /bin/df -h
        Filesystem            Size  Used Avail Use% Mounted on
        /dev/sdb1             7.2G  2.1G  4.8G  31% /
        /dev/sdb2             2.0G   72M  1.8G   4% /var
        tmpfs                 506M     0  506M   0% /dev/shm
        sh-3.2# /sbin/swapon -s
        Filename                                Type            Size    Used   Priority
        /dev/sdb3                               partition       1052152 0      -1

We can now recreate our mirrors on 'sda' using 'mdadm':

        sh-3.2# /sbin/mdadm -C /dev/md1 --level=raid1 --raid-device=2 /dev/sda1 missing
        mdadm: /dev/sda1 appears to contain an ext2fs file system
            size=7590592K mtime=Fri Jan 28 22:38:50 2011
        Continue creating array? y
        mdadm: array /dev/md1 started.
        sh-3.2#

When prompted to continue, enter 'y'. The issue above can be safely
ignored (see note 2). In creating 'md2', we see the same situation which
can be handled in the same manner. After creating 'md3', run 'fsck' on
'md1' and 'md2' to validate our FS on the 'sda' devices:

        sh-3.2# /sbin/mdadm -C /dev/md2 --level=raid1 --raid-device=2 /dev/sda2 missing
        mdadm: /dev/sda2 appears to contain an ext2fs file system
            size=2096384K mtime=Fri Jan 28 22:38:58 2011
        Continue creating array? y
        mdadm: array /dev/md2 started.
        sh-3.2# /sbin/mdadm -C /dev/md3 --level=raid1 --raid-device=2 /dev/sda3 missing
        mdadm: array /dev/md3 started.
        sh-3.2# for i in md1 md2 ; do echo "** ${i} **" ; /sbin/fsck -y /dev/${i} ; done
        ** md1 **
        fsck 1.39 (29-May-2006)
        e2fsck 1.39 (29-May-2006)
        rootfs: clean, 71377/950272 files, 562970/1897648 blocks
        ** md2 **
        fsck 1.39 (29-May-2006)
        e2fsck 1.39 (29-May-2006)
        varfs: clean, 977/262144 files, 26510/524096 blocks

Once again, we need to update 'grub.conf' and 'fstab', so mount 'md1' to
'/a' and update '/a/etc/fstab', setting 'md' devices for our partitions.
Update '/a/boot/grub/grub.conf', setting "RAID1 d1" to mount the root FS
from '/dev/md1':

        sh-3.2# [ ! -d /a ] && /bin/mkdir /a
        sh-3.2# /bin/mount /dev/md1 /a
        sh-3.2# /bin/cat /a/etc/fstab
        /dev/md1                /                       ext3    defaults        1 1
        /dev/md2                /var                    ext3    defaults        1 2
        tmpfs                   /dev/shm                tmpfs   defaults        0 0
        devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
        sysfs                   /sys                    sysfs   defaults        0 0
        proc                    /proc                   proc    defaults        0 0
        /dev/md3                swap                    swap    defaults        0 0
        sh-3.2# /bin/grep -v ^# /a/boot/grub/grub.conf
        default=2
        timeout=5
        title CentOS d1 (2.6.18-164.el5)
                root (hd0,0)
                kernel /boot/vmlinuz-2.6.18-164.el5 ro root=LABEL=rootfs
                initrd /boot/initrd-2.6.18-164.el5.img
        title CentOS d2 (2.6.18-164.el5)
                root (hd1,0)
                kernel /boot/vmlinuz-2.6.18-164.el5 ro root=LABEL=rootfs
                initrd /boot/initrd-2.6.18-164.el5.img
        title CentOS RAID1 d1 (2.6.18-164.el5)  
                root (hd0,0)
                kernel /boot/vmlinuz-2.6.18-164.el5 ro root=/dev/md1
                initrd /boot/initrd-2.6.18-164.el5-raid.img
        title CentOS RAID1 d2 (2.6.18-164.el5)
                root (hd1,0)
                kernel /boot/vmlinuz-2.6.18-164.el5 ro root=LABEL=rootfs
                initrd /boot/initrd-2.6.18-164.el5-raid.img

Since we've recreated our mirrors, the UUIDs in '/a/etc/mdadm.conf'
are no longer valid. Below we see our current 'mdadm.conf', update it,
and verify our updates with the new UUIDs in place:

        sh-3.2# /bin/cat /a/etc/mdadm.conf
        DEVICE /dev/sda* /dev/sdb*
        ARRAY /dev/md3 level=raid1 num-devices=2 metadata=0.90 UUID=5a13f19a:e513990a:dcb63c9e:8eb86163
        ARRAY /dev/md2 level=raid1 num-devices=2 metadata=0.90 UUID=2d89bff8:1cb58a4e:5ce76bbc:aee4d045
        ARRAY /dev/md1 level=raid1 num-devices=2 metadata=0.90 UUID=e06caa1e:8f51d4a3:0fec0e68:d0421884
        sh-3.2# echo "DEVICE /dev/sda* /dev/sdb*" > /a/etc/mdadm.conf
        sh-3.2# /sbin/mdadm --detail --scan >> /a/etc/mdadm.conf
        sh-3.2# /bin/cat /a/etc/mdadm.conf
        DEVICE /dev/sda* /dev/sdb*
        ARRAY /dev/md3 level=raid1 num-devices=2 metadata=0.90 UUID=f36de614:73bab81f:7e814783:d2309b7e
        ARRAY /dev/md2 level=raid1 num-devices=2 metadata=0.90 UUID=1f104883:17648c0d:e7b0b02a:ef57bbf2
        ARRAY /dev/md1 level=raid1 num-devices=2 metadata=0.90 UUID=db3f433d:a463cd31:044dc85c:b1fffc8c
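
To make the UUID change easy to spot, the ARRAY lines can be reduced to
"device UUID" pairs and compared. The following is a sketch under the
assumption that the ARRAY lines look like the ones shown above; the
helper name 'array_uuids' is hypothetical:

```shell
# Print "device uuid" pairs from mdadm.conf-style ARRAY lines on stdin.
array_uuids() {
    awk '$1 == "ARRAY" {
             for (i = 3; i <= NF; i++)
                 if ($i ~ /^UUID=/) { sub(/^UUID=/, "", $i); print $2, $i }
         }'
}

# Example with the old and new md1 lines from the listings above.
old='ARRAY /dev/md1 level=raid1 num-devices=2 metadata=0.90 UUID=e06caa1e:8f51d4a3:0fec0e68:d0421884'
new='ARRAY /dev/md1 level=raid1 num-devices=2 metadata=0.90 UUID=db3f433d:a463cd31:044dc85c:b1fffc8c'
old_uuid=$(printf '%s\n' "$old" | array_uuids)
new_uuid=$(printf '%s\n' "$new" | array_uuids)
if [ "$old_uuid" != "$new_uuid" ] ; then
    echo "md1 UUID changed: mdadm.conf must be regenerated"
fi
```

On a live host the same function could be fed from '/a/etc/fstab.md'
style backups of 'mdadm.conf' and from 'mdadm --detail --scan'.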

Unmount 'md1' and 'fsck' it. Follow up with a reboot and boot from disk 1
(sda, hd0). Specifically, we will have GRUB boot the RAID enabled entry,
CentOS RAID1 d1 (2.6.18-164.el5):

        sh-3.2# /bin/umount /dev/md1
        sh-3.2# /sbin/fsck -y /dev/md1
        fsck 1.39 (29-May-2006)
        e2fsck 1.39 (29-May-2006)
        rootfs: clean, 71375/950272 files, 562968/1897648 blocks
        sh-3.2# /usr/bin/reboot

        # boot disk 1

         GNU GRUB  version 0.97  (639K lower / 1047488K upper memory)

        CentOS d1 (2.6.18-164.el5)
        CentOS d2 (2.6.18-164.el5)
        CentOS RAID1 d1 (2.6.18-164.el5)          <==================
        CentOS RAID1 d2 (2.6.18-164.el5)


           Use the <up> and <down> keys to select which entry is highlighted.
           Press enter to boot the selected OS, 'e' to edit the
           commands before booting, 'a' to modify the kernel arguments
           before booting, or 'c' for a command-line.

        The highlighted entry will be booted automatically in 5 seconds.

        <snip...>

          Booting 'CentOS RAID1 d1 (2.6.18-164.el5)'

        root (hd0,0)
         Filesystem type is ext2fs, partition type 0xfd
        kernel /boot/vmlinuz-2.6.18-164.el5 ro root=/dev/md1
           [Linux-bzImage, setup=0x1e00, size=0x1c31b4]
        initrd /boot/initrd-2.6.18-164.el5-raid.img
           [Linux-initrd @ 0x37d71000, 0x27ee0e bytes]
        <snip...>

After we are back to runlevel 3, we check that we are using the newly
recreated mirrors and that our updates are still in place:

        tux [0] /bin/df -h
        Filesystem            Size  Used Avail Use% Mounted on
        /dev/md1              7.2G  2.1G  4.8G  31% /
        /dev/md2              2.0G   72M  1.8G   4% /var
        tmpfs                 506M     0  506M   0% /dev/shm
        tux [0] /sbin/swapon -s
        Filename                                Type            Size    Used    Priority
        /dev/md3                                partition       1052152 0       -1
        tux [0] /bin/cat /proc/mdstat
        Personalities : [raid1]
        md2 : active raid1 sda2[0]
              2096384 blocks [2/1] [U_]

        md3 : active raid1 sda3[0]
              1052160 blocks [2/1] [U_]

        md1 : active raid1 sda1[0]
              7590592 blocks [2/1] [U_]

        unused devices: <none>
        tux [0] /bin/cat /willitstay
        I wonder if this will stay

At this point, add the 'sdb' partitions back into the mirrors:

        tux [0] /sbin/mdadm --add /dev/md1 /dev/sdb1
        mdadm: added /dev/sdb1
        tux [0] /sbin/mdadm --add /dev/md2 /dev/sdb2
        mdadm: added /dev/sdb2
        tux [0] /sbin/mdadm --add /dev/md3 /dev/sdb3
        mdadm: added /dev/sdb3

You can check the status of the mirrors either by reading '/proc/mdstat'
or running 'mdadm' against each 'md' device:

        tux [0] /bin/cat /proc/mdstat
        Personalities : [raid1]
        md2 : active raid1 sdb2[2] sda2[0]
              2096384 blocks [2/1] [U_]
                resync=DELAYED

        md3 : active raid1 sdb3[2] sda3[0]
              1052160 blocks [2/1] [U_]
                resync=DELAYED 

        md1 : active raid1 sdb1[2] sda1[0]
              7590592 blocks [2/1] [U_]
              [==>..................]  recovery = 11.8% (896896/7590592) finish=6.1min speed=18225K/sec

        unused devices: <none>
        tux [0] /sbin/mdadm --detail /dev/md1
        /dev/md1: 
                Version : 0.90
          Creation Time : Fri Jan 28 23:17:33 2011
             Raid Level : raid1
             Array Size : 7590592 (7.24 GiB 7.77 GB)
          Used Dev Size : 7590592 (7.24 GiB 7.77 GB)
           Raid Devices : 2
          Total Devices : 2
        Preferred Minor : 1
            Persistence : Superblock is persistent

            Update Time : Sat Jan 29 00:05:16 2011
                  State : clean, degraded, recovering
         Active Devices : 1
        Working Devices : 2
         Failed Devices : 0
          Spare Devices : 1

         Rebuild Status : 19% complete

                   UUID : db3f433d:a463cd31:044dc85c:b1fffc8c
                 Events : 0.140

            Number   Major   Minor   RaidDevice State
               0       8        1        0      active sync   /dev/sda1
               2       8       17        1      spare rebuilding   /dev/sdb1
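
For keeping an eye on the rebuild without rereading the whole of
'/proc/mdstat', the progress figure can be pulled out with 'awk'. This
is a sketch only: the function name 'mdstat_progress' is made up, and it
assumes the mdstat layout shown above (array name at the start of a
line, percentage on the "recovery =" or "resync =" line):

```shell
# Print "array percent" for every array with a recovery/resync in progress,
# reading /proc/mdstat-format text on stdin. Arrays whose resync is only
# queued (resync=DELAYED) have no percentage and are not printed.
mdstat_progress() {
    awk '/^md[0-9]+ :/ { name = $1 }
         /(recovery|resync) *=/ {
             for (i = 1; i <= NF; i++)
                 if ($i ~ /%$/) print name, $i
         }'
}

# Sample taken from the mdstat output above.
sample='md3 : active raid1 sdb3[2] sda3[0]
      1052160 blocks [2/1] [U_]
        resync=DELAYED

md1 : active raid1 sdb1[2] sda1[0]
      7590592 blocks [2/1] [U_]
      [==>..................]  recovery = 11.8% (896896/7590592) finish=6.1min speed=18225K/sec'
printf '%s\n' "$sample" | mdstat_progress

# On a live system: mdstat_progress < /proc/mdstat
```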

Alright, final steps. If you like using "volume labels", update
'/etc/fstab' appropriately:

        tux [0] /bin/cat /etc/fstab
        LABEL=rootfs            /                       ext3    defaults        1 1
        LABEL=varfs             /var                    ext3    defaults        1 2
        tmpfs                   /dev/shm                tmpfs   defaults        0 0
        devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
        sysfs                   /sys                    sysfs   defaults        0 0
        proc                    /proc                   proc    defaults        0 0
        LABEL=SWAP-md3          swap                    swap    defaults        0 0

Also, update /boot/grub/grub.conf to use "volume labels" for the root FS:

        tux [0] grep -v ^# /boot/grub/grub.conf
        default=2
        timeout=5
        title CentOS d1 (2.6.18-164.el5)
                root (hd0,0)
                kernel /boot/vmlinuz-2.6.18-164.el5 ro root=LABEL=rootfs
                initrd /boot/initrd-2.6.18-164.el5.img
        title CentOS d2 (2.6.18-164.el5)
                root (hd1,0)
                kernel /boot/vmlinuz-2.6.18-164.el5 ro root=LABEL=rootfs
                initrd /boot/initrd-2.6.18-164.el5.img
        title CentOS RAID1 d1 (2.6.18-164.el5)
                root (hd0,0)
                kernel /boot/vmlinuz-2.6.18-164.el5 ro root=LABEL=rootfs
                initrd /boot/initrd-2.6.18-164.el5-raid.img
        title CentOS RAID1 d2 (2.6.18-164.el5)
                root (hd1,0)
                kernel /boot/vmlinuz-2.6.18-164.el5 ro root=LABEL=rootfs
                initrd /boot/initrd-2.6.18-164.el5-raid.img

After updating 'fstab' and 'grub.conf', we need to update the block device
cache file, '/etc/blkid/blkid.tab'. While most of it is accurate, after
the steps above it did not retain an entry for 'md1' (see note 3). The
simplest fix is to zero out the existing file and repopulate it with
'blkid':

        tux [0] > /etc/blkid/blkid.tab
        tux [0] /sbin/blkid
        /dev/sda1: LABEL="rootfs" UUID="50e1eeb8-d726-43ca-a918-c7102ae9aa7e" SEC_TYPE="ext2" TYPE="ext3"
        /dev/sda2: LABEL="varfs" UUID="590fbc90-e94b-4f4b-ad2d-b7b61798ff80" SEC_TYPE="ext2" TYPE="ext3"
        /dev/sda3: LABEL="SWAP-md3" TYPE="swap"
        /dev/sdb1: LABEL="rootfs" UUID="50e1eeb8-d726-43ca-a918-c7102ae9aa7e" SEC_TYPE="ext2" TYPE="ext3"
        /dev/sdb2: LABEL="varfs" UUID="590fbc90-e94b-4f4b-ad2d-b7b61798ff80" SEC_TYPE="ext2" TYPE="ext3"
        /dev/sdb3: LABEL="SWAP-md3" TYPE="swap"
        /dev/md1: LABEL="rootfs" UUID="50e1eeb8-d726-43ca-a918-c7102ae9aa7e" TYPE="ext3"
        /dev/md3: LABEL="SWAP-md3" TYPE="swap"
        /dev/md2: LABEL="varfs" UUID="590fbc90-e94b-4f4b-ad2d-b7b61798ff80" TYPE="ext3"
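
Since a missing 'md1' entry is what causes trouble at the next boot (see
note 3), it may be worth checking for it explicitly. A minimal sketch,
assuming 'blkid' output in the "device: KEY=value" form shown above; the
helper name 'blkid_has_dev' is hypothetical:

```shell
# Succeed if blkid-format output on stdin has an entry for the given device.
blkid_has_dev() {
    grep -q "^$1:"
}

# Two lines from the blkid output above, used as sample input.
sample='/dev/sda1: LABEL="rootfs" UUID="50e1eeb8-d726-43ca-a918-c7102ae9aa7e" SEC_TYPE="ext2" TYPE="ext3"
/dev/md1: LABEL="rootfs" UUID="50e1eeb8-d726-43ca-a918-c7102ae9aa7e" TYPE="ext3"'
if printf '%s\n' "$sample" | blkid_has_dev /dev/md1 ; then
    echo "md1 present in blkid output"
fi

# On a live system: /sbin/blkid | blkid_has_dev /dev/md1
```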

Once your mirrors have finished resyncing, everything is complete:

        tux [0] /bin/cat /proc/mdstat
        Personalities : [raid1]
        md2 : active raid1 sdb2[1] sda2[0]
              2096384 blocks [2/2] [UU]

        md3 : active raid1 sdb3[1] sda3[0]
              1052160 blocks [2/2] [UU]

        md1 : active raid1 sdb1[1] sda1[0]
              7590592 blocks [2/2] [UU]

        unused devices: <none>
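
The "[UU]" check can also be scripted. The sketch below lists any array
whose status field still contains an underscore, meaning a member is
missing. The function name 'mdstat_degraded' is an assumption for
illustration, and the parsing assumes the mdstat layout shown in this
article:

```shell
# Print the name of every array whose status field contains "_" (a missing
# member), reading /proc/mdstat-format text on stdin.
mdstat_degraded() {
    awk '/^md[0-9]+ :/ { name = $1 }
         /blocks/ && $NF ~ /^\[[U_]+\]$/ && $NF ~ /_/ { print name }'
}

# Samples based on the mdstat output shown earlier.
healthy='md1 : active raid1 sdb1[1] sda1[0]
      7590592 blocks [2/2] [UU]'
degraded='md1 : active raid1 sdb1[1]
      7590592 blocks [2/1] [_U]'
printf '%s\n' "$healthy"  | mdstat_degraded
printf '%s\n' "$degraded" | mdstat_degraded

# On a live system: mdstat_degraded < /proc/mdstat
```

No output for a given input means every array listed there has all of
its members.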

NOTES

Note 1: If our updates are problematic, simply reboot the host, tell
    the BIOS to boot from disk 2, and have GRUB load the RAID aware entry
    (RAID1 d2) of disk 2 (sdb, hd1). Once the host is back online,
    run the following to resync the original mirrors to their pre-update
    state:

        tux [0] /sbin/mdadm --add /dev/md1 /dev/sda1
        tux [0] /sbin/mdadm --add /dev/md2 /dev/sda2
        tux [0] /sbin/mdadm --add /dev/md3 /dev/sda3

Note 2: The issue shown isn't really an issue. For us to use MD devices,
    we must leave about 128 KB of space at the end of each partition for
    the MD RAID superblock. Here, mdadm is simply noticing that there is
    already an FS on the slice in question. Since the 'sda' devices
    were previously part of an MD array and the filesystems were created
    on the prior 'md' devices, each 'sda' device already has 128 KB left
    at the end, so there should be no concerns of wrecking your existing
    volumes.
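
The numbers in the 'fsck' and 'mdstat' output earlier bear this out.
'fsck' reports 1897648 blocks for rootfs and 524096 for varfs, while
'/proc/mdstat' reports 7590592 and 2096384 1K-blocks for md1 and md2.
Assuming a 4 KiB filesystem block size (an assumption here, though it
agrees with the "size=7590592K" that mdadm printed for 'sda1'), each
filesystem is exactly the size of its array and so stops short of the
superblock area. A small sketch of that arithmetic, with a hypothetical
helper name:

```shell
# Compare filesystem size with MD array size, both in KiB.
# fs_blocks comes from the fsck output, array_kib from /proc/mdstat;
# the filesystem block size in KiB is passed in (4 is assumed here).
fs_fits_array() {
    fs_blocks=$1 ; block_kib=$2 ; array_kib=$3
    fs_kib=$((fs_blocks * block_kib))
    echo "fs=${fs_kib}K array=${array_kib}K"
    [ "$fs_kib" -le "$array_kib" ]
}

fs_fits_array 1897648 4 7590592    # rootfs on md1
fs_fits_array 524096 4 2096384     # varfs on md2
```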

Note 3: If '/etc/blkid/blkid.tab' doesn't have an entry for 'md1', at the
    next reboot the system will drop you to "single user" with either 'sda1'
    or 'sdb1' mounted read only on '/'. This is because we had '/etc/fstab'
    mounting '/' from 'md1'. Updating 'blkid.tab' resolves this issue.
    If 'blkid.tab' is empty, running 'blkid' will both output to STDOUT and
    populate 'blkid.tab'.

see also:
    Creating an MD Root Mirror in Linux
    Disk Cloning in Linux
    Breaking and Syncing an SVM Root Mirror

29 January 2011
