Public bug reported:

After upgrading from 11.04 to 12.04 in two steps, my server failed to
boot, printing:

"Could not start the RAID in degraded mode." (referring to /dev/md/3),
and then dropping to an initramfs shell.

My RAID setup is the following:

# cat /proc/mdstat 
Personalities : [raid6] [raid5] [raid4] [raid1] [linear] [multipath] [raid0] [raid10] 
md3 : active raid0 dm-2[0] sdc2[2] sdb2[1] sdd2[3]
      82075648 blocks super 1.2 1024k chunks
      
md0 : active raid1 sdf1[1] sde1[0]
      530048 blocks [2/2] [UU]
      
md4 : active raid5 sdf3[1] sdh3[4] sdg3[2] sde3[0]
      5856021120 blocks super 1.2 level 5, 128k chunk, algorithm 2 [4/4] [UUUU]
      
md2 : active raid6 sdh2[3] sdf2[1] sdg2[2] sde2[0]
      1950720 blocks super 1.2 level 6, 512k chunk, algorithm 2 [4/4] [UUUU]
      
md1 : active raid5 sda1[0] sdc1[2] sdb1[1] sdd1[3]
      11712000 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]

The following are my mount points:

# mount
/dev/mapper/md1_crypt on / type ext4 (rw,noatime,errors=remount-ro)
/dev/md0 on /boot type ext4 (rw)
/dev/md3 on /something/else/irrelevant type xfs (rw,discard,discard)

# grep -e md -e mapper -e boot /etc/fstab
/dev/mapper/md1_crypt /               ext4    noatime,errors=remount-ro 0       1
UUID=9b199b09-078a-4f88-82dc-a099be4c6a09 /boot           ext4    defaults        0       2
/dev/mapper/md2_crypt none            swap    sw              0       0
/dev/mapper/md4_crypt /mnt ext4 noatime,defaults 0 0
/dev/md3 /something/else/irrelevant xfs defaults,discard 0 0

Current crypttab setup:

# cat /etc/crypttab 
md1_crypt /dev/md1 none luks,discard
md2_crypt /dev/md2 /dev/urandom cipher=aes-cbc-essiv:sha256,size=256,swap
md3_crypt /dev/md3 /some/key/file cipher=aes-cbc-essiv:sha256,size=256
sda2_crypt /dev/sda2 /some/key/file cipher=aes-cbc-essiv:sha256,size=256,discard
md4_crypt /dev/md4 /some/key/file cipher=aes-cbc-essiv:sha256,size=256

The first thing to note is that /dev/md3 should not be relevant for
booting the system at all. It is not the rootfs, it is not the swap, and
it is not /boot — and those are all I need to get the system up and
running.

The second contributing factor is that /dev/md3 is a RAID0 (zero-drive
fault tolerance) that includes a device set up via crypttab
(sda2_crypt). So one slice of /dev/md3 is encrypted.
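That the dm-2 member of md3 really is a device-mapper (crypttab) device can be confirmed through sysfs. A minimal sketch of such a lookup follows; the helper name is hypothetical, and the sysfs root is parameterised only so it can be exercised against a fake tree (on a live system you would pass /sys):

```shell
#!/bin/sh
# Hypothetical helper: resolve a dm-N member of an md array back to its
# device-mapper name (e.g. dm-2 -> sda2_crypt) by reading the sysfs
# attribute /sys/block/dm-N/dm/name.
#   $1 = sysfs root (normally /sys)
#   $2 = dm device name, e.g. dm-2
dm_name() {
    cat "$1/block/$2/dm/name"
}
```

On this box, `dm_name /sys dm-2` should print sda2_crypt.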

During boot, this is what happens:

1) The system mounts the initrd (which carries local derivatives of
fstab, mdadm.conf and crypttab) and tries to determine what to do. It
determines that the system has an encrypted rootfs and correctly
prompts for the password.

2) /dev/mapper/md1_crypt is unlocked from /dev/md1. /dev/md1 is
assembled at this point, and operational.

3) The system moves on, trying to determine how to assemble the rest of
the RAIDs. It reads mdadm.conf (the problem persists even if I remove
this file, although then my md3 is named md127). It finds definitions
of md0, md1, md2, md3 & md4, and runs the code from
/usr/share/initramfs-tools/hooks/mdadm.

4) /usr/share/initramfs-tools/hooks/mdadm runs before the rest of the
encrypted devices are set up, which makes some sense, as encrypted
devices may themselves sit on a RAID. However, md3 consists of chunks
from raw block devices plus one device derived from crypttab. The hook
utilizes /usr/share/initramfs-tools/scripts/mdadm-functions.

md3 : active raid0 dm-2[0] sdc2[2] sdb2[1] sdd2[3]
      82075648 blocks super 1.2 1024k chunks

Notice the first device.

5) Even with BOOT_DEGRADED=true set in /etc/default/mdadm, the check
still refuses to proceed for the degraded RAID0 — presumably because a
RAID0 with a missing member is marked as faulty rather than degraded,
so the degraded-boot path never applies?

6) The system halts. Throws me into the initramfs-shell.
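For reference, the stock degraded_arrays() check in mdadm-functions boils down to inverting the exit status of `mdadm --misc --scan --detail --test`, which exits non-zero as soon as any array is not clean — so a RAID0 missing a member trips it just like a genuinely degraded (but still runnable) redundant array would. A minimal sketch of that inversion, with the mdadm call replaced by a stand-in so the logic can be run anywhere:

```shell
#!/bin/sh
# fake_mdadm stands in for `mdadm --misc --scan --detail --test`,
# which exits 0 only when every array is clean.
fake_mdadm() { return "$MDADM_STATUS"; }

degraded_arrays() {
    fake_mdadm >/dev/null 2>&1
    return $((! $?))    # invert: exit 0 now means "something is not clean"
}

MDADM_STATUS=0; degraded_arrays && echo "degraded" || echo "all clean"  # prints "all clean"
MDADM_STATUS=1; degraded_arrays && echo "degraded" || echo "all clean"  # prints "degraded"
```

The inversion means any non-zero mdadm exit status — degraded, faulty, or otherwise — lands on the same "degraded" code path.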


I got the system to boot successfully by "hacking" the mdadm-functions
file:


--- usr/share/initramfs-tools/scripts/mdadm-functions   2012-02-10 04:04:54.000000000 +0100
+++ /usr/share/initramfs-tools/scripts/mdadm-functions  2012-10-02 23:55:08.246402544 +0200
@@ -3,8 +3,9 @@
 
 degraded_arrays()
 {
-	mdadm --misc --scan --detail --test >/dev/null 2>&1
-	return $((! $?))
+#	mdadm --misc --scan --detail --test >/dev/null 2>&1
+	return 0
+#	return $((! $?))
 }
 
 mountroot_fail()
@@ -83,10 +84,11 @@
 					echo "Started the RAID in degraded mode."
 					return 0
 				else
+					mdadm --stop /dev/md3
 					echo "Could not start the RAID in degraded mode."
 				fi
 			fi
 		fi
 	fi
-	return 1
+	return 0
 }


So basically I force mdadm-functions to always return 0 and never check
for degraded arrays. In addition, I make it stop the faultily assembled
/dev/md3, which will be re-assembled after the initramfs stage completes
anyway.
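A less invasive variant of this hack (hypothetical — not tested in an actual initramfs) might exempt only arrays that have no redundancy in the first place, since "degraded" is meaningless for raid0/linear personalities. Picking those out of a /proc/mdstat-style dump could look like this (the path is parameterised so the helper can be tried against a saved dump):

```shell
#!/bin/sh
# Hypothetical helper: list arrays from a /proc/mdstat-style file whose
# personality has no redundancy (raid0, linear) — i.e. arrays for which
# BOOT_DEGRADED can never meaningfully apply.
#   $1 = path to an mdstat dump (normally /proc/mdstat)
non_redundant_arrays() {
    awk '$1 ~ /^md[0-9]/ && $2 == ":" && ($4 == "raid0" || $4 == "linear") { print $1 }' "$1"
}
```

On my box this would print md3 — exactly the array the manual `mdadm --stop /dev/md3` in the hack targets.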

This setup was working in 11.04.

Lucky me having a remote serial console to actually solve it... :)

The problem should be reproducible on any 12.04 system with a setup
like this.

** Affects: mdadm (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: encryption mdadm raid

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1062159

Title:
  Raid is incorrectly determined as DEGRADED preventing boot in 12.04

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/1062159/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs