Public bug reported:

Some software RAID arrays fail to start on boot.  On every single boot,
exactly two of my arrays (but not always the same two!) do not start,
and I have done 24 boots since I started taking detailed notes.

I have been running Ubuntu 12.04 with the latest updates.  Two days ago I
selectively upgraded mdadm to 3.2.5 from -proposed, as suggested in bug
#942106;  that upgrade helped some other people, but not me.  Over the
last few months, various kernel and mdadm updates have greatly improved
the symptoms, but there is no complete cure so far.

Note that the following symptoms once regularly occurred on this system, but
have NOT occurred in the past few weeks:
  - Having to wait for a degraded array to resync
  - Having to manually re-attach a component (usually a spare) that had
    become detached
  - Having to drop to the command line to zero a superblock before
    reattaching a component
  - Having an array containing swap fail to start
  - Having to use anything other than Disk Utility to get arrays running
    properly again


This system has six SATA drives on two controllers.  It contains seven RAID 
arrays, including RAID 1, RAID 10, and RAID 6;  all are listed in fstab.  Some 
use 0.90.0 metadata and some use 1.2 metadata.  The root filesystem is not on a 
RAID array (at least not any more;  I got tired of that REAL fast) but 
everything else (including /boot and all swap) is on RAID.  One array is used 
for /boot, two for swap, and the other four are just there for testing purposes.

BOOT_DEGRADED is set.  All partitions are GPT.  Not using LUKS or LVM.
All drives are 2TB, from various manufacturers, and I suspect some have
512B physical sectors and some have 4KB sectors.  This is an AMD64
system with 8GB RAM.
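
The sector-size suspicion above could be checked directly rather than guessed.  A minimal sketch, assuming the kernel exposes the usual sysfs attributes queue/logical_block_size and queue/physical_block_size under /sys/block/<dev>;  the helper name sector_sizes and its sysfs_root parameter are illustrative only and not part of mdadm or any other tool:

```python
import os

def sector_sizes(sysfs_root="/sys/block", prefix="sd"):
    """Return {device: (logical_bytes, physical_bytes)} for matching block devices."""
    sizes = {}
    if not os.path.isdir(sysfs_root):
        return sizes
    for dev in sorted(os.listdir(sysfs_root)):
        if not dev.startswith(prefix):
            continue
        queue = os.path.join(sysfs_root, dev, "queue")
        try:
            with open(os.path.join(queue, "logical_block_size")) as f:
                logical = int(f.read().strip())
            with open(os.path.join(queue, "physical_block_size")) as f:
                physical = int(f.read().strip())
        except (OSError, ValueError):
            continue  # attribute missing or unreadable; skip this device
        sizes[dev] = (logical, physical)
    return sizes

if __name__ == "__main__":
    for dev, (logical, physical) in sector_sizes().items():
        print(f"{dev}: logical={logical} physical={physical}")
```

Drives that report a physical size larger than the logical size would be the ones worth checking for partition alignment.  Whether sector size has anything to do with the arrays failing to start is not established here.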


This system has had about four different versions of Ubuntu on it over the last 
few years, and has had multiple RAID arrays on it from the beginning.  (This is 
why some of the arrays are still using 0.90.0 metadata, and why there are so 
many arrays;  some arrays are old partitions containing root and home and such 
from earlier incarnations.)  RAID worked fine until the system was upgraded to 
Oneiric early in 2012 (no, the problem did not start with Precise).

I have carefully tested the system every time an updated kernel or mdadm
has appeared, ever since the problem started.  The behavior has
gradually improved over the last several months.  This latest proposed
version of mdadm (3.2.5), thankfully, did not result in regressions, but
also did not result in significant improvement on this system;  I have
rebooted five times since then and the behavior is consistent.


When the problem first started, on Oneiric, I had the root file system on
RAID.  This was unpleasant.  I stopped using the system for a while, as I
had another one running Maverick, which was reliable.

When I noticed some discussion of possibly related bugs on the Linux
RAID list (I've been lurking there for years) I decided to test the
system some more.  By then Precise was out, so I upgraded.  That did not
help.  Eventually I backed up all data onto another system and did a
clean install of Precise on a non-RAID partition, which made the system
tolerable.  I left /boot on a RAID1 array (on all six drives), but the
system still boots even when the /boot array does not start during
Ubuntu startup (I assume because GRUB can find /boot even if Ubuntu
later can't).

I started taking detailed notes in May (seven cramped pages so far).
I have rebooted 24 times since then.  On every boot, exactly two arrays
did not start.  Which arrays they were varied from boot to boot;  it
could be any of the arrays (but recently, swap arrays are not affected).
There is no apparent correlation with metadata type or RAID level.

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: mdadm 3.2.5-1ubuntu0.2
ProcVersionSignature: Ubuntu 3.2.0-29.46-generic 3.2.24
Uname: Linux 3.2.0-29-generic x86_64
ApportVersion: 2.0.1-0ubuntu12
Architecture: amd64
Date: Mon Aug 13 12:10:36 2012
InstallationMedia: Ubuntu 12.04 LTS "Precise Pangolin" - Release amd64 
(20120425)
MDadmExamine.dev.sda:
 /dev/sda:
    MBR Magic : aa55
 Partition[0] :   3907029167 sectors at            1 (type ee)
MDadmExamine.dev.sda1: Error: command ['/sbin/mdadm', '-E', '/dev/sda1'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sda1.
MDadmExamine.dev.sda11: Error: command ['/sbin/mdadm', '-E', '/dev/sda11'] 
failed with exit code 1: mdadm: No md superblock detected on /dev/sda11.
MDadmExamine.dev.sda4: Error: command ['/sbin/mdadm', '-E', '/dev/sda4'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sda4.
MDadmExamine.dev.sda5: Error: command ['/sbin/mdadm', '-E', '/dev/sda5'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sda5.
MDadmExamine.dev.sda6: Error: command ['/sbin/mdadm', '-E', '/dev/sda6'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sda6.
MDadmExamine.dev.sda7: Error: command ['/sbin/mdadm', '-E', '/dev/sda7'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sda7.
MDadmExamine.dev.sdb:
 /dev/sdb:
    MBR Magic : aa55
 Partition[0] :   3907029167 sectors at            1 (type ee)
MDadmExamine.dev.sdb1: Error: command ['/sbin/mdadm', '-E', '/dev/sdb1'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdb1.
MDadmExamine.dev.sdb11: Error: command ['/sbin/mdadm', '-E', '/dev/sdb11'] 
failed with exit code 1: mdadm: No md superblock detected on /dev/sdb11.
MDadmExamine.dev.sdb4: Error: command ['/sbin/mdadm', '-E', '/dev/sdb4'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdb4.
MDadmExamine.dev.sdb5: Error: command ['/sbin/mdadm', '-E', '/dev/sdb5'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdb5.
MDadmExamine.dev.sdb6: Error: command ['/sbin/mdadm', '-E', '/dev/sdb6'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdb6.
MDadmExamine.dev.sdb7: Error: command ['/sbin/mdadm', '-E', '/dev/sdb7'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdb7.
MDadmExamine.dev.sdc:
 /dev/sdc:
    MBR Magic : aa55
 Partition[0] :   3907029167 sectors at            1 (type ee)
MDadmExamine.dev.sdc1: Error: command ['/sbin/mdadm', '-E', '/dev/sdc1'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdc1.
MDadmExamine.dev.sdc4: Error: command ['/sbin/mdadm', '-E', '/dev/sdc4'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdc4.
MDadmExamine.dev.sdc5: Error: command ['/sbin/mdadm', '-E', '/dev/sdc5'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdc5.
MDadmExamine.dev.sdc6: Error: command ['/sbin/mdadm', '-E', '/dev/sdc6'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdc6.
MDadmExamine.dev.sdc7: Error: command ['/sbin/mdadm', '-E', '/dev/sdc7'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdc7.
MDadmExamine.dev.sdd:
 /dev/sdd:
    MBR Magic : aa55
 Partition[0] :   3907029167 sectors at            1 (type ee)
MDadmExamine.dev.sdd1: Error: command ['/sbin/mdadm', '-E', '/dev/sdd1'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdd1.
MDadmExamine.dev.sdd4: Error: command ['/sbin/mdadm', '-E', '/dev/sdd4'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdd4.
MDadmExamine.dev.sdd5: Error: command ['/sbin/mdadm', '-E', '/dev/sdd5'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdd5.
MDadmExamine.dev.sdd6: Error: command ['/sbin/mdadm', '-E', '/dev/sdd6'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdd6.
MDadmExamine.dev.sdd7: Error: command ['/sbin/mdadm', '-E', '/dev/sdd7'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdd7.
MDadmExamine.dev.sde: Error: command ['/sbin/mdadm', '-E', '/dev/sde'] failed 
with exit code 1: mdadm: cannot open /dev/sde: No medium found
MDadmExamine.dev.sdf:
 /dev/sdf:
    MBR Magic : aa55
 Partition[0] :   3907029167 sectors at            1 (type ee)
MDadmExamine.dev.sdf1: Error: command ['/sbin/mdadm', '-E', '/dev/sdf1'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdf1.
MDadmExamine.dev.sdf11: Error: command ['/sbin/mdadm', '-E', '/dev/sdf11'] 
failed with exit code 1: mdadm: No md superblock detected on /dev/sdf11.
MDadmExamine.dev.sdf4: Error: command ['/sbin/mdadm', '-E', '/dev/sdf4'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdf4.
MDadmExamine.dev.sdf5: Error: command ['/sbin/mdadm', '-E', '/dev/sdf5'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdf5.
MDadmExamine.dev.sdf6: Error: command ['/sbin/mdadm', '-E', '/dev/sdf6'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdf6.
MDadmExamine.dev.sdf7: Error: command ['/sbin/mdadm', '-E', '/dev/sdf7'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdf7.
MDadmExamine.dev.sdg:
 /dev/sdg:
    MBR Magic : aa55
 Partition[0] :   3907029167 sectors at            1 (type ee)
MDadmExamine.dev.sdg1: Error: command ['/sbin/mdadm', '-E', '/dev/sdg1'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdg1.
MDadmExamine.dev.sdg11: Error: command ['/sbin/mdadm', '-E', '/dev/sdg11'] 
failed with exit code 1: mdadm: No md superblock detected on /dev/sdg11.
MDadmExamine.dev.sdg4: Error: command ['/sbin/mdadm', '-E', '/dev/sdg4'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdg4.
MDadmExamine.dev.sdg5: Error: command ['/sbin/mdadm', '-E', '/dev/sdg5'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdg5.
MDadmExamine.dev.sdg6: Error: command ['/sbin/mdadm', '-E', '/dev/sdg6'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdg6.
MDadmExamine.dev.sdg7: Error: command ['/sbin/mdadm', '-E', '/dev/sdg7'] failed 
with exit code 1: mdadm: No md superblock detected on /dev/sdg7.
MachineType: System manufacturer System Product Name
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.2.0-29-generic 
root=UUID=9035fd0f-c11b-405f-82b1-875ecf527582 ro quiet splash vt.handoff=7
SourcePackage: mdadm
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 10/08/2010
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 2701
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: M3A78-EM
dmi.board.vendor: ASUSTeK Computer INC.
dmi.board.version: Rev X.0x
dmi.chassis.asset.tag: Asset-1234567890
dmi.chassis.type: 3
dmi.chassis.vendor: Chassis Manufacture
dmi.chassis.version: Chassis Version
dmi.modalias: 
dmi:bvnAmericanMegatrendsInc.:bvr2701:bd10/08/2010:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKComputerINC.:rnM3A78-EM:rvrRevX.0x:cvnChassisManufacture:ct3:cvrChassisVersion:
dmi.product.name: System Product Name
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer

** Affects: mdadm (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: amd64 apport-bug precise

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1036366

Title:
  software RAID arrays fail to start on boot

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/1036366/+subscriptions

