Package: mdadm
Version: 1.8.1-1
Severity: grave
Justification: causes data loss

Hi,

I created an md device, md0, from three block devices (hdf1, hdg1, and
hdh1), set up as raid5.  Today, hdg1 disappeared from the machine.
Unfortunately, mdadm 1.8.1 does not appear to be able to assemble md
devices in degraded mode:

[EMAIL PROTECTED]:/dev$ sudo mdadm -A /dev/md0 --scan
mdadm: failed to RUN_ARRAY /dev/md0: Invalid argument
In the kernel logs:
md: md driver 0.90.1 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: md0 stopped.
md: bind<hdh1>
md: bind<hdf1>
md: md0: raid array is not clean -- starting background reconstruction
raid5: measuring checksumming speed
   8regs     :  1352.000 MB/sec
   8regs_prefetch:  1132.000 MB/sec
   32regs    :   924.000 MB/sec
   32regs_prefetch:   916.000 MB/sec
  pII_mmx   :  2388.000 MB/sec
   p5_mmx    :  3204.000 MB/sec
raid5: using function: p5_mmx (3204.000 MB/sec)
md: raid5 personality registered as nr 4
raid5: device hdf1 operational as raid disk 0
raid5: device hdh1 operational as raid disk 2
raid5: cannot start dirty degraded array for md0
RAID5 conf printout:
 --- rd:3 wd:2 fd:1
 disk 0, o:1, dev:hdf1
 disk 2, o:1, dev:hdh1
raid5: failed to run raid set md0
md: pers->run() failed ...
[EMAIL PROTECTED]:/dev$ sudo mdadm -E /dev/hdh1
/dev/hdh1:
          Magic : a92b4efc
        Version : 00.90.00
           UUID : 47c630a9:142ec28b:a28fb6b2:8fc0ce54
  Creation Time : Mon Feb 24 05:17:28 2003
     Raid Level : raid5
    Device Size : 156288256 (149.05 GiB 160.04 GB)
   Raid Devices : 3
  Total Devices : 2
Preferred Minor : 0
    Update Time : Tue Jan 25 20:46:04 2005
          State : dirty
 Active Devices : 2
Working Devices : 2
 Failed Devices : 1
  Spare Devices : 0
       Checksum : c12cdab5 - correct
         Events : 0.1697073
         Layout : left-symmetric
     Chunk Size : 32K
      Number   Major   Minor   RaidDevice State
this     2      34       65        2      active sync   /dev/ide/host2/bus1/target1/lun0/part1
   0     0      33       65        0      active sync   /dev/ide/host2/bus0/target1/lun0/part1
   1     1       0        0        1      faulty removed
   2     2      34       65        2      active sync   /dev/ide/host2/bus1/target1/lun0/part1


Google informed me that a dirty, degraded array will not be started
unless assembly is forced, and that I should try the following:

[EMAIL PROTECTED]:/dev$ sudo mdadm -S /dev/md0
[EMAIL PROTECTED]:/dev$ sudo mdadm -Af /dev/md0 /dev/hdf1 /dev/hdh1
mdadm: failed to RUN_ARRAY /dev/md0: Invalid argument
[EMAIL PROTECTED]:/dev$ sudo mdadm -S /dev/md0
[EMAIL PROTECTED]:/dev$ sudo mdadm -Af --scan /dev/md0 /dev/hdf1 /dev/hdh1
mdadm: failed to RUN_ARRAY /dev/md0: Invalid argument
mdadm: /dev/hdf1 not identified in config file.
mdadm: /dev/hdh1 not identified in config file.
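
(As an aside, I assume the "not identified in config file" warnings
appear because --scan consults /etc/mdadm/mdadm.conf.  A minimal entry
for this array, using the UUID reported by -E above, would presumably
look something like:

DEVICE /dev/hdf1 /dev/hdg1 /dev/hdh1
ARRAY /dev/md0 level=raid5 num-devices=3 UUID=47c630a9:142ec28b:a28fb6b2:8fc0ce54

but that should not matter for the explicit-device invocation, which
fails with the same RUN_ARRAY error.)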

I then downgraded to mdadm 1.7.0-1, and:

[EMAIL PROTECTED]:~$ sudo mdadm -S /dev/md0 
[EMAIL PROTECTED]:~$  sudo mdadm -Af /dev/md0 /dev/hdf1 /dev/hdh1
mdadm: /dev/md0 has been started with 2 drives (out of 3).
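
With the array running degraded, my understanding is that the missing
disk can be re-added once it reappears, which should kick off a resync;
something along the lines of (assuming hdg1 comes back under the same
name):

  sudo mdadm /dev/md0 --add /dev/hdg1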


According to Neil Brown, mdadm 1.8.1 is a development version
(http://www.issociate.de/board/post/141215/Multipath_problem.html);
he recommends 1.8.0.  If that is indeed the case, sarge should
definitely not be released with this version.  I have set the severity
of this bug to grave because the whole point of running raid5 is that
if a drive fails, the data can be recovered from the remaining two
drives; not being able to assemble the md device to recover that data
is a showstopper.



-- System Information:
Debian Release: 3.1
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'testing')
Architecture: i386 (i686)
Kernel: Linux 2.6.10-1-k7
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)

