Package: mdadm
Version: v1.7.0 - 11 August 2004

Distribution: Debian Sarge
Kernel: 2.6.8-1-386
Hardware: AMD64 2800+ / MSI K8N Neo with NVIDIA nForce3 250Gb chipset
Hard disks: 2 x Maxtor 7Y250P0 (250GB) IDE
Software: mdadm - v1.7.0 - 11 August 2004
RAID1 on hda and hdc
Hello!
Problem Description: The RAID1 system was originally tested by simulating individual drive failures. This was done by disconnecting each drive in turn and by forcing a failure with 'mdadm' using the --set-faulty option. Everything worked as expected.
Then a real disk fault occurred! A system monitoring tool reported that '/dev/hda7' had unreadable (pending) sectors. Later, I tested the suspect hard disk with a Maxtor drive test utility, which confirmed a genuine unrecoverable disk read error. The 'md' driver tried to switch to partition '/dev/hdc7', but the whole 'hdc' disk was not accessible because of a "dma_timer_expiry" error. After the faulty 'hda' hard drive was replaced and the system rebooted, RAID1 resynced and everything works fine again. A system log and an mdstat dump are appended.
It would seem that RAID1 failed to detect and report this error as would be expected in the circumstances. Is this a correct assessment? Do you have any comments or advice in this case? Cheers and many thanks in advance.
Peter Sahlmann
Network Administrator
---------------------------------- Log from /var/log/kern.log ----------------------------------
Jan 20 22:00:41 andros kernel: hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Jan 20 22:00:41 andros kernel: hda: dma_intr: error=0x40 { UncorrectableError }, LBAsect=423121589, high=25, low=3691189, sector=423121589
Jan 20 22:00:41 andros kernel: end_request: I/O error, dev hda, sector 423121589
Jan 20 22:00:41 andros kernel: raid1: Disk failure on hda7, disabling device.
Jan 20 22:00:41 andros kernel: ^IOperation continuing on 1 devices
Jan 20 22:00:41 andros kernel: raid1: hda7: rescheduling sector 227803256
Jan 20 22:00:41 andros kernel: raid1: hdc7: redirecting sector 227803256 to another mirror
Jan 20 22:01:02 andros kernel: hdc: dma_timer_expiry: dma status == 0x60
Jan 20 22:01:02 andros kernel: hdc: DMA timeout retry
Jan 20 22:01:02 andros kernel: hdc: timeout waiting for DMA
Jan 20 22:01:02 andros kernel: hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
Jan 20 22:01:02 andros kernel:
Jan 20 22:01:02 andros kernel: hdc: drive not ready for command
Jan 20 22:01:02 andros kernel: raid1: hdc7: rescheduling sector 227803256
Jan 20 22:01:02 andros kernel: raid1: hdc7: redirecting sector 227803256 to another mirror
Jan 20 22:01:02 andros kernel: hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
Jan 20 22:01:02 andros kernel:
Jan 20 22:01:02 andros kernel: hdc: drive not ready for command
Jan 20 22:01:02 andros kernel: hdc: status error: status=0x58 { DriveReady SeekComplete DataRequest }
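
As an illustration of the point raised above, here is a minimal Python sketch that scans kernel log lines like the ones in this log and lists disks that logged IDE errors without any of their partitions being marked failed by raid1. The function name, the regular expressions and the choice of keywords are assumptions made for this example only; they are not part of mdadm or the md driver.

```python
import re

# Assumed patterns, derived only from the message shapes seen in the log above.
FAILURE_RE = re.compile(r"kernel: raid1: Disk failure on (\w+), disabling device")
IDE_ERROR_RE = re.compile(r"kernel: (hd[a-z]): .*(error|timeout|not ready)",
                          re.IGNORECASE)


def unreported_disks(lines):
    """Return disks that logged IDE errors but had no member failed by raid1."""
    failed_members = set()   # e.g. {"hda7"}
    erroring_disks = set()   # e.g. {"hda", "hdc"}
    for line in lines:
        m = FAILURE_RE.search(line)
        if m:
            failed_members.add(m.group(1))
            continue
        m = IDE_ERROR_RE.search(line)
        if m:
            erroring_disks.add(m.group(1))
    # Strip the partition number so "hda7" compares against "hda".
    failed_disks = {re.sub(r"\d+$", "", member) for member in failed_members}
    return sorted(erroring_disks - failed_disks)


if __name__ == "__main__":
    sample = [
        "Jan 20 22:00:41 andros kernel: hda: dma_intr: error=0x40 { UncorrectableError }",
        "Jan 20 22:00:41 andros kernel: raid1: Disk failure on hda7, disabling device.",
        "Jan 20 22:01:02 andros kernel: hdc: timeout waiting for DMA",
        "Jan 20 22:01:02 andros kernel: hdc: drive not ready for command",
    ]
    print(unreported_disks(sample))
```

Run against the four sample lines, which are taken from the log above, the sketch reports 'hdc': the disk produced errors, but no "Disk failure" message was logged for it.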
---------------------------------- mdstat dump ----------------------------------
andros:~# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 hda1[0] hdc1[1]
      979840 blocks [2/2] [UU]
md3 : active raid1 hda6[0] hdc6[1]
      48829440 blocks [2/2] [UU]
md4 : active raid1 hda7[0] hdc7[1]
      147452480 blocks [2/2] [UU]
md1 : active raid1 hda2[0] hdc2[1]
      1951808 blocks [2/2] [UU]
md2 : active raid1 hda5[0] hdc5[1]
      45897600 blocks [2/2] [UU]
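
The dump above was taken after the resync, so every array shows [2/2] [UU]. For monitoring purposes, a minimal Python sketch of a degraded-array check on /proc/mdstat text could look like the following. The function name and the degraded sample are hypothetical and only illustrate the usual "[total/active] [U_]" status notation; the degraded sample was not captured from this system.

```python
import re

# Array name such as "md4 : ", and the status field such as "[2/1] [_U]".
ARRAY_RE = re.compile(r"\b(md\d+) : ")
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\] \[([U_]+)\]")


def degraded_arrays(mdstat_text):
    """Return names of arrays whose status shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        m = ARRAY_RE.search(line)
        if m:
            current = m.group(1)
        s = STATUS_RE.search(line)
        if s and current is not None:
            total, active = int(s.group(1)), int(s.group(2))
            # Fewer active than configured devices, or "_" in the member map.
            if active < total or "_" in s.group(3):
                degraded.append(current)
            current = None
    return degraded


if __name__ == "__main__":
    healthy = (
        "Personalities : [raid1]\n"
        "md4 : active raid1 hda7[0] hdc7[1]\n"
        "      147452480 blocks [2/2] [UU]\n"
    )
    # Hypothetical degraded state, for illustration only.
    degraded = (
        "Personalities : [raid1]\n"
        "md4 : active raid1 hda7[0](F) hdc7[1]\n"
        "      147452480 blocks [2/1] [_U]\n"
    )
    print(degraded_arrays(healthy))
    print(degraded_arrays(degraded))
```

In practice the text would be read from /proc/mdstat. Note that a check like this only sees what the md driver has already recorded, so it would not have flagged the 'hdc' timeouts described in this report.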
----------------------------------- /etc/fstab -----------------------------------
# /etc/fstab: static file system information.
#
# <file system> <mount point>   <type>   <options>                  <dump> <pass>
proc            /proc           proc     defaults                   0      0
/dev/md2        /               ext3     defaults,errors=remount-ro 0      1
/dev/md0        /boot           ext3     defaults                   0      2
/dev/md4        /files          ext3     nosuid                     0      2
/dev/md3        /home           ext3     defaults                   0      2
/dev/md1        none            swap     sw                         0      0
/dev/hdd        /media/cdrom    iso9660  ro,user,noauto             0      0
/dev/fd0        /media/floppy0  auto     rw,user,noauto             0      0
/dev/sda1       /mnt/usbstick   vfat     rw,user,noauto,umask=000   0      0