On Sun, Sep 02, 2007 at 04:07:30PM +1000, Nathan O wrote:
[snip trouble re raid5 and kernel panics]
> it spent too long resyncing and all is well. The degraded array is
> mounted and working fine. I erased and created an ext3 partition on
> the suspect drive and data is being copied to it as I type. I don't
> think this is a hardware problem, issues only happen if I add this
> drive to the array and leave it to resync for a couple of minutes.
>
> What should I try from here short of purchasing new hardware? I've
> included some of the different messages from my kernel log if they are
> of any use.

I have a couple of ideas:

First to the drive itself. Ensure that you have smartmontools
installed and run a manual long test. Then, since most of a
filesystem's blocks are empty, copying data onto the drive won't put
it through the same work as syncing it into a raid array. To simulate
that without using the raid5 kernel stuff, I would run wipe -k on the
whole drive. Yes, it will take a long time, but it will thoroughly
exercise every block of the drive; any errors should show up in
syslog. Then run mke2fs -c on it to do a badblocks scan while making
an ext2 (not 3, you don't need a journal) filesystem. Then run
e2fsck -c -c on it to do a read/write/read test. Finally, run a long
SMART test again. If after all this there are no errors or kernel
panics, you can trust the drive.

Then to the raid5 issue. Over the last day or three, there was a
thread on debian-user about the problems with raid5 itself (with no
mention of kernel bugs). Review it. Then determine whether you could
switch to raid1 or raid10, with or without LVM. From the error
messages you supplied, I'm guessing that raid5 uses different kernel
modules than raid1 and raid0. If there truly is a bug in the kernel
raid5 code, then getting away from it would seem to be prudent.

As always, I hope you have solid, reliable backups.

Doug.
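
P.S. In case it helps to see the drive test sequence above written
out, here is a rough dry-run sketch in Python. It only builds and
prints the commands; it does not run anything. The device path
/dev/sdX is a placeholder, the function name drive_test_plan is made
up for illustration, and the final "smartctl -l selftest" step (to
read back the self-test log) is my own addition, not something the
steps above require. Check each flag against the man pages on your
system before running anything by hand.

```python
import shlex


def drive_test_plan(device):
    """Return the test sequence described above as a list of argv lists.

    `device` is a placeholder such as /dev/sdX (hypothetical); every
    step after the first SMART test destroys the data on that device.
    """
    return [
        # Start a long SMART self-test (runs in the drive's background).
        ["smartctl", "-t", "long", device],
        # Overwrite the whole device; -k keeps the device node itself.
        ["wipe", "-k", device],
        # Make an ext2 filesystem, scanning for bad blocks first.
        ["mke2fs", "-c", device],
        # -c given twice asks for a non-destructive read-write test.
        ["e2fsck", "-c", "-c", device],
        # Second long SMART self-test after the drive has been exercised.
        ["smartctl", "-t", "long", device],
        # Assumption (not in the steps above): show the self-test log.
        ["smartctl", "-l", "selftest", device],
    ]


if __name__ == "__main__":
    # Dry run only: print each command so it can be reviewed first.
    for argv in drive_test_plan("/dev/sdX"):
        print(shlex.join(argv))
```

Note that smartctl -t long returns right away while the drive carries
on testing by itself, so wait for each SMART test to finish before
moving on to the next step, and keep an eye on syslog throughout.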