I have been trying to solve this problem without success for about a year now.

My hardware is:

- Dell NetPlex 486SX/25
- 32MB RAM
- AHA1542C SCSI Controller
- 3COM 509B Ethernet Controller
- 2-port 16550A Serial Card
- 3 x SCSI Hard Drives External
- 1 x SCSI Jaz Drive External
- 1 x SCSI Sony DDS-3 DAT Drive External

My software is:

- Debian GNU/Linux 2.1 (slink)
- Glibc 2.1 added from potato
- Linux 2.2.11 kernel

My problem is:

After the system has been up for a random length of time (usually about a week or so), it crashes in the middle of the night during a full backup to the DAT drive using cpio. The machine either hangs in an infinite loop or ends in a kernel panic.

I was originally running Debian 2.1 with a 2.0.36 kernel, and after a crash I would see the following scrolling endlessly off the screen:

    Sending SCSI DID_RESET...
    Sending SCSI DID_RESET...
    Sending SCSI DID_RESET...
    Sending SCSI DID_RESET...
    Sending SCSI DID_RESET...
    other scsi messages, etc...

Since installing the 2.2 kernel and the associated upgraded packages as detailed in the errata for slink, the crashes *seem* to occur less often, but this morning I saw:

    aha1542_out failed...
    aha1542_out failed...
    failed to reset target...
    ...
    Kernel panic: unable to find empty mailbox for aha1542...

and the system was locked up.

Since upgrading to the 2.2 kernel, I have also noticed periodic messages in the syslog (about one per day) like this:

    aha1542.c: interrupt received but no mail

The system will run perfectly for a week or so, doing this same backup routine every night, and then it will just pull this trick on some random night.

I have tried:

- disconnecting all devices except the tape drive and hard drives
- installing the highest quality cables I can find for the external devices (this machine currently has about $400 US worth of Granite Digital cables hanging off of it).
- installing a Granite Digital active terminator on the end of the SCSI chain
- verifying that there are no interrupt or IO port conflicts, both in the device jumper configurations and from the /proc filesystem

I am completely at my wits' end with this. I have searched DejaNews repeatedly for any discussions of kernel panics and crashes with Adaptec cards, Linux, SCSI in general, etc., and all I can find is one thread from about a year ago mentioning the same sorts of problems but no solution.

Is this a problem that anyone else has ever had with Linux and an AHA1542C in particular or SCSI in general? Can anyone recommend which part of the setup I should change or eliminate? Is it a bad card? Are Adaptec cards bad in general? Is the aha1542 SCSI driver problematic? Is Linux SCSI in general problematic?