Damn, I really must learn how to use the BTS properly... resending to the BTS.
As mentioned in the earlier email, I flashed the RAID card to the latest Dell firmware. The server bailed last night. From kern.log on a remote logging host:
Jan 25 00:54:47 tardis kernel: aacraid: Host adapter reset request. SCSI hang ?
Jan 25 00:54:47 tardis kernel: scsi: device set offline - command error recover failed: host 0 channel 0 id 0 lun 0
Jan 25 00:54:47 tardis kernel: I/O error: dev 08:06, sector 1506680
Jan 25 00:54:47 tardis kernel: SCSI disk error : host 0 channel 0 id 0 lun 0 return code = 6000000
Jan 25 00:54:47 tardis kernel: I/O error: dev 08:06, sector 1953513
Jan 25 00:54:47 tardis kernel: I/O error: dev 08:06, sector 1953520
Jan 25 00:54:47 tardis kernel: I/O error in filesystem ("sd(8,6)") meta-data dev sd(8,6) block 0x1dcee9 ("xlog_iodone") error 5 buf count 3584
Jan 25 00:54:47 tardis kernel: I/O error: dev 08:06, sector 1506680
Jan 25 00:54:47 tardis kernel: xfs_force_shutdown(sd(8,6),0x2) called from line 959 of file xfs_log.c. Return address = 0xf89274aa
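In case it helps anyone reading the log: the "dev 08:06" in the I/O error lines is the hex major:minor pair, and major 8 with minor 6 should correspond to sda6 (matching the "sd(8,6)" in the XFS messages). Below is a rough Python sketch for pulling the device and sector out of such lines. The function name decode_io_error is made up for illustration, and the sketch assumes the 2.4-style message format shown above and only maps SCSI disk major 8.

```python
import re

# Matches 2.4-style "I/O error: dev MM:mm, sector N" kernel log lines.
# Assumption: major and minor are printed as two-digit lowercase hex.
IO_ERROR = re.compile(r"I/O error: dev ([0-9a-f]{2}):([0-9a-f]{2}), sector (\d+)")


def decode_io_error(line):
    """Return (device_name, sector) for an I/O error line, or None."""
    m = IO_ERROR.search(line)
    if m is None:
        return None
    major = int(m.group(1), 16)
    minor = int(m.group(2), 16)
    sector = int(m.group(3))
    if major != 8:
        # Only the first SCSI disk major is mapped in this sketch;
        # anything else is reported as a plain "major:minor" string.
        return ("%d:%d" % (major, minor), sector)
    # For major 8, each disk owns 16 minors: minor // 16 selects the
    # disk (sda, sdb, ...) and minor % 16 the partition (0 = whole disk).
    disk = chr(ord("a") + minor // 16)
    part = minor % 16
    name = "sd" + disk + (str(part) if part else "")
    return (name, sector)


if __name__ == "__main__":
    sample = "Jan 25 00:54:47 tardis kernel: I/O error: dev 08:06, sector 1506680"
    print(decode_io_error(sample))
```

Running it over the whole kern.log would give a quick list of which partitions and sectors were hit when the adapter reset.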
$ lspci -vvv
00:04.0 RAID bus controller: Digital Equipment Corporation DECchip 21554 (rev 01)
        Subsystem: Adaptec Dell PowerEdge RAID Controller 2
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 128, cache line size 08
        Interrupt: pin A routed to IRQ 61
        Region 0: Memory at f4000000 (32-bit, non-prefetchable) [size=8K]
        Region 1: I/O ports at 1000 [size=256]
        Expansion ROM at <unassigned> [disabled] [size=128K]
        Capabilities: <available only to root>
$ cat /proc/scsi/aacraid/0
Adaptec Raid Controller 1.1-3 Nov 14 2004 11:01:31, scsi hba number 0
kernel: 2.8-4[6089]
monitor: 2.8-4[6089]
bios: 2.8-0[6089]
serial: 895e87fafaf001
$ uname -a
Linux tardis 2.4.27-ainet-p3-smp #1 SMP Sun Nov 14 10:54:19 GMT 2004 i686 GNU/Linux
Built from kernel-source-2.4.27 (2.4.27-5).
I'll build a kernel from the latest Debian 2.4.27 source (2.4.27-8) and reboot this evening, though bear in mind that the server has taken up to ~40 days to crash in the past.
Yell if you need more info.
Ronny

-- 
Technical Director
Amazing Internet Ltd, London
t: +44 20 8607 9535
f: +44 20 8607 9536
w: www.amazinginternet.com