>Message: 11
>Subject: ProLiant DL380 Frequent Lock-ups - SCSI driver?
>Date: Tue, 30 Sep 2003 16:00:56 -0400
>From: "Maurer, Justin" <[EMAIL PROTECTED]>
>To: <[EMAIL PROTECTED]>
>Reply-To: [EMAIL PROTECTED]

>Greetings,

>We have a Compaq/HP ProLiant DL380 G3 (GigE) experiencing
>frequent lockups. The machine is a build server running RHL 8.0 (Psyche)
>with the default 2.4.18-14smp kernel (the machine is a dual Xeon), and
>has almost constant disk activity. The machine locks up consistently,
>though not at any particular interval. Usually it will go for 16-24 hours
>without a reboot, but sometimes it can be 20 minutes, and sometimes it can
>be days at a time. I can't point to anything more than the constant
>disk activity that makes me say this, but could it be an issue with the
>SCSI driver? Here is what dmesg has to say on the subject:

>SCSI subsystem driver Revision: 1.00
>kmod: failed to exec /sbin/modprobe -s -k scsi_hostadapter, errno = 2
>Compaq CISS Driver (v 2.4.30)
>cciss: Device 0xb178 has been found at bus 1 dev 3 func 0
>cciss: using DAC cycles
>      blocks= 71122560 block_size= 512
>      heads= 255, sectors= 32, cylinders= 8716
>blk: queue c042e400, I/O limit 4294967295Mb (mask 0xffffffffffffffff)
>Partition check:
> cciss/c0d0: p1 p2 p3 p4 < p5 p6 >
>Journalled Block Device driver loaded

>The machine also has two integrated Broadcom BCM5703X GigE chips (only
>one is being used).

>Can anyone suggest troubleshooting techniques? Are any of these drivers
>known to be problematic?

>Thanks,

>Justin

Hi Justin,

Expect to find one of three things:
1) The SCSI controller firmware is out of step with the driver, or with another BIOS on the backplane or system board. This is a regular Compaq issue.
2) The SCSI controller is failing due to a bad chip (like a cache RAM chip) or a broken trace on the board.
3) A hard drive in the array is failing or has failed.

Compaq's issues with firmware and drivers are legendary, and that is the reason I quit using their servers. Every week a new Service Support Disk was released, and I had to shut down the server, flash every BIOS, and upgrade Insight Manager (if it was running Windows). Were it not for this issue, I would say that they have some of the best hardware on the market, and their support is great. But on Compaq, all the BIOS code must be at the same revision level (thus they all come together in an SSD), and then the driver must work with that BIOS level. You never know if it will work until you try it, which is a bad proposition on a production server.

Also, if the SCSI controller you are using didn't come with the system, determine if the current SSD contains a BIOS flash for it, flash everything, and check for a later driver.
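
Before you go hunting for a later driver, it helps to record exactly which cciss driver version is loaded now. Here is a rough Python sketch that pulls the version out of dmesg text like the excerpt you posted. The function names are my own invention, and the pattern only assumes the banner format shown in your output ("Compaq CISS Driver (v 2.4.30)"); other driver releases may print it differently.

```python
import re
import subprocess

# Sample lines copied from the dmesg excerpt in the original message.
SAMPLE_DMESG = """\
SCSI subsystem driver Revision: 1.00
Compaq CISS Driver (v 2.4.30)
cciss: Device 0xb178 has been found at bus 1 dev 3 func 0
cciss: using DAC cycles
"""


def read_dmesg():
    """Return the kernel ring buffer as text (may need root on some systems)."""
    return subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=True
    ).stdout


def cciss_driver_version(dmesg_text):
    """Return the cciss driver version as a tuple of ints, or None if absent.

    Assumes the banner looks like the one in the excerpt above.
    """
    match = re.search(r"Compaq CISS Driver \(v\s*(\d+(?:\.\d+)*)\)", dmesg_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def is_newer(candidate, current):
    """True if the candidate version tuple is later than the current one."""
    return candidate > current


if __name__ == "__main__":
    print(cciss_driver_version(SAMPLE_DMESG))  # prints (2, 4, 30)
```

On the server itself you would call cciss_driver_version(read_dmesg()) and compare the result, using is_newer, against whatever version the vendor's current driver package reports.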



Tom

Thomas S. Fortner
Burleson, Texas
[EMAIL PROTECTED]
"but we preach Christ crucified..."  1 Corinthians 1:23
