----- "Chris Samuel" <csam...@vpac.org> wrote:

The compute nodes are using SuperMicro H8DM8-2 based
with 32GB of ECC RAM.

Hi Chris,

I had MCE crashes on a Supermicro system (quad Xeon quad-core 2.4 GHz) that were driving me nuts for quite a while. It would take a couple of months to crash, which doesn't sound bad, but it was a real pain. I bought the machine from ASL and they worked with Supermicro to fix a microcode issue.

The reason I mention this is that, at least in this case, the BIOS version was the same before and after I ran the update.

Here is part of a message I got from ASL:

Note that you will be updating the BIOS from version 1.0b to 1.0b. In Supermicro wisdom, they released several updates using the same revision number.

After updating my 1.0b BIOS to the new 1.0b, the machine has been running solid since Christmas.

So, if you have two machines, one that crashes and one that doesn't, check the dates of the BIOSes even if the BIOS versions match.
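
In case it is useful, here is a rough sketch of that check in Python. It assumes Linux nodes that expose the DMI fields bios_version and bios_date under /sys/class/dmi/id. The function names are my own, the host names and dates in the example are made up for illustration, and how you gather the values from each node (ssh, pdsh, whatever you use) is left to you.

```python
from collections import defaultdict
from pathlib import Path

# Assumption: Linux sysfs location of the DMI/SMBIOS fields.
DMI_DIR = Path("/sys/class/dmi/id")


def read_local_bios(dmi_dir=DMI_DIR):
    """Return (bios_version, bios_date) for the machine this runs on."""
    version = (Path(dmi_dir) / "bios_version").read_text().strip()
    date = (Path(dmi_dir) / "bios_date").read_text().strip()
    return version, date


def find_same_version_different_date(nodes):
    """Find BIOS versions that were reported with more than one date.

    nodes maps hostname -> (bios_version, bios_date).
    Returns {version: {date: [hostnames]}} for the suspicious versions only.
    """
    by_version = defaultdict(lambda: defaultdict(list))
    for host, (version, date) in sorted(nodes.items()):
        by_version[version][date].append(host)
    # Keep only versions where the same version string hides several builds.
    return {v: dict(dates) for v, dates in by_version.items() if len(dates) > 1}


if __name__ == "__main__":
    # Hypothetical inventory: host names and dates are invented.
    inventory = {
        "node01": ("1.0b", "10/15/2007"),
        "node02": ("1.0b", "12/20/2007"),
        "node03": ("1.0b", "12/20/2007"),
    }
    for version, dates in find_same_version_different_date(inventory).items():
        print(f"BIOS {version} was reported with different dates:")
        for date, hosts in dates.items():
            print(f"  {date}: {', '.join(hosts)}")
```

Any version that shows up in the result with more than one date is worth a closer look, especially if the crashing node is the one with the older date.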

I hope this helps.

Steve
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
