Hi,

thanks for your reply!

On Thursday 28 September 2006 16:02, you wrote:
> I bet if you decode the MCE it will say uncorrectable ECC memory error.

You'd win that bet.

> memtest86 doesn't see correctable memory errors.

As far as I can remember, memtest86 includes tests that also detect
correctable ECC errors.

> It sounds like you have a pile of correctable (soft?) memory errors that
> occasionally become uncorrectable.

Yes, we do. But about 75% of our nodes have never shown correctable ECC errors,
and some of those crashed anyway. On the other hand, we have nodes with a bunch
of correctable ECC errors that have been stable since the first day.

Cheers, Thomas
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
