On 5/25/26 04:16, Andy Smith wrote:
Hi,


Hello.  :-)


The only machines I have ECC RAM in also have Machine Check Exception
messages go to a log in the firmware. I have experienced running
memtest86 for a few successful complete passes and then finding messages
in the firmware log about things that were corrected, which enabled me
to locate the bad stick.

Also, most of the time I've had RAM fail it's done so in a way that ECC
can't fix, because typical ECC can only correct a single bit flip per
memory word.

So, I have continued to find memtest86 useful although it would be a bit
less so if I had no way to see the MCE log.


It sounds like I need to learn about Debian's "collectd-core" package and mcelog(8) (?).
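
As a starting point, here is a rough sketch of one way to look for corrected errors from a running system. It assumes the kernel's EDAC driver is loaded for the memory controller and exposes its usual ce_count and ue_count counters under /sys/devices/system/edac/mc; that may not hold on machines where the firmware handles corrected errors itself and only records them in its own log. The function name read_edac_counts is my own invention, not part of any tool.

```python
from pathlib import Path


def read_edac_counts(base="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce": int|None, "ue": int|None}}.

    "ce" is the corrected-error count, "ue" the uncorrected-error count.
    An empty dict means no EDAC memory controllers were found under base
    (an assumption-laden default path; adjust for your system).
    """
    counts = {}
    root = Path(base)
    if not root.is_dir():
        return counts
    for mc in sorted(root.glob("mc[0-9]*")):
        entry = {}
        for key, name in (("ce", "ce_count"), ("ue", "ue_count")):
            try:
                entry[key] = int((mc / name).read_text().strip())
            except (OSError, ValueError):
                # Counter missing or unreadable on this controller.
                entry[key] = None
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    found = read_edac_counts()
    if not found:
        print("no EDAC memory controllers found")
    for mc, c in found.items():
        print(f"{mc}: corrected={c['ce']} uncorrected={c['ue']}")
```

A non-zero corrected count that keeps growing would be the sort of thing memtest86 can miss; an empty result only means EDAC is not reporting here, not that the RAM is healthy.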


David
