On AMD sensors, at least, the reading is a 'relative value', with 70C
indicating an overheat.
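
Since the number is relative, one way to use it is to ignore the absolute
figure and look at two distances instead: how far the reading sits from the
70C overheat mark, and how far it has risen above an idle reading (the
"take a no-load reading and call it normal" idea quoted below). A minimal
Python sketch of that, where the function names and the 10C warning margin
are made up for illustration and are not taken from any tool:

```python
# Sketch only: OVERHEAT_C is the 70C figure mentioned above; the default
# margin is an illustrative assumption, not a vendor-specified value.
OVERHEAT_C = 70.0


def headroom(reading_c, limit_c=OVERHEAT_C):
    """Degrees left before the reading reaches the overheat mark."""
    return limit_c - reading_c


def rise_over_idle(reading_c, idle_c):
    """How far the reading has climbed above the idle ('normal') reading."""
    return reading_c - idle_c


def should_alarm(reading_c, margin_c=10.0, limit_c=OVERHEAT_C):
    """True once the reading is within margin_c degrees of the limit."""
    return headroom(reading_c, limit_c) <= margin_c
```

For example, with a hypothetical idle reading of 38 and a loaded reading of
63, you would call headroom(63.0), rise_over_idle(63.0, 38.0) and
should_alarm(63.0), then tune the margin to whatever your own nodes show.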

The processor is fine unless it is actually clocking itself down to
a lower ACPI power state. It is seemingly impossible to overheat
modern CPUs. When it's clocking down you should see messages in
/var/log/messages and also in the IPMI SEL (system event log).
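
A rough way to check for those messages is to filter the log for throttling
events. The Python sketch below assumes the events contain some form of the
word "throttled" (as in the kernel's "cpu clock throttled" lines); that
pattern and the function names are assumptions, so adjust them to whatever
your own logs actually say:

```python
import re

# Assumption: throttle events mention "throttle"/"throttled"/"throttling".
# Change this pattern if your kernel or distro words it differently.
THROTTLE_RE = re.compile(r"throttl", re.IGNORECASE)


def find_throttle_lines(lines):
    """Return the lines (newline stripped) that look like throttle events."""
    return [line.rstrip("\n") for line in lines if THROTTLE_RE.search(line)]


def scan_log(path="/var/log/messages"):
    """Scan a syslog-style file; needs read permission on that file."""
    with open(path, errors="replace") as fh:
        return find_throttle_lines(fh)
```

Calling scan_log() with no arguments reads /var/log/messages. The same
filter could be run over saved output of `ipmitool sel list`, though SEL
wording differs by vendor, so the pattern would likely need changing.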

Don't trust IPMI. Ever. lm_sensors actually reads the raw value from
the CPU and requires a specifically written kernel module to do so.
Who knows what kind of junk math the IPMI does.
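
If you want to see the values those kernel modules export without going
through any extra layer, the Linux hwmon sysfs files can be read directly.
This is a sketch under assumptions: it expects temp*_input files (in
millidegrees C) directly under /sys/class/hwmon/hwmonN/, which is the layout
on current kernels; older kernels may put them under a device/ subdirectory
instead. The function name is made up for illustration:

```python
import glob
import os


def read_hwmon_temps(root="/sys/class/hwmon"):
    """Return {(chip_name, file_name): degrees_C} from temp*_input files."""
    temps = {}
    pattern = os.path.join(root, "hwmon*", "temp*_input")
    for path in sorted(glob.glob(pattern)):
        chip_dir = os.path.dirname(path)
        try:
            # The "name" file holds the driver name, e.g. k10temp on AMD.
            with open(os.path.join(chip_dir, "name")) as fh:
                chip = fh.read().strip()
        except OSError:
            chip = os.path.basename(chip_dir)  # fall back to hwmonN
        try:
            with open(path) as fh:
                millideg = int(fh.read().strip())
        except (OSError, ValueError):
            continue  # sensor not readable right now; skip it
        temps[(chip, os.path.basename(path))] = millideg / 1000.0
    return temps
```

Note that this returns whatever the driver reports, so on chips where the
value is relative it is still relative; no correction is applied here.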

Supermicro IPMI implementations are particularly bad at reporting
temperatures properly (like, really awful).

Most of my experience here is with AMD. ymmv :)



2012/8/9 John Hearns <hear...@googlemail.com>:
> Well, I don't use lm_sensors for a start!
> Use the ipmitool utility to probe the readings from BMC cards (ILO,
> DRAC, they're the same thing).
> I don't trust the absolute calibration of the sensors - generally
> you're looking at setting a limit on which to alarm or shutdown so
> just take a reading under no load on the CPU and call that the
> 'normal' reading.
> I may be wrong. YMMV.
>
> On 08/08/2012, Vincent Diepeveen <d...@xs4all.nl> wrote:
>> hi,
>>
>> How do you guys monitor the CPU core temperatures?
>>
>> When I run lm_sensors, every node reads about 30C higher than the few
>> nodes I tried comparing against Windows.
>> Under full load it also reports temperatures in the high 60s, and I have
>> seen up to 78C reported.
>> I am guessing it should be 30-40+ at most.
>>
>> It blows cool air from and outside the cpu's. Nothing is even 'warm'.
>>
>> Nodes here: Supermicro X7DWE with Xeon L5420s inside. They are not
>> overclocked.
>>
>> I also downloaded some definitions for similar motherboards - it seems
>> they were uploaded for motherboards with dual-core Xeons
>> and such, not for the quad-cores. None of those definitions 'corrects' the
>> temperature of the quad-core Xeons; they basically kick out
>> readings that are not getting used.
>>
>> Now i bet several clusters/supercomputers had these cpu's. How did
>> you solve this problem with the intel L5420's?
>>
>> Maybe someone still has the lm_sensors script lying around somewhere
>> fixing it for the intel Xeons?
>>
>> Thanks in advance,
>> Vincent
>>
>>
>>
>>
>> _______________________________________________
>> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
>> To change your subscription (digest mode or unsubscribe) visit
>> http://www.beowulf.org/mailman/listinfo/beowulf
>>
