On Sun, Apr 24, 2022 at 11:32:53PM +0200, Claudio Jeker wrote:
> On Sun, Apr 24, 2022 at 02:30:37PM -0400, Bryan Steele wrote:
> > On Sun, Apr 24, 2022 at 07:06:19PM +0200, Claudio Jeker wrote:
> > > On Ryzen CPUs each CCD has a temp sensor. If the CPU has CCDs (which
> > > excludes Zen APU CPUs) this should show additional temp info. This is
> > > based on info from the Linux k10temp driver.
> > > 
> > > Additionally use the MSRs defined in "Open-Source Register Reference For
> > > AMD Family 17h Processors" to measure the CPU core frequency.
> > > That should be the actual speed of the CPU core during the measuring
> > > interval.

Indeed.

Intel have aperf and mperf (and pperf and smi_count) MSRs too, so
I would argue that an AMD specific driver is the wrong place for this.
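The arithmetic is the same on either vendor, which is part of why I think
it belongs somewhere common. A minimal sketch, assuming both counters only
advance while the core is in C0, that aperf ticks at the actual clock and
mperf at a fixed reference clock. The struct, the function name and the
ref_hz parameter are made up for illustration, and it uses floating point,
so it is userland-style code rather than something to paste into a driver:

```c
#include <stdint.h>

/* Hypothetical snapshot of both counters, read back to back on one CPU. */
struct perf_sample {
	uint64_t aperf;	/* assumed: ticks at the actual core clock */
	uint64_t mperf;	/* assumed: ticks at a fixed reference clock */
};

/*
 * Effective frequency in Hz over the interval between two samples.
 * ref_hz is the rate mperf is assumed to tick at; where that number
 * comes from is outside this sketch.
 */
uint64_t
effective_hz(const struct perf_sample *prev, const struct perf_sample *cur,
    uint64_t ref_hz)
{
	uint64_t da = cur->aperf - prev->aperf;
	uint64_t dm = cur->mperf - prev->mperf;

	if (dm == 0)
		return 0;	/* no reference ticks in the interval */

	/* Scale the reference clock by the aperf/mperf ratio. */
	return (uint64_t)((double)da / (double)dm * (double)ref_hz);
}
```

Only the ratio of the two deltas matters, so the sampling interval does
not have to be exact.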

> > > 
> > > On my T14g2 the output is now for example:
> > > ksmn0.temp0                       63.88 degC          Tctl
> > > ksmn0.frequency0            3553141515.00 Hz          CPU0
> > > ksmn0.frequency1            3549080315.00 Hz          CPU2
> > > ksmn0.frequency2            3552369937.00 Hz          CPU4
> > > ksmn0.frequency3            3546055048.00 Hz          CPU6
> > > ksmn0.frequency4            3546854449.00 Hz          CPU8
> > > ksmn0.frequency5            3543869698.00 Hz          CPU10
> > > ksmn0.frequency6            3542551127.00 Hz          CPU12
> > > ksmn0.frequency7            4441623647.00 Hz          CPU14
> > > 
> > > It is interesting to watch turbo kick in and how temp causes the CPU to
> > > throttle.

Yes, I've been exporting it via some quick and dirty kstat code at work
on a bunch of boxes. It gets stored along with all the other kstats and
the other metrics we're interested in, such as the CPU stats the kernel
collects and the counters that specific programs keep track of and
report along with their own getrusage.

We have one thing in particular that has a fairly constant workload.
When it's the only thing running, the effective performance from these
MSRs shows the clock running at about 1.2GHz, and the program says it's
averaging about 10% CPU. If you recompile something, the effective
performance of the system spikes to 2.8GHz and that program says its CPU
usage halves, but it's doing the same amount of work according to all
the other metrics it reports.
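
Those numbers roughly hang together if you assume the program needs a
fixed number of core cycles per second and is not waiting on memory or
I/O: 10% of 1.2GHz is about 120 million cycles per second, and the same
cycles at 2.8GHz come to a bit over 4%, so "about half" is what you would
expect. A sketch of that scaling, with a made-up function name and the
fixed-cycles assumption baked in:

```c
/*
 * Utilisation expected at new_hz if the workload needs the same number
 * of core cycles per second as it did at old_hz. Assumes the work is
 * purely cycle bound, which real programs only approximate.
 */
double
scaled_util(double util, double old_hz, double new_hz)
{
	if (new_hz <= 0.0)
		return 0.0;
	return util * old_hz / new_hz;
}
```

For example, scaled_util(0.10, 1.2e9, 2.8e9) comes out a little under
0.043.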

Pretty cool.

I don't think this driver is the right place to read the MSRs though,
and I'm not sure the cpu driver is the right place either. There's
a bunch of other MSRs on recent CPUs from both AMD and Intel that
can report power/energy measurements (the Running Average Power
Limit aka RAPL stuff), but on AMD those measurements are at the core
and package level rather than per thread, which is what our cpu driver
attaches to. From what I remember the Intel RAPL bits report on DRAM
and on different parts of the die.
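
Whichever driver ends up owning them, the RAPL counters need a bit of
care to turn into watts. A rough sketch, assuming the energy counter is a
wrapping 32-bit value and that the unit register carries the energy
status unit in bits 12:8 as a negative power of two in joules. That is
how I remember it being documented, so check the vendor manuals before
relying on it. The function and parameter names are hypothetical:

```c
#include <stdint.h>

/*
 * Average power in watts between two reads of an energy counter.
 * unit_msr is the raw value of the RAPL unit register; the field
 * position used here is an assumption, see above.
 */
double
rapl_watts(uint32_t prev, uint32_t cur, uint64_t unit_msr, double seconds)
{
	unsigned int esu = (unit_msr >> 8) & 0x1f;
	uint32_t delta = cur - prev;	/* unsigned math absorbs one wrap */
	double joules = (double)delta / (double)(1ULL << esu);

	if (seconds <= 0.0)
		return 0.0;
	return joules / seconds;
}
```

The counter has to be sampled often enough that it cannot wrap more than
once between reads.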

dlg
