Re: [PATCH] x86: retrieve and log CPU frequency information

Roger Pau Monné Thu, 14 May 2020 08:35:03 -0700

On Thu, May 14, 2020 at 03:38:18PM +0200, Jan Beulich wrote:
> On 14.05.2020 15:10, Roger Pau Monné wrote:
> > On Wed, Apr 15, 2020 at 01:55:24PM +0200, Jan Beulich wrote:
> >> While from just a single Skylake system it is already clear that we
> >> can't base any of our logic on CPUID leaf 15 [1] (leaf 16 is
> >> documented to be used for display purposes only anyway), logging this
> >> information may still give us some reference in case of problems as well
> >> as for future work. Additionally on the AMD side it is unclear whether
> >> the deviation between reported and measured frequencies is because of us
> >> not doing well, or because of nominal and actual frequencies being quite
> >> far apart.
> > 
> > Can you add some reference to the AMD implementation? I've looked at
> > the PMs and haven't been able to find a description of some of the
> > MSRs, like 0xC0010064.
> 
> Take a look at
> 
> https://developer.amd.com/resources/developer-guides-manuals/
> 
> I'm unconvinced a reference needs adding here.


Do you think it would be sensible to introduce some defines for at
least 0xC0010064? (i.e. MSR_AMD_PSTATE_DEF_BASE)

I think it would make it easier to find in the manuals.
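
Something along these lines is what I have in mind. This is only a
sketch: the base name is the one suggested above, while the
MSR_AMD_PSTATE_DEF() helper and the assumption that the P-state
definition registers are consecutive MSRs starting at the base are mine
and would need checking against the manuals.

```c
/* Sketch only: hypothetical names, not existing Xen definitions. */
#define MSR_AMD_PSTATE_DEF_BASE   0xc0010064U

/* Assumption: P-state definition n lives at base + n. */
#define MSR_AMD_PSTATE_DEF(n)     (MSR_AMD_PSTATE_DEF_BASE + (n))
```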

> 
> >> --- a/xen/arch/x86/cpu/intel.c
> >> +++ b/xen/arch/x86/cpu/intel.c
> >> @@ -378,6 +378,72 @@ static void init_intel(struct cpuinfo_x8
> >>         ( c->cpuid_level >= 0x00000006 ) &&
> >>         ( cpuid_eax(0x00000006) & (1u<<2) ) )
> >>            __set_bit(X86_FEATURE_ARAT, c->x86_capability);
> >> +
> > 
> > I would split this into a separate helper, ie: intel_log_freq. That
> > will allow you to exit early and reduce some of the indentation IMO.
> 
> Can do; splitting this for AMD/Hygon however was merely to
> facilitate using it for both vendors, though.
> 
> >> +    if ( (opt_cpu_info && !(c->apicid & (c->x86_num_siblings - 1))) ||
> >> +         c == &boot_cpu_data )
> >> +    {
> >> +        unsigned int eax, ebx, ecx, edx;
> >> +        uint64_t msrval;
> >> +
> >> +        if ( c->cpuid_level >= 0x15 )
> >> +        {
> >> +            cpuid(0x15, &eax, &ebx, &ecx, &edx);
> >> +            if ( ecx && ebx && eax )
> >> +            {
> >> +                unsigned long long val = ecx;
> >> +
> >> +                val *= ebx;
> >> +                do_div(val, eax);
> >> +                printk("CPU%u: TSC: %uMHz * %u / %u = %LuMHz\n",
> >> +                       smp_processor_id(), ecx, ebx, eax, val);
> >> +            }
> >> +            else if ( ecx | eax | ebx )
> >> +            {
> >> +                printk("CPU%u: TSC:", smp_processor_id());
> >> +                if ( ecx )
> >> +                    printk(" core: %uMHz", ecx);
> >> +                if ( ebx && eax )
> >> +                    printk(" ratio: %u / %u", ebx, eax);
> >> +                printk("\n");
> >> +            }
> >> +        }
> >> +
> >> +        if ( c->cpuid_level >= 0x16 )
> >> +        {
> >> +            cpuid(0x16, &eax, &ebx, &ecx, &edx);
> >> +            if ( ecx | eax | ebx )
> >> +            {
> >> +                printk("CPU%u:", smp_processor_id());
> >> +                if ( ecx )
> >> +                    printk(" bus: %uMHz", ecx);
> >> +                if ( eax )
> >> +                    printk(" base: %uMHz", eax);
> >> +                if ( ebx )
> >> +                    printk(" max: %uMHz", ebx);
> >> +                printk("\n");
> >> +            }
> >> +        }
> >> +
> >> +        if ( !rdmsr_safe(MSR_INTEL_PLATFORM_INFO, msrval) &&
> >> +             (uint8_t)(msrval >> 8) )
> > 
> > I would introduce a mask for it; that would be cleaner, since you use it
> > here and below (and would avoid the casting to uint8_t).
> 
> To avoid the casts (also below) I could introduce local variables.
> I specifically wanted to avoid MASK_EXTR() such that the rest of the
> calculations in
> 
>             if ( (uint8_t)(msrval >> 40) )
>                 printk("%u..", (factor * (uint8_t)(msrval >> 40) + 50) / 100);
>             printk("%u MHz\n", (factor * (uint8_t)(msrval >> 8) + 50) / 100);
> 
> can be done as 32-bit arithmetic.

Might be cleaner with the local variables.
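
Roughly what I'm picturing, as a standalone sketch rather than a
proposed hunk. The function and variable names are made up, and calling
the two fields "min" and "max" is my reading of the printed range. The
ratio fields are extracted once into unsigned int locals, so the scaling
stays 32-bit as you wanted:

```c
#include <stdint.h>

/* factor is in units of 10kHz; adding 50 rounds to the nearest MHz. */
static unsigned int ratio_to_mhz(unsigned int factor, unsigned int ratio)
{
    return (factor * ratio + 50) / 100;
}

/* Hypothetical helper: decode both ratio fields of the MSR value. */
static void platform_info_mhz(uint64_t msrval, unsigned int factor,
                              unsigned int *min_mhz, unsigned int *max_mhz)
{
    /* Truncate once to 8 bits, then work with plain 32-bit locals. */
    unsigned int min_ratio = (uint8_t)(msrval >> 40); /* bits 47:40 */
    unsigned int max_ratio = (uint8_t)(msrval >> 8);  /* bits 15:8 */

    *min_mhz = ratio_to_mhz(factor, min_ratio);
    *max_mhz = ratio_to_mhz(factor, max_ratio);
}
```

The caller would then only print min_mhz when it is non-zero, as the
patch already does for the (msrval >> 40) field.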

> >> +        {
> >> +            unsigned int factor = 10000;
> >> +
> >> +            if ( c->x86 == 6 )
> >> +                switch ( c->x86_model )
> >> +                {
> >> +                case 0x1a: case 0x1e: case 0x1f: case 0x2e: /* Nehalem */
> >> +                case 0x25: case 0x2c: case 0x2f: /* Westmere */
> >> +                    factor = 13333;
> > 
> > The SDM lists ratio * 100MHz without any notes, why are those models
> > different, is this some errata?
> 
> Did you go through the MSR lists for the various models? It's there
> where I found this anomaly, not in any spec updates.

My bad, I think I was looking at the Atom table, and didn't realize
there were multiple tables instead of a single table with different
notes for models.

> 
> >> +                    break;
> >> +                }
> >> +
> >> +            printk("CPU%u: ", smp_processor_id());
> >> +            if ( (uint8_t)(msrval >> 40) )
> >> +                printk("%u..", (factor * (uint8_t)(msrval >> 40) + 50) / 100);
> >> +            printk("%u MHz\n", (factor * (uint8_t)(msrval >> 8) + 50) / 100);
> > 
> > Since you are calculating using Hz, should you use an unsigned long
> > factor to prevent capping at 4GHz?
> 
> Hmm, the calculation looks to be in units of 10kHz, until the division
> by 100. I don't think we'd cap at 4GHz this way.

Oh yes, sorry, it's kHz, not Hz.
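
Spelling out the bound to convince myself (a sketch, the helper name is
made up): with the factor in units of 10kHz and the ratio limited to 8
bits, the largest intermediate value is nowhere near 32 bits.

```c
#include <stdint.h>

/*
 * Worst case of (factor * ratio + 50): the larger of the two factors
 * used in the patch (13333) times the largest 8-bit ratio (255).
 */
static uint32_t worst_case_intermediate(void)
{
    return 13333u * 255u + 50u;
}
```

That is 3399965 before the division by 100, so the printed value tops
out at 33999 MHz without any overflow.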

> 
> >> --- a/xen/include/asm-x86/msr.h
> >> +++ b/xen/include/asm-x86/msr.h
> >> @@ -40,8 +40,8 @@ static inline void wrmsrl(unsigned int m
> >>  
> >>  /* rdmsr with exception handling */
> >>  #define rdmsr_safe(msr,val) ({\
> >> -    int _rc; \
> >> -    uint32_t lo, hi; \
> >> +    int rc_; \
> >> +    uint32_t lo_, hi_; \
> >>      __asm__ __volatile__( \
> >>          "1: rdmsr\n2:\n" \
> >>          ".section .fixup,\"ax\"\n" \
> >> @@ -49,15 +49,15 @@ static inline void wrmsrl(unsigned int m
> >>          "   movl %5,%2\n; jmp 2b\n" \
> >>          ".previous\n" \
> >>          _ASM_EXTABLE(1b, 3b) \
> >> -        : "=a" (lo), "=d" (hi), "=&r" (_rc) \
> >> +        : "=a" (lo_), "=d" (hi_), "=&r" (rc_) \
> >>          : "c" (msr), "2" (0), "i" (-EFAULT)); \
> >> -    val = lo | ((uint64_t)hi << 32); \
> >> -    _rc; })
> >> +    val = lo_ | ((uint64_t)hi_ << 32); \
> >> +    rc_; })
> > 
> > Since you are changing the local variable names, I would just switch
> > rdmsr_safe to a static inline, and drop the underlines. I don't see a
> > reason this has to stay as a macro.
> 
> Well, all callers would need to be changed to pass the address of
> the variable to store the value read into. That's quite a bit of
> code churn, and hence nothing I'd want to do in this patch.

Oh, right, didn't realize it's a macro for that reason.
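
For the record, the difference as I now understand it, in a standalone
sketch. fake_read() is a stand-in I invented so this compiles outside of
Xen, and the other two names are hypothetical as well:

```c
#include <stdint.h>

/* Stand-in for the real MSR read; returns 0 on success. */
static int fake_read(unsigned int msr, uint64_t *val)
{
    *val = msr + 1ULL;
    return 0;
}

/* Macro form: the caller names the destination variable directly. */
#define read_safe_macro(msr, val) fake_read(msr, &(val))

/* Inline form: every caller has to pass the address instead. */
static inline int read_safe_inline(unsigned int msr, uint64_t *val)
{
    return fake_read(msr, val);
}
```

So read_safe_macro(msr, v) would have to become
read_safe_inline(msr, &v) at every call site, which is the churn you
mention.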

Thanks, Roger.
