CPU freq calibration accuracy

Roger Pau Monné Tue, 18 Jan 2022 04:45:44 -0800

On Thu, Jan 13, 2022 at 02:41:31PM +0100, Jan Beulich wrote:
> Calibration logic assumes that the platform timer (HPET or ACPI PM
> timer) and the TSC are read at about the same time. This assumption may
> not hold when a long latency event (e.g. SMI or NMI) occurs between the
> two reads. Reduce the risk of reading uncorrelated values by doing at
> least four pairs of reads, using the tuple where the delta between the
> enclosing TSC reads was smallest. From the fourth iteration onwards bail
> if the new TSC delta isn't better (smaller) than the best earlier one.


Do you have some measurements as to whether this makes the frequency
detection any more accurate?

> Signed-off-by: Jan Beulich <[email protected]>
> ---
> Obviously (I think) instead of having both read_{hpet,pmtmr}_and_tsc()
> the calibration logic could be folded between HPET and PMTMR, at the

As an intermediate solution you could have a read_counter_and_tsc I
would think?

> expense of a couple more indirect calls (which may not be that much of a
> problem as this is all boot-time only). Really such folding would have
> been possible already before, just that the amount of duplicate code
> hasn't been this large so far. IOW if so desired, then perhaps the
> folding should be a separate prereq patch.

You could even pass a platform_timesource as a parameter to that
generic read counter and TSC helper.

> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -392,9 +392,36 @@ static u64 read_hpet_count(void)
>      return hpet_read32(HPET_COUNTER);
>  }
>  
> +static uint32_t __init read_hpet_and_tsc(uint64_t *tsc)
> +{
> +    uint64_t tsc_prev = *tsc = rdtsc_ordered(), tsc_min = ~0;
> +    uint32_t best = best;
> +    unsigned int i;
> +
> +    for ( i = 0; ; ++i )
> +    {
> +        uint32_t hpet = hpet_read32(HPET_COUNTER);
> +        uint64_t tsc_cur = rdtsc_ordered();
> +        uint64_t tsc_delta = tsc_cur - tsc_prev;
> +
> +        if ( tsc_delta < tsc_min )
> +        {
> +            tsc_min = tsc_delta;
> +            *tsc = tsc_cur;
> +            best = hpet;
> +        }
> +        else if ( i > 2 )
> +            break;
> +
> +        tsc_prev = tsc_cur;

Would it be better to set tsc_prev to the current TSC value here in
order to minimize the window you are measuring? Or else tsc_delta will
also account for the time in the loop code, and there's more
likeliness from being interrupted?

I guess being interrupted four times so that all loops are bad is very
unlikely.

Since this loop could in theory run multiple times, do we need to
check that the counter doesn't overflow? Or else the elapsed counter
value would be miscalculated.

Thanks, Roger.

Re: [PATCH] x86/time: further improve TSC / CPU freq calibration accuracy

Reply via email to