On Mon, Jan 10, 2022 at 08:52:55AM +0100, Jan Beulich wrote:
> On 07.01.2022 12:39, Jan Beulich wrote:
> > --- a/xen/arch/x86/time.c
> > +++ b/xen/arch/x86/time.c
> > @@ -378,8 +378,9 @@ static u64 read_hpet_count(void)
> >  
> >  static int64_t __init init_hpet(struct platform_timesource *pts)
> >  {
> > -    uint64_t hpet_rate, start;
> > +    uint64_t hpet_rate, start, expired;
> >      uint32_t count, target;
> > +unsigned int i;//temp
> >  
> >      if ( hpet_address && strcmp(opt_clocksource, pts->id) &&
> >           cpuidle_using_deep_cstate() )
> > @@ -415,16 +416,35 @@ static int64_t __init init_hpet(struct p
> >  
> >      pts->frequency = hpet_rate;
> >  
> > +for(i = 0; i < 16; ++i) {//temp
> >      count = hpet_read32(HPET_COUNTER);
> >      start = rdtsc_ordered();
> >      target = count + CALIBRATE_VALUE(hpet_rate);
> >      if ( target < count )
> >          while ( hpet_read32(HPET_COUNTER) >= count )
> >              continue;
> > -    while ( hpet_read32(HPET_COUNTER) < target )
> > +    while ( (count = hpet_read32(HPET_COUNTER)) < target )
> >          continue;
> 
> Unlike I first thought but matching my earlier reply, this only reduces
> the likelihood of encountering an issue. In particular, a long-duration
> event ahead of the final HPET read above would be covered, but ...
> 
> > -    return (rdtsc_ordered() - start) * CALIBRATE_FRAC;
> > +    expired = rdtsc_ordered() - start;
> 
> ... such an event occurring between the final HPET read and the TSC
> read would still be an issue. So far I've only been able to think of an
> ugly way to further reduce likelihood for this window, but besides that
> neither being neat nor excluding the possibility altogether, I have to
> point out that we have the same issue in a number of other places:
> Back-to-back reads of platform timer and TSC are assumed to happen
> close together elsewhere as well.

Right, sorry, I replied to the patch first without reading this.
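
FWIW, the usual trick to bound (though not close) the window between the
final HPET read and the TSC read is to repeat the back-to-back measurement
a few times and keep the trial with the smallest TSC delta: a trial
stretched by a long NMI/SMI is then simply discarded in favour of an
undisturbed one. A minimal userspace sketch with mocked counters
(hpet_read32_mock() and rdtsc_mock() are stand-ins for illustration only,
not the Xen functions):

```c
#include <stdint.h>
#include <assert.h>

/* Deterministic stand-ins for the real counters: one fake clock that
 * both "HPET" and "TSC" are derived from. */
static uint64_t fake_time;
static uint32_t hpet_read32_mock(void) { return (uint32_t)(fake_time += 3); }
static uint64_t rdtsc_mock(void)       { return fake_time * 100; }

#define CALIBRATE_TICKS 1000u
#define NR_TRIALS       4u

/* Take several back-to-back HPET/TSC measurements and keep the one
 * with the smallest TSC delta; a measurement interrupted by a long
 * NMI/SMI yields a larger delta and is discarded. */
static uint64_t calibrate_min(void)
{
    uint64_t best = UINT64_MAX;

    for ( unsigned int i = 0; i < NR_TRIALS; ++i )
    {
        uint32_t count = hpet_read32_mock();
        uint64_t start = rdtsc_mock();
        uint32_t target = count + CALIBRATE_TICKS;
        uint64_t expired;

        while ( hpet_read32_mock() < target )
            continue;

        expired = rdtsc_mock() - start;
        if ( expired < best )
            best = expired;
    }

    return best;
}
```

This still can't rule out the event hitting every trial, but it shrinks
the probability a lot compared to a single measurement.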

> Cc-ing other x86 maintainers to see whether they have any helpful
> thoughts ...

I'm not sure there's much we can do. We could maybe count NMIs and
retry if we detect that an NMI happened during calibration, but we
can't do the same for SMIs, as I don't think there's a way to get
that information on all the hardware we support: MSR_SMI_COUNT (0x34)
is Intel-only and requires Nehalem or later.
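
The retry scheme could look roughly like the following sketch;
read_smi_count() is a hypothetical stand-in for reading MSR_SMI_COUNT
(or a per-CPU NMI counter), and do_calibrate() is a mock that pretends
the first attempt was hit by an SMI, just to exercise the loop:

```c
#include <stdint.h>
#include <assert.h>

/* Hypothetical stand-in for MSR_SMI_COUNT (0x34) or an NMI counter;
 * here a mock event count. */
static unsigned int smi_events;
static unsigned int read_smi_count(void) { return smi_events; }

/* Mocked calibration measurement: pretend the first attempt is
 * disturbed by an SMI, later ones are clean. */
static uint64_t do_calibrate(unsigned int attempt)
{
    if ( attempt == 0 )
        ++smi_events;          /* simulate an SMI mid-measurement */
    return 12345;              /* dummy TSC delta */
}

/* Redo the measurement until the SMI/NMI count is unchanged across
 * it, with a bounded number of attempts. */
static uint64_t calibrate_retry(void)
{
    for ( unsigned int i = 0; i < 8; ++i )
    {
        unsigned int before = read_smi_count();
        uint64_t expired = do_calibrate(i);

        if ( read_smi_count() == before )
            return expired;    /* undisturbed: accept this trial */
    }

    return 0;                  /* give up; caller keeps the fallback */
}
```

Bounding the attempts keeps a pathological SMI storm from hanging
boot; on hardware without a usable counter the original single
measurement would remain the only option anyway.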

Thanks, Roger.
