On Sun, Sep 04, 2022 at 02:50:10PM +1000, Jonathan Gray wrote:
> On Sat, Sep 03, 2022 at 05:33:01PM -0500, Scott Cheloha wrote:
> > On Sat, Sep 03, 2022 at 10:37:31PM +1000, Jonathan Gray wrote:
> > > On Sat, Sep 03, 2022 at 06:52:20AM -0500, Scott Cheloha wrote:
> > > > > On Sep 3, 2022, at 02:22, Jonathan Gray <j...@jsg.id.au> wrote:
> > > > >
> > > > > ???On Fri, Sep 02, 2022 at 06:00:25PM -0500, Scott Cheloha wrote:
> > > > >> dv@ suggested coming to the list to request testing for the 
> > > > >> pvclock(4)
> > > > >> driver.  Attached is a patch that corrects several bugs.  Most of
> > > > >> these changes will only matter in the non-TSC_STABLE case on a
> > > > >> multiprocessor VM.
> > > > >>
> > > > >> Ideally, nothing should break.
> > > > >>
> > > > >> - pvclock yields a 64-bit value.  The BSD timecounter layer can only
> > > > >>  use the lower 32 bits, but internally we need to track the full
> > > > >>  64-bit value to allow comparisons with the full value in the
> > > > >>  non-TSC_STABLE case.  So make pvclock_lastcount a 64-bit quantity.
> > > > >>
> > > > >> - In pvclock_get_timecount(), move rdtsc() up into the lockless read
> > > > >>  loop to get a more accurate timestamp.
> > > > >>
> > > > >> - In pvclock_get_timecount(), use rdtsc_lfence(), not rdtsc().
> > > > >>
> > > > >> - In pvclock_get_timecount(), check that our TSC value doesn't 
> > > > >> predate
> > > > >>  ti->ti_tsc_timestamp, otherwise we will produce an enormous value.
> > > > >>
> > > > >> - In pvclock_get_timecount(), update pvclock_lastcount in the
> > > > >>  non-TSC_STABLE case with more care.  On amd64 we can do this with an
> > > > >>  atomic_cas_ulong(9) loop because u_long is 64 bits.  On i386 we need
> > > > >>  to introduce a mutex to protect our comparison and read/write.
> > > > >
> > > > > i386 has cmpxchg8b, no need to disable interrupts
> > > > > the ifdefs seem excessive
> > > >
> > > > How do I make use of CMPXCHG8B on i386
> > > > in this context?
> > > >
> > > > atomic_cas_ulong(9) is a 32-bit CAS on
> > > > i386.
> > >
> > > static inline uint64_t
> > > atomic_cas_64(volatile uint64_t *p, uint64_t o, uint64_t n)
> > > {
> > >   return __sync_val_compare_and_swap(p, o, n);
> > > }
> > >
> > > Or md atomic.h files could have an equivalent.
> > > Not possible on all 32-bit archs.
> > >
> > > >
> > > > We can't use FP registers in the kernel, no?
> > >
> > > What do FP registers have to do with it?
> > >
> > > >
> > > > Am I missing some other avenue?
> > >
> > > There is no rdtsc_lfence() on i386.  Initial diff doesn't build.
> >
> > LFENCE is an SSE2 extension.  As is MFENCE.  I don't think I can just
> > drop rdtsc_lfence() into cpufunc.h and proceed without causing some
> > kind of fault on an older CPU.
> >
> > What are my options on a 586-class CPU for forcing RDTSC to complete
> > before later instructions?
>
> "3.3.2. Serializing Operations
> After executing certain instructions the Pentium processor serializes
> instruction execution. This means that any modifications to flags,
> registers, and memory for previous instructions are completed before
> the next instruction is fetched and executed. The prefetch queue
> is flushed as a result of serializing operations.
>
> The Pentium processor serializes instruction execution after executing
> one of the following instructions: Move to Special Register (except
> CRO), INVD, INVLPG, IRET, IRETD, LGDT, LLDT, LIDT, LTR, WBINVD,
> CPUID, RSM and WRMSR."
>
> from:
> Pentium Processor User's Manual
> Volume 1: Pentium Processor Data Book
> Order Number 241428
>
> http://bitsavers.org/components/intel/pentium/1993_Intel_Pentium_Processor_Users_Manual_Volume_1.pdf
>
> So it could be rdtsc ; cpuid.
> lfence; rdtsc should still be preferred.
>
> It could be tested during boot and set a function pointer.
> Or the codepatch bits could be used.
>
> In the specific case of pvclock, can it be assumed that the host
> has hardware virt and would then have lfence?
>

I think this is a fair assumption.

-ml

Reply via email to