On Sat, Sep 03, 2022 at 05:33:01PM -0500, Scott Cheloha wrote:
> On Sat, Sep 03, 2022 at 10:37:31PM +1000, Jonathan Gray wrote:
> > On Sat, Sep 03, 2022 at 06:52:20AM -0500, Scott Cheloha wrote:
> > > > On Sep 3, 2022, at 02:22, Jonathan Gray <j...@jsg.id.au> wrote:
> > > >
> > > > On Fri, Sep 02, 2022 at 06:00:25PM -0500, Scott Cheloha wrote:
> > > >> dv@ suggested coming to the list to request testing for the pvclock(4)
> > > >> driver.  Attached is a patch that corrects several bugs.  Most of
> > > >> these changes will only matter in the non-TSC_STABLE case on a
> > > >> multiprocessor VM.
> > > >>
> > > >> Ideally, nothing should break.
> > > >>
> > > >> - pvclock yields a 64-bit value.  The BSD timecounter layer can only
> > > >>   use the lower 32 bits, but internally we need to track the full
> > > >>   64-bit value to allow comparisons with the full value in the
> > > >>   non-TSC_STABLE case.  So make pvclock_lastcount a 64-bit quantity.
> > > >>
> > > >> - In pvclock_get_timecount(), move rdtsc() up into the lockless read
> > > >>   loop to get a more accurate timestamp.
> > > >>
> > > >> - In pvclock_get_timecount(), use rdtsc_lfence(), not rdtsc().
> > > >>
> > > >> - In pvclock_get_timecount(), check that our TSC value doesn't predate
> > > >>   ti->ti_tsc_timestamp, otherwise we will produce an enormous value.
> > > >>
> > > >> - In pvclock_get_timecount(), update pvclock_lastcount in the
> > > >>   non-TSC_STABLE case with more care.  On amd64 we can do this with an
> > > >>   atomic_cas_ulong(9) loop because u_long is 64 bits.  On i386 we need
> > > >>   to introduce a mutex to protect our comparison and read/write.
> > > >
> > > > i386 has cmpxchg8b, no need to disable interrupts
> > > > the ifdefs seem excessive
> > >
> > > How do I make use of CMPXCHG8B on i386
> > > in this context?
> > >
> > > atomic_cas_ulong(9) is a 32-bit CAS on
> > > i386.
> >
> > static inline uint64_t
> > atomic_cas_64(volatile uint64_t *p, uint64_t o, uint64_t n)
> > {
> >         return __sync_val_compare_and_swap(p, o, n);
> > }
> >
> > Or md atomic.h files could have an equivalent.
> > Not possible on all 32-bit archs.
> >
> > >
> > > We can't use FP registers in the kernel, no?
> >
> > What do FP registers have to do with it?
> >
> > >
> > > Am I missing some other avenue?
> >
> > There is no rdtsc_lfence() on i386.  Initial diff doesn't build.
>
> LFENCE is an SSE2 extension.  As is MFENCE.  I don't think I can just
> drop rdtsc_lfence() into cpufunc.h and proceed without causing some
> kind of fault on an older CPU.
>
> What are my options on a 586-class CPU for forcing RDTSC to complete
> before later instructions?

"3.3.2. Serializing Operations

After executing certain instructions the Pentium processor serializes
instruction execution.  This means that any modifications to flags,
registers, and memory for previous instructions are completed before
the next instruction is fetched and executed.  The prefetch queue
is flushed as a result of serializing operations.

The Pentium processor serializes instruction execution after executing
one of the following instructions: Move to Special Register (except
CR0), INVD, INVLPG, IRET, IRETD, LGDT, LLDT, LIDT, LTR, WBINVD,
CPUID, RSM and WRMSR."

from:
Pentium Processor User's Manual
Volume 1: Pentium Processor Data Book
Order Number 241428
http://bitsavers.org/components/intel/pentium/1993_Intel_Pentium_Processor_Users_Manual_Volume_1.pdf

So it could be rdtsc ; cpuid.
lfence; rdtsc should still be preferred.

It could be tested during boot and set a function pointer.
Or the codepatch bits could be used.

In the specific case of pvclock, can it be assumed that the host has
hardware virt and would then have lfence?
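
A rough sketch of the function pointer variant, to make the idea
concrete.  This is userland C for GCC/Clang on x86, not kernel code.
The names tsc_read, tsc_select, tsc_read_lfence and tsc_read_cpuid
are made up for illustration and are not existing interfaces.  It
assumes SSE2 presence (CPUID leaf 1, EDX bit 26) is enough to decide
that LFENCE can be executed; in the kernel the check would use the
cpu feature flags already gathered at boot rather than <cpuid.h>.

```c
#include <stdint.h>
#include <cpuid.h>	/* GCC/Clang CPUID helper; stands in for kernel cpu flags */

#define CPUID_EDX_SSE2	(1u << 26)	/* CPUID.01H:EDX bit 26 = SSE2 */

/* Preferred path: LFENCE first, so RDTSC does not run ahead of earlier loads. */
static uint64_t
tsc_read_lfence(void)
{
	uint32_t lo, hi;

	__asm__ volatile("lfence; rdtsc" : "=a" (lo), "=d" (hi) : : "memory");
	return ((uint64_t)hi << 32) | lo;
}

/* Fallback for CPUs without SSE2: RDTSC, then CPUID as the serializing op. */
static uint64_t
tsc_read_cpuid(void)
{
	uint32_t lo, hi;
	uint32_t a = 0, b, c = 0, d;

	__asm__ volatile("rdtsc" : "=a" (lo), "=d" (hi));
	/* CPUID overwrites eax/ebx/ecx/edx; lo and hi were captured above. */
	__asm__ volatile("cpuid"
	    : "+a" (a), "=b" (b), "+c" (c), "=d" (d) : : "memory");
	return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical hook: defaults to the variant that works on any TSC-capable CPU. */
static uint64_t (*tsc_read)(void) = tsc_read_cpuid;

/* Call once during startup to pick the LFENCE variant when SSE2 is present. */
static void
tsc_select(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) &&
	    (edx & CPUID_EDX_SSE2))
		tsc_read = tsc_read_lfence;
}
```

Callers would run tsc_select() once and then use tsc_read()
everywhere.  The indirect call costs a little on every read, which
is the argument for the codepatch approach instead.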