On Thu, Mar 25, 2021 at 02:33:43PM -0400, Josh Rickmar wrote:
> On Thu, Mar 25, 2021 at 01:18:04PM -0500, Scott Cheloha wrote:
> > > On Mar 24, 2021, at 8:29 AM, Josh Rickmar <joshrick...@outlook.com> wrote:
> > >
> > > [...]
> >
> > Which diff did you apply?  Yasuoka provided two diffs.
> >
> > In any case, ignore this diff:
> >
> > > diff --git a/sys/arch/amd64/amd64/tsc.c b/sys/arch/amd64/amd64/tsc.c
> > > index 238a5a068e1..3b951a8b5a3 100644
> > > --- a/sys/arch/amd64/amd64/tsc.c
> > > +++ b/sys/arch/amd64/amd64/tsc.c
> > > @@ -212,7 +212,8 @@ cpu_recalibrate_tsc(struct timecounter *tc)
> > >  u_int
> > >  tsc_get_timecount(struct timecounter *tc)
> > >  {
> > > -	return rdtsc_lfence() + curcpu()->ci_tsc_skew;
> > > +	//return rdtsc_lfence() + curcpu()->ci_tsc_skew;
> > > +	return rdtsc_lfence();
> > >  }
> > >
> > >  void
> > >
> >
> > We don't want to discard the skews, that's wrong.
> >
> > The reason it "fixes" Yasuoka's problem is because the real skews
> > on the ESXi VMs in question are probably close to zero but our
> > synchronization algorithm is picking huge (wrong) skews due to
> > some other variable interfering with our measurement.
>
> I had both applied.  As I understood it, the first patch discarding
> the skews was a proposed fix and the second was only debug code to
> dump the skews.  (Unfortunately, I don't know a way to gather this
> information from my device (thinkpad E485), as I don't believe I can
> attach to the serial console to capture the ddb output.)

Given that userland TSC doesn't work on your machine I'm curious what
your skews look like as measured by Yasuoka's patch.

If you want to gather the data and you don't have serial console
access...

First enable kern.allowkmem to permit pstat(8) to extract the data
from kernel memory.  You'll need to edit /etc/rc.securelevel and
reboot.

Next, run this shell script to read the skews out of kernel memory
and write them out to a file, one file per CPU.  The script assumes
doas.conf(5) is written to allow passwordless root access.  If you
don't have this then you can edit the script to remove doas and just
run the script as root, though I do not recommend this.

Last, disable kern.allowkmem and reboot.

Here's the script:

---

#! /bin/sh

set -e

# Make sure this is equal to TSC_SYNC_NTIMES!
nsamples=1000

# Each sample is a signed 32-bit integer (4 bytes).
samplesz=4

# Find the starting address of the skew array in memory.
base=$(doas pstat -d lld tsc_difs | tr -d ':' | cut -d ' ' -f 3)

for cpu in $(jot $(sysctl -n hw.ncpu) 0); do
	# Ignore the BSP.
	if [ $cpu -eq 0 ]; then
		continue
	fi

	# Compute the start of the given CPU's samples in the array.
	start=$((base + cpu * nsamples * samplesz))
	end=$((start + nsamples * samplesz))
	i=$start

	# Print each sample to the file ./cpuN-skew
	out="./cpu$cpu-skew"
	rm -f $out
	while [ $i -lt $end ]; do
		doas pstat -d d $(printf "0x%x" $i) | cut -d ' ' -f 4 >> $out
		i=$((i + samplesz))
	done
done

---

My laptop has eight logical CPUs.
I get this:

$ sh script.sh
$ wc -l cpu*
    1000 cpu1-skew
    1000 cpu2-skew
    1000 cpu3-skew
    1000 cpu4-skew
    1000 cpu5-skew
    1000 cpu6-skew
    1000 cpu7-skew
    7000 total

ministat yields this:

x cpu1-skew
+ cpu2-skew
* cpu3-skew
% cpu4-skew
# cpu5-skew
@ cpu6-skew
O cpu7-skew
     N           Min           Max        Median           Avg        Stddev
x 1000           -12             7             0        -0.701       2.93568
+ 1000           -12             7             0        -0.476     2.8927136
* 1000           -12             3            -3        -3.251     1.8898599
% 1000           -20            15            -3        -4.235     6.0109822
# 1000           -23             9            -1        -1.482      3.630545
@ 1000           -18             8             1        -0.563     4.7706173
O 1000           -19             6            -6        -5.527     3.6647922

Feel free to share your raw data.

> Clock doesn't go backwards with only the second diff.

Good.
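
For anyone without ministat installed, a rough per-CPU summary of the
same files can be computed with awk.  This is only a sketch: it
assumes the cpuN-skew files written by the script above, one integer
per line, and the function name skew_summary is made up for
illustration.  It prints N, min, max and mean; it does not compute the
median or standard deviation.

```shell
#! /bin/sh

# skew_summary FILE: print "N min max mean" for the integers in FILE.
# Assumes one sample per line; blank lines are skipped.
skew_summary() {
	awk '
	NF == 0 { next }
	n == 0 { min = $1; max = $1 }
	{
		if ($1 < min) min = $1
		if ($1 > max) max = $1
		sum += $1
		n++
	}
	END {
		if (n == 0) {
			print "no samples"
			exit 1
		}
		printf "%d %d %d %.3f\n", n, min, max, sum / n
	}' "$1"
}

# Summarize every per-CPU file in the current directory, if any.
for f in ./cpu*-skew; do
	[ -f "$f" ] || continue
	printf "%s: " "$f"
	skew_summary "$f"
done
```

Run it from the directory holding the cpuN-skew files.  If no such
files exist it prints nothing.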