On Thu, Mar 25, 2021 at 02:33:43PM -0400, Josh Rickmar wrote:
> On Thu, Mar 25, 2021 at 01:18:04PM -0500, Scott Cheloha wrote:
> > > On Mar 24, 2021, at 8:29 AM, Josh Rickmar <joshrick...@outlook.com> wrote:
> > > 
> > > [...]
> > 
> > Which diff did you apply?  Yasuoka provided two diffs.
> > 
> > In any case, ignore this diff:
> > 
> > > diff --git a/sys/arch/amd64/amd64/tsc.c b/sys/arch/amd64/amd64/tsc.c
> > > index 238a5a068e1..3b951a8b5a3 100644
> > > --- a/sys/arch/amd64/amd64/tsc.c
> > > +++ b/sys/arch/amd64/amd64/tsc.c
> > > @@ -212,7 +212,8 @@ cpu_recalibrate_tsc(struct timecounter *tc)
> > > u_int
> > > tsc_get_timecount(struct timecounter *tc)
> > > {
> > > - return rdtsc_lfence() + curcpu()->ci_tsc_skew;
> > > + //return rdtsc_lfence() + curcpu()->ci_tsc_skew;
> > > + return rdtsc_lfence();
> > > }
> > > 
> > > void
> > 
> > 
> > We don't want to discard the skews, that's wrong.
> > 
> > The reason it "fixes" Yasuoka's problem is because the real skews
> > on the ESXi VMs in question are probably close to zero but our
> > synchronization algorithm is picking huge (wrong) skews due to
> > some other variable interfering with our measurement.
> 
> I had both applied.  As I understood it, the first patch discarding
> the skews was a proposed fix and the second was only debug code to
> dump the skews.  (Unfortunately, I don't know a way to gather this
> information from my device (thinkpad E485), as I don't believe I can
> attach to the serial console to capture the ddb output.)

Given that userland TSC doesn't work on your machine I'm curious what
your skews look like as measured by Yasuoka's patch.  If you want to
gather the data and you don't have serial console access...

First enable kern.allowkmem to permit pstat(8) to extract the data
from kernel memory.  You'll need to edit /etc/rc.securelevel and
reboot.
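
For example, a minimal sketch of that edit, assuming the stock rc(8)
behavior of running /etc/rc.securelevel before the securelevel is
raised:

```shell
# /etc/rc.securelevel
# Temporarily allow /dev/kmem access so pstat(8) can read kernel memory.
# Remove this line again once the data has been collected.
sysctl kern.allowkmem=1
```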

Next, run this shell script to read the skews out of kernel memory and
write them out to a file, one file per CPU.  The script assumes
doas.conf(5) is written to allow passwordless root access.  If you
don't have this then you can edit the script to remove doas and just
run the script as root, though I do not recommend this.

Last, disable kern.allowkmem and reboot.

Here's the script:

---

#! /bin/sh

set -e

# Make sure this is equal to TSC_SYNC_NTIMES!
nsamples=1000

# Each sample is a signed 32-bit integer (4 bytes).
samplesz=4

# Find the starting address of the skew array in memory.
base=$(doas pstat -d lld tsc_difs | tr -d ':' | cut -d ' ' -f 3)

for cpu in $(jot $(sysctl -n hw.ncpu) 0); do
        # Ignore the BSP.
        if [ $cpu -eq 0 ]; then
                continue
        fi
        # Compute the start of the given CPU's samples in the array.
        start=$((base + cpu * nsamples * samplesz))
        end=$((start + nsamples * samplesz))
        i=$start
        # Print each sample to the file ./cpuN-skew
        out="./cpu$cpu-skew"
        rm -f "$out"
        while [ $i -lt $end ]; do
                doas pstat -d d $(printf "0x%x" $i) | cut -d ' ' -f 4 >> "$out"
                i=$((i + samplesz))
        done
done

---

My laptop has eight logical CPUs.  I get this:

$ sh script.sh
$ wc -l cpu*
    1000 cpu1-skew
    1000 cpu2-skew
    1000 cpu3-skew
    1000 cpu4-skew
    1000 cpu5-skew
    1000 cpu6-skew
    1000 cpu7-skew
    7000 total

ministat yields this:

x cpu1-skew
+ cpu2-skew
* cpu3-skew
% cpu4-skew
# cpu5-skew
@ cpu6-skew
O cpu7-skew
    N           Min           Max        Median           Avg        Stddev
x 1000           -12             7             0        -0.701       2.93568
+ 1000           -12             7             0        -0.476     2.8927136
* 1000           -12             3            -3        -3.251     1.8898599
% 1000           -20            15            -3        -4.235     6.0109822
# 1000           -23             9            -1        -1.482      3.630545
@ 1000           -18             8             1        -0.563     4.7706173
O 1000           -19             6            -6        -5.527     3.6647922
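
If ministat isn't handy, a rough per-file summary can be had with
awk(1).  This is only a sketch: skew_summary is a hypothetical helper
name, and it assumes each cpuN-skew file holds one integer per line,
as written by the script above.  It prints the sample count, minimum,
maximum, and mean, but no median or standard deviation.

```shell
#! /bin/sh

# Hypothetical helper: print count, min, max, and mean for a file
# containing one integer sample per line.
skew_summary() {
        awk '
        NF == 0 { next }                # skip blank lines
        { v = $1 + 0 }                  # force numeric comparison
        n == 0 { min = v; max = v }     # first sample seeds min/max
        { n++; sum += v }
        v < min { min = v }
        v > max { max = v }
        END {
                if (n > 0)
                        printf "%d %d %d %.3f\n", n, min, max, sum / n
        }' "$1"
}

# Summarize every skew file in the current directory, if any exist.
for f in ./cpu*-skew; do
        [ -f "$f" ] || continue
        printf '%s: ' "$f"
        skew_summary "$f"
done
```

Run it from the directory holding the cpuN-skew files.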

Feel free to share your raw data.

> Clock doesn't go backwards with only the second diff.

Good.
