On Sun, Aug 14, 2022 at 11:24:37PM -0500, Scott Cheloha wrote:
> 
> In the future when the LAPIC timer is run in oneshot mode there will
> be no lapic_delay().
> 
> [...]
> 
> This is *very* bad for older amd64 machines, because you are left with
> i8254_delay().
> 
> I would like to offer a less awful delay(9) implementation for this
> class of hardware.  Otherwise we may trip over bizarre phantom bugs on
> MP kernels because only one CPU can read the i8254 at a time.
> 
> [...]
> 
> Real i386 hardware should be fine.  Later models with an ACPI PM timer
> will be fine using acpitimer_delay() instead of i8254_delay().
> 
> [...]
> 
> Here are the sample measurements from my 2017 laptop (kaby lake
> refresh) running the attached patch.  It takes longer than a
> microsecond to read either of the ACPI timers.  The PM timer is better
> than the HPET.  The HPET is a bit better than the i8254.  I hope the
> numbers are a little better on older hardware.
> 
> acpitimer_test_delay:  expected  0.000001000  actual  0.000010638  error  0.000009638
> acpitimer_test_delay:  expected  0.000010000  actual  0.000015464  error  0.000005464
> acpitimer_test_delay:  expected  0.000100000  actual  0.000107619  error  0.000007619
> acpitimer_test_delay:  expected  0.001000000  actual  0.001007275  error  0.000007275
> acpitimer_test_delay:  expected  0.010000000  actual  0.010007891  error  0.000007891
> 
> acpihpet_test_delay:   expected  0.000001000  actual  0.000022208  error  0.000021208
> acpihpet_test_delay:   expected  0.000010000  actual  0.000031690  error  0.000021690
> acpihpet_test_delay:   expected  0.000100000  actual  0.000112647  error  0.000012647
> acpihpet_test_delay:   expected  0.001000000  actual  0.001021480  error  0.000021480
> acpihpet_test_delay:   expected  0.010000000  actual  0.010013736  error  0.000013736
> 
> i8254_test_delay:      expected  0.000001000  actual  0.000040110  error  0.000039110
> i8254_test_delay:      expected  0.000010000  actual  0.000039471  error  0.000029471
> i8254_test_delay:      expected  0.000100000  actual  0.000128031  error  0.000028031
> i8254_test_delay:      expected  0.001000000  actual  0.001024586  error  0.000024586
> i8254_test_delay:      expected  0.010000000  actual  0.010021859  error  0.000021859

Attached is an updated patch.  I left the test measurement code in
place because I would like to see a test on a real i386 machine, just
to make sure it works as expected.  I can't imagine why it wouldn't
work, but we should never assume anything.

Changes from v1:

- Actually set delay_func from acpitimerattach() and
  acpihpet_attach().

  I think it's safe to assume, on real hardware, that the ACPI PMT is
  preferable to the i8254 and the HPET is preferable to both of them.

  This is not *always* true, but it is true on the older machines that
  can't use tsc_delay(), so the assumption works in practice.

  Outside of those three timers, the hierarchy gets murky.  There are
  other timers that are better than the HPET, but they aren't always
  available.  If those timers are already providing delay_func this
  code does not usurp them.

- Duplicate test measurement code from amd64/lapic.c into i386/lapic.c.
  Will be removed in the committed version.

- Use bus_space_read_8() in acpihpet.c if it's available.  The HPET is
  a 64-bit counter and the spec permits 32-bit or 64-bit aligned access.

  As one might predict, this cuts the overhead in half because we're
  doing half as many reads.

  This part can go into a separate commit, but I thought it was neat
  so I'm including it here.
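
To make the "half as many reads" point concrete, here is a small
userland sketch.  It is a simulation only: fake_counter, fake_read_4()
and fake_read_8() are hypothetical stand-ins for the HPET main counter
and the bus_space accessors, and nothing here touches real hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the HPET main counter register. */
static uint64_t fake_counter = 0x00000001fffffff0ULL;
static unsigned int fake_nreads;	/* number of simulated bus accesses */

/* Simulated 32-bit read at byte offset 0 (low half) or 4 (high half). */
static uint32_t
fake_read_4(unsigned int off)
{
	assert(off == 0 || off == 4);
	fake_nreads++;
	return (uint32_t)(fake_counter >> (off * 8));
}

/* Simulated single 64-bit read. */
static uint64_t
fake_read_8(void)
{
	fake_nreads++;
	return fake_counter;
}

/* Same shape as the fallback path in acpihpet_r(): high half, then low. */
static uint64_t
read_split(void)
{
	uint64_t val;

	val = fake_read_4(4);
	val <<= 32;
	val |= fake_read_4(0);
	return val;
}
```

Calling read_split() bumps fake_nreads by two, while fake_read_8()
returns the same value and bumps it by one.  The simulation ignores the
fact that a real counter keeps ticking between the two halves of a
split read.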

One remaining question I have:

Is there a nice way to test whether ACPI PMT support is compiled into
the kernel?  We can assume the existence of i8254_delay() because
clock.c is required on amd64 and i386.  However, acpitimer.c is
optional, so acpitimer_delay() isn't necessarily there.

I would rather not introduce a hard requirement on acpitimer.c into
acpihpet.c if there's an easy way to check whether it is present.

Any ideas?
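
One direction that might sidestep the question, as a rough sketch only
and not something this patch does: have each driver register its delay
function together with a rank, so no driver needs to name another
driver's function.  Every identifier below (the sketch_* names and the
rank numbers) is hypothetical and exists only for illustration.

```c
#include <stddef.h>

typedef void (*delay_fn)(int);

/* Hypothetical placeholders for the three delay implementations. */
static void sketch_i8254_delay(int usecs) { (void)usecs; }
static void sketch_pmt_delay(int usecs) { (void)usecs; }
static void sketch_hpet_delay(int usecs) { (void)usecs; }

/* The i8254 is always present, so it starts as the default at rank 0. */
static delay_fn sketch_delay_func = sketch_i8254_delay;
static int sketch_delay_quality = 0;

/*
 * Install fn only if it ranks strictly higher than whatever is
 * currently installed.  A better timer that registered earlier is
 * never usurped, and the caller never compares against another
 * driver's function pointer.
 */
static void
sketch_delay_register(delay_fn fn, int quality)
{
	if (quality > sketch_delay_quality) {
		sketch_delay_func = fn;
		sketch_delay_quality = quality;
	}
}
```

With made-up ranks of 1000 for the PMT and 2000 for the HPET, the HPET
wins regardless of attach order, and acpihpet.c would not need an
extern declaration for acpitimer_delay() at all.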

Anyone have i386 hardware results?  If I'm reading the timeline right,
most P6 machines and beyond (NetBurst, etc) will have an ACPI PMT.  I
don't know if any real x86 motherboards shipped with an HPET, but it's
possible.

Here are my updated results with the bus_space_read_8 change:

acpitimer_test_delay:  expected  0.000001000  actual  0.000010607  error  0.000009607
acpitimer_test_delay:  expected  0.000010000  actual  0.000015491  error  0.000005491
acpitimer_test_delay:  expected  0.000100000  actual  0.000107734  error  0.000007734
acpitimer_test_delay:  expected  0.001000000  actual  0.001008006  error  0.000008006
acpitimer_test_delay:  expected  0.010000000  actual  0.010007042  error  0.000007042

acpihpet_test_delay:   expected  0.000001000  actual  0.000013282  error  0.000012282
acpihpet_test_delay:   expected  0.000010000  actual  0.000022743  error  0.000012743
acpihpet_test_delay:   expected  0.000100000  actual  0.000109826  error  0.000009826
acpihpet_test_delay:   expected  0.001000000  actual  0.001012149  error  0.000012149
acpihpet_test_delay:   expected  0.010000000  actual  0.010010841  error  0.000010841

i8254_test_delay:      expected  0.000001000  actual  0.000039767  error  0.000038767
i8254_test_delay:      expected  0.000010000  actual  0.000039490  error  0.000029490
i8254_test_delay:      expected  0.000100000  actual  0.000127800  error  0.000027800
i8254_test_delay:      expected  0.001000000  actual  0.001023940  error  0.000023940
i8254_test_delay:      expected  0.010000000  actual  0.010032127  error  0.000032127

And the patch:

Index: arch/amd64/amd64/lapic.c
===================================================================
RCS file: /cvs/src/sys/arch/amd64/amd64/lapic.c,v
retrieving revision 1.60
diff -u -p -r1.60 lapic.c
--- arch/amd64/amd64/lapic.c    15 Aug 2022 04:17:50 -0000      1.60
+++ arch/amd64/amd64/lapic.c    16 Aug 2022 16:09:56 -0000
@@ -466,6 +466,19 @@ lapic_initclocks(void)
        lapic_startclock();
 
        i8254_inittimecounter_simple();
+
+       extern void acpitimer_test_delay(int);
+       extern void acpihpet_test_delay(int);
+       extern void i8254_test_delay(int);
+       int usec[] = { 1, 10, 100, 1000, 10000 };
+       size_t i;
+       delay(20000);   /* wait for real timecounter to activate */
+       for (i = 0; i < nitems(usec); i++)
+               acpitimer_test_delay(usec[i]);
+       for (i = 0; i < nitems(usec); i++)
+               acpihpet_test_delay(usec[i]);
+       for (i = 0; i < nitems(usec); i++)
+               i8254_test_delay(usec[i]);
 }
 
 
Index: arch/amd64/isa/clock.c
===================================================================
RCS file: /cvs/src/sys/arch/amd64/isa/clock.c,v
retrieving revision 1.36
diff -u -p -r1.36 clock.c
--- arch/amd64/isa/clock.c      13 Feb 2022 19:15:09 -0000      1.36
+++ arch/amd64/isa/clock.c      16 Aug 2022 16:09:56 -0000
@@ -266,6 +266,22 @@ i8254_delay(int n)
 }
 
 void
+i8254_test_delay(int usecs)
+{
+       struct timespec ac, er, ex, t0, t1;
+
+       nanouptime(&t0);
+       i8254_delay(usecs);
+       nanouptime(&t1);
+       timespecsub(&t1, &t0, &ac);
+       NSEC_TO_TIMESPEC(usecs * 1000ULL, &ex);
+       timespecsub(&ac, &ex, &er);
+       printf("%s: expected %lld.%09ld actual %lld.%09ld error %lld.%09ld\n",
+           __func__, ex.tv_sec, ex.tv_nsec, ac.tv_sec, ac.tv_nsec,
+           er.tv_sec, er.tv_nsec);
+}
+
+void
 rtcdrain(void *v)
 {
        struct timeout *to = (struct timeout *)v;
Index: arch/i386/i386/lapic.c
===================================================================
RCS file: /cvs/src/sys/arch/i386/i386/lapic.c,v
retrieving revision 1.49
diff -u -p -r1.49 lapic.c
--- arch/i386/i386/lapic.c      15 Aug 2022 04:17:50 -0000      1.49
+++ arch/i386/i386/lapic.c      16 Aug 2022 16:09:56 -0000
@@ -281,6 +281,19 @@ lapic_initclocks(void)
        lapic_startclock();
 
        i8254_inittimecounter_simple();
+
+       extern void acpitimer_test_delay(int);
+       extern void acpihpet_test_delay(int);
+       extern void i8254_test_delay(int);
+       int usec[] = { 1, 10, 100, 1000, 10000 };
+       size_t i;
+       delay(20000);   /* wait for real timecounter to activate */
+       for (i = 0; i < nitems(usec); i++)
+               acpitimer_test_delay(usec[i]);
+       for (i = 0; i < nitems(usec); i++)
+               acpihpet_test_delay(usec[i]);
+       for (i = 0; i < nitems(usec); i++)
+               i8254_test_delay(usec[i]);
 }
 
 extern int gettick(void);      /* XXX put in header file */
Index: arch/i386/isa/clock.c
===================================================================
RCS file: /cvs/src/sys/arch/i386/isa/clock.c,v
retrieving revision 1.60
diff -u -p -r1.60 clock.c
--- arch/i386/isa/clock.c       23 Feb 2021 04:44:30 -0000      1.60
+++ arch/i386/isa/clock.c       16 Aug 2022 16:09:56 -0000
@@ -375,6 +375,22 @@ i8254_delay(int n)
        }
 }
 
+void
+i8254_test_delay(int usecs)
+{
+       struct timespec ac, er, ex, t0, t1;
+
+       nanouptime(&t0);
+       i8254_delay(usecs);
+       nanouptime(&t1);
+       timespecsub(&t1, &t0, &ac);
+       NSEC_TO_TIMESPEC(usecs * 1000ULL, &ex);
+       timespecsub(&ac, &ex, &er);
+       printf("%s: expected %lld.%09ld actual %lld.%09ld error %lld.%09ld\n",
+           __func__, ex.tv_sec, ex.tv_nsec, ac.tv_sec, ac.tv_nsec,
+           er.tv_sec, er.tv_nsec);
+}
+
 int
 calibrate_cyclecounter_ctr(void)
 {
Index: dev/acpi/acpitimer.c
===================================================================
RCS file: /cvs/src/sys/dev/acpi/acpitimer.c,v
retrieving revision 1.15
diff -u -p -r1.15 acpitimer.c
--- dev/acpi/acpitimer.c        6 Apr 2022 18:59:27 -0000       1.15
+++ dev/acpi/acpitimer.c        16 Aug 2022 16:09:56 -0000
@@ -18,6 +18,7 @@
 #include <sys/param.h>
 #include <sys/systm.h>
 #include <sys/device.h>
+#include <sys/stdint.h>
 #include <sys/timetc.h>
 
 #include <machine/bus.h>
@@ -25,10 +26,13 @@
 #include <dev/acpi/acpireg.h>
 #include <dev/acpi/acpivar.h>
 
+struct acpitimer_softc;
+
 int acpitimermatch(struct device *, void *, void *);
 void acpitimerattach(struct device *, struct device *, void *);
-
+void acpitimer_delay(int);
 u_int acpi_get_timecount(struct timecounter *tc);
+uint32_t acpitimer_read(struct acpitimer_softc *);
 
 static struct timecounter acpi_timecounter = {
        .tc_get_timecount = acpi_get_timecount,
@@ -56,6 +60,8 @@ struct cfdriver acpitimer_cd = {
        NULL, "acpitimer", DV_DULL
 };
 
+int acpitimer_attached;
+
 int
 acpitimermatch(struct device *parent, void *match, void *aux)
 {
@@ -98,18 +104,46 @@ acpitimerattach(struct device *parent, s
        acpi_timecounter.tc_priv = sc;
        acpi_timecounter.tc_name = sc->sc_dev.dv_xname;
        tc_init(&acpi_timecounter);
+
+#if defined(__amd64__) || defined(__i386__)
+       if (delay_func == i8254_delay)
+               delay_func = acpitimer_delay;
+#endif
 #if defined(__amd64__)
        extern void cpu_recalibrate_tsc(struct timecounter *);
        cpu_recalibrate_tsc(&acpi_timecounter);
 #endif
+       acpitimer_attached = 1;
 }
 
+void
+acpitimer_delay(int usecs)
+{
+       uint64_t count = 0, cycles;
+       struct acpitimer_softc *sc = acpi_timecounter.tc_priv;
+       uint32_t mask = acpi_timecounter.tc_counter_mask;
+       uint32_t val1, val2;
+
+       val2 = acpitimer_read(sc);
+       cycles = usecs * acpi_timecounter.tc_frequency / 1000000;
+       while (count < cycles) {
+               CPU_BUSY_CYCLE();
+               val1 = val2;
+               val2 = acpitimer_read(sc);
+               count += (val2 - val1) & mask;
+       }
+}
 
 u_int
 acpi_get_timecount(struct timecounter *tc)
 {
-       struct acpitimer_softc *sc = tc->tc_priv;
-       u_int u1, u2, u3;
+       return acpitimer_read(tc->tc_priv);
+}
+
+uint32_t
+acpitimer_read(struct acpitimer_softc *sc)
+{
+       uint32_t u1, u2, u3;
 
        u2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, 0);
        u3 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, 0);
@@ -120,4 +154,25 @@ acpi_get_timecount(struct timecounter *t
        } while (u1 > u2 || u2 > u3);
 
        return (u2);
+}
+
+void
+acpitimer_test_delay(int usecs)
+{
+       struct timespec ac, er, ex, t0, t1;
+
+       if (!acpitimer_attached) {
+               printf("%s: (no pmt attached)\n", __func__);
+               return;
+       }
+
+       nanouptime(&t0);
+       acpitimer_delay(usecs);
+       nanouptime(&t1);
+       timespecsub(&t1, &t0, &ac);
+       NSEC_TO_TIMESPEC(usecs * 1000ULL, &ex);
+       timespecsub(&ac, &ex, &er);
+       printf("%s: expected %lld.%09ld actual %lld.%09ld error %lld.%09ld\n",
+           __func__, ex.tv_sec, ex.tv_nsec, ac.tv_sec, ac.tv_nsec,
+           er.tv_sec, er.tv_nsec);
 }
Index: dev/acpi/acpihpet.c
===================================================================
RCS file: /cvs/src/sys/dev/acpi/acpihpet.c,v
retrieving revision 1.26
diff -u -p -r1.26 acpihpet.c
--- dev/acpi/acpihpet.c 6 Apr 2022 18:59:27 -0000       1.26
+++ dev/acpi/acpihpet.c 16 Aug 2022 16:09:56 -0000
@@ -18,6 +18,7 @@
 #include <sys/param.h>
 #include <sys/systm.h>
 #include <sys/device.h>
+#include <sys/stdint.h>
 #include <sys/timetc.h>
 
 #include <machine/bus.h>
@@ -31,7 +32,7 @@ int acpihpet_attached;
 int acpihpet_match(struct device *, void *, void *);
 void acpihpet_attach(struct device *, struct device *, void *);
 int acpihpet_activate(struct device *, int);
-
+void acpihpet_delay(int);
 u_int acpihpet_gettime(struct timecounter *tc);
 
 uint64_t       acpihpet_r(bus_space_tag_t _iot, bus_space_handle_t _ioh,
@@ -84,20 +85,28 @@ struct cfdriver acpihpet_cd = {
 uint64_t
 acpihpet_r(bus_space_tag_t iot, bus_space_handle_t ioh, bus_size_t ioa)
 {
+#ifdef bus_space_read_8
+       return bus_space_read_8(iot, ioh, ioa);
+#else
        uint64_t val;
 
        val = bus_space_read_4(iot, ioh, ioa + 4);
        val = val << 32;
        val |= bus_space_read_4(iot, ioh, ioa);
        return (val);
+#endif
 }
 
 void
 acpihpet_w(bus_space_tag_t iot, bus_space_handle_t ioh, bus_size_t ioa,
     uint64_t val)
 {
+#ifdef bus_space_write_8
+       bus_space_write_8(iot, ioh, ioa, val);
+#else
        bus_space_write_4(iot, ioh, ioa + 4, val >> 32);
        bus_space_write_4(iot, ioh, ioa, val & 0xffffffff);
+#endif
 }
 
 int
@@ -262,10 +271,20 @@ acpihpet_attach(struct device *parent, s
        freq = 1000000000000000ull / period;
        printf(": %lld Hz\n", freq);
 
-       hpet_timecounter.tc_frequency = (uint32_t)freq;
+       hpet_timecounter.tc_frequency = freq;
        hpet_timecounter.tc_priv = sc;
        hpet_timecounter.tc_name = sc->sc_dev.dv_xname;
        tc_init(&hpet_timecounter);
+
+#if defined(__amd64__) || defined(__i386__)
+       if (delay_func == i8254_delay)
+               delay_func = acpihpet_delay;
+       /* XXX what if the kernel has no acpitimer support? */
+       extern void acpitimer_delay(int);
+       if (delay_func == acpitimer_delay)
+               delay_func = acpihpet_delay;
+#endif
+
 #if defined(__amd64__)
        extern void cpu_recalibrate_tsc(struct timecounter *);
        cpu_recalibrate_tsc(&hpet_timecounter);
@@ -273,10 +292,43 @@ acpihpet_attach(struct device *parent, s
        acpihpet_attached++;
 }
 
+void
+acpihpet_delay(int usecs)
+{
+       uint64_t c, s;
+       struct acpihpet_softc *sc = hpet_timecounter.tc_priv;
+
+       s = acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER);
+       c = usecs * hpet_timecounter.tc_frequency / 1000000;
+       while (acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER) - s < c)
+               CPU_BUSY_CYCLE();
+}
+
 u_int
 acpihpet_gettime(struct timecounter *tc)
 {
        struct acpihpet_softc *sc = tc->tc_priv;
 
        return (bus_space_read_4(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER));
+}
+
+void
+acpihpet_test_delay(int usecs)
+{
+       struct timespec ac, er, ex, t0, t1;
+
+       if (!acpihpet_attached) {
+               printf("%s: (no hpet attached)\n", __func__);
+               return;
+       }
+
+       nanouptime(&t0);
+       acpihpet_delay(usecs);
+       nanouptime(&t1);
+       timespecsub(&t1, &t0, &ac);
+       NSEC_TO_TIMESPEC(usecs * 1000ULL, &ex);
+       timespecsub(&ac, &ex, &er);
+       printf("%s: expected %lld.%09ld actual %lld.%09ld error %lld.%09ld\n",
+           __func__, ex.tv_sec, ex.tv_nsec, ac.tv_sec, ac.tv_nsec,
+           er.tv_sec, er.tv_nsec);
 }
