On 12/22/21 13:42, Andriy Gapon wrote:
There have been some reports on strange / unexpected things with Ryzen
5xxx processors. I think I have seen 5950X, 5900X and 5800X mentioned,
not sure about others.
Since I have 5800X myself I looked into a couple of issues that have
straightforward demonstrators. I would like to share my findings and
observations on those issues.
Issue 1. High wake-up latency for CPU idle states.
This seems to be related to the so-called CC6 idle state.
The official information on it is very sparse.
The state is not explicitly exposed to the OS, at least not through the
ACPI interfaces that FreeBSD currently supports.
In my tests I see that if all logical processors enter an idle state
then an external interrupt can be delayed by 500+ us. Specifically, I
observed this with an MSI-X interrupt from a discrete network chip.
Interrupts from internal components seem to be affected as well, but to
a lesser degree.
The deep state in question can be entered regardless of whether C2 (via
I/O) is enabled; C1 (via hlt) is sufficient. In fact, with
machdep.idle=hlt it works the same.
The state is not entered if at least one logical CPU is not idle.
The state is not entered if machdep.idle=mwait is used. Apparently, the
processors do not attempt to automatically enter as deep idle modes with
mwait as they do with hlt.
Finally, the state is not entered if the zenstates.py utility is used to
disable the C6 / CC6 state via a publicly undocumented MSR.
For me personally that state does not cause any annoyances but anyone
who experiences problems related to "stuttering", "jitter", latency
might want to look into this.
Issue 2. Uneven performance of CPU intensive tasks, especially with
SCHED_ULE, when SMT is enabled.
I found out that at least on my hardware all even numbered logical CPUs
can perform much better than odd numbered logical CPUs. It seems that
hardware threads within a core are not equal. Maybe this is related to the
ability to use boosted frequencies, but maybe something else, I am not
sure.
From a brief look at the ULE code it looks like the selection of a hw
thread within a core is intentionally random when all other things are
equal.
I suspect that the hardware + firmware may actually describe that
performance disparity via ACPI CPPC (_CPC object, etc), but right now we
do not support querying that or making use of it.
It would be interesting to see if other owners of similar processors can
confirm or provide counter-examples to my observations.
Simple tests for issue 1:
- ping a host attached to the same switch (so, with very low expected
latency)
- ping 127.0.0.1
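To compare runs of the ping tests above, something like the following might help. It is only a rough sketch: it pulls the "time=... ms" field out of saved ping output and counts replies at or above a cut-off. The helper names, the 0.5 ms threshold (chosen to match the 500+ us delays mentioned earlier) and the sample text are my assumptions, and the sample is made up, not measured data.

```python
import re
import statistics

# Matches the "time=0.045 ms" field that ping prints for each reply.
RTT_RE = re.compile(r"time=([0-9.]+) ms")


def parse_rtts(ping_output: str) -> list[float]:
    """Return round-trip times in milliseconds, one per reply line."""
    return [float(m.group(1)) for m in RTT_RE.finditer(ping_output)]


def summarize(rtts: list[float], threshold_ms: float = 0.5) -> dict:
    """Summarize RTTs; threshold_ms is an assumed cut-off for 'slow' replies."""
    return {
        "count": len(rtts),
        "min": min(rtts),
        "median": statistics.median(rtts),
        "max": max(rtts),
        "slow": sum(1 for r in rtts if r >= threshold_ms),
    }


# Made-up sample in the usual ping reply format, NOT measured data.
SAMPLE = """\
64 bytes from 127.0.0.1: icmp_seq=0 ttl=64 time=0.045 ms
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.612 ms
64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.051 ms
"""

print(summarize(parse_rtts(SAMPLE)))
```

Feeding it output captured with machdep.idle=hlt and then with machdep.idle=mwait should make any difference in the "slow" count easy to see.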
For issue 2: take some CPU intensive single-threaded task and bind it
(with cpuset -l) to different logical CPUs. Multiple such tasks can be
run concurrently on different logical CPUs.
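The issue 2 test could be scripted roughly as below. This is a sketch under assumptions: the workload is an arbitrary CPU-bound Python one-liner standing in for a real task, the helper names are hypothetical, and the timing includes interpreter start-up. The only external command used is cpuset -l as described above.

```python
import statistics
import subprocess
import sys
import time

# Assumed stand-in workload: a single-threaded, CPU-bound Python loop.
WORKLOAD = [sys.executable, "-c", "sum(i * i for i in range(10**7))"]


def cpuset_command(cpu: int, argv: list[str]) -> list[str]:
    """Build the argv that binds a command to one logical CPU via cpuset -l."""
    return ["cpuset", "-l", str(cpu)] + argv


def time_on_cpu(cpu: int, argv: list[str] = WORKLOAD) -> float:
    """Run the workload bound to the given logical CPU; return wall time in seconds."""
    start = time.perf_counter()
    subprocess.run(cpuset_command(cpu, argv), check=True)
    return time.perf_counter() - start


def mean_by_parity(timings: dict[int, float]) -> dict[str, float]:
    """Average the timings separately for even and odd logical CPU numbers."""
    even = [t for cpu, t in timings.items() if cpu % 2 == 0]
    odd = [t for cpu, t in timings.items() if cpu % 2 == 1]
    return {"even": statistics.mean(even), "odd": statistics.mean(odd)}
```

On a FreeBSD box, mean_by_parity({cpu: time_on_cpu(cpu) for cpu in range(16)}) would cover the 16 logical CPUs of an 8-core part with SMT; adjust the range to the actual CPU count. A clearly larger "odd" value would be consistent with the disparity described for issue 2.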
References:
- https://forums.freebsd.org/threads/variable-ping-latency-on-ryzen-setup.82791/
- https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=256594
- https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254040
- https://github.com/r4m0n/ZenStates-Linux
- https://github.com/meowthink/ZenStates-FreeBSD -- has a bug
- https://github.com/avg-I/ZenStates-FreeBSD -- has a fix
- https://www.kernel.org/doc/html/latest/admin-guide/acpi/cppc_sysfs.html
- https://static.linaro.org/connect/lvc21/presentations/lvc21-219.pdf
- https://uefi.org/specs/ACPI/6.4/14_Platform_Communications_Channel/Platform_Comm_Channel.html
Hi,
I've seen exactly the same thing. See older FreeBSD-current thread:
"AMD Ryzen 5 3400G with Radeon Vega Graphics"
I just put:
machdep.idle=spin
in /boot/loader.conf for now.
--HPS