On 7/1/2025 8:12 PM, Alexandre Chartre wrote:
On 7/1/25 13:12, Xiaoyao Li wrote:
On 7/1/2025 6:26 PM, Zhao Liu wrote:
unless it was explicitly requested by the user.
But this could still break Windows, just like issue #3001, which enables
arch-capabilities for EPYC-Genoa. This shows that even explicitly
turning on arch-capabilities in an AMD guest and using KVM's emulated
value can break something.
So even for named CPUs, arch-capabilities=on doesn't reflect the fact
that it is purely emulated, and is (maybe?) harmful.
That is because Windows added incorrect code. It breaks itself, so this
is simply a Windows regression.
KVM and QEMU are not to blame.
I can understand the Windows code logic, and I don't think it is
necessarily wrong,
because it finds that the system has:
- an AMD cpu
- an Intel-only feature/MSR
Then what should the code do? Trust the cpu type (AMD) or trust the MSR
(Intel)?
They decided not to choose, and for safety they stop because they have
an unexpected configuration.
That is not how software or an OS is supposed to work with the x86
architecture. Though there are different x86 vendors, like Intel and AMD,
they both implement the x86 architecture. On x86, architectural
features are enumerated by CPUID. If you read the Intel SDM and the AMD
APM, you will find that Intel defines most features in the leaf range
[0, x] while AMD defines most features in the leaf range
[0x80000000, 0x8000000y]. But if a bit is defined by both Intel and AMD,
it must have the same meaning and enumerate the same feature.
Usually, a feature is first introduced by one vendor, and other vendors
might implement the same feature later. E.g., bus lock detection, which
is enumerated via CPUID.7_0:ECX[bit 24], was first introduced by Intel in
2020. Later, AMD implemented the same feature starting with Zen 5. Before
AMD implemented it, it was an Intel-only feature. Could we have written
code like the following at that time?

    if (is_AMD && cpuid_enumerates_bus_lock_detect)
        error("unsupported CPU");

If we had written such code, it would fail on all AMD Zen 5 CPUs.
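
To make the contrast concrete, here is a minimal sketch of the two
approaches. It is not code from Windows, KVM or QEMU. The struct
cpu_snapshot type and both helper names are hypothetical, and the CPUID
values are passed in as plain data instead of being read from hardware:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of CPUID data that an OS might cache at boot. */
struct cpu_snapshot {
    bool     vendor_is_amd;  /* derived from the CPUID leaf 0 vendor string */
    uint32_t leaf7_0_ecx;    /* ECX returned by CPUID.(EAX=7,ECX=0) */
};

/* Bus lock detection is enumerated by CPUID.7_0:ECX[bit 24]. */
#define BUS_LOCK_DETECT_BIT 24u

/* Architectural check: trust the CPUID bit, whatever the vendor is. */
static bool has_bus_lock_detect(const struct cpu_snapshot *cpu)
{
    return ((cpu->leaf7_0_ecx >> BUS_LOCK_DETECT_BIT) & 1u) != 0;
}

/*
 * Vendor-gated check, as in the pseudocode above: it rejects any AMD CPU
 * that enumerates the bit, so it starts failing once AMD implements the
 * feature.
 */
static bool vendor_gated_accepts(const struct cpu_snapshot *cpu)
{
    return !(cpu->vendor_is_amd && has_bus_lock_detect(cpu));
}
```

With a snapshot that has vendor_is_amd set and bit 24 set (the Zen 5
case), has_bus_lock_detect() reports the feature while
vendor_gated_accepts() rejects the CPU.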
Besides, I would like to talk about how software is supposed to deal
with reserved bits on the x86 architecture. In general, software should
not make any assumptions about reserved bits. Their value cannot be
relied upon to be 0, since any reserved bit can be given a meaning in the
future. As Igor said:
software shouldn't even try to use it or make any decisions
based on that
For more information, you can refer to Intel SDM vol. 1, section 1.3.2,
"Reserved Bits and Software Compatibility". For the AMD APM, you would
need to search yourself.
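
As a rough illustration of that rule, the sketch below masks off the
bits it does not understand instead of rejecting them. The register
layout (FEAT_A, FEAT_B) and the function names are made up for the
example and do not correspond to any real MSR or CPUID leaf:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register: the spec we were written against defines bits 0-1. */
#define FEAT_A     (1u << 0)
#define FEAT_B     (1u << 1)
#define KNOWN_BITS (FEAT_A | FEAT_B)

/* Robust: keep only the bits we understand and ignore everything else. */
static uint32_t known_features(uint32_t raw)
{
    return raw & KNOWN_BITS;
}

/*
 * Fragile: treats any set reserved bit as an error, so it breaks as soon
 * as a later spec revision (or a hypervisor) gives one of them a meaning.
 */
static bool fragile_accepts(uint32_t raw)
{
    return (raw & ~KNOWN_BITS) == 0;
}
```

If a future revision sets, say, bit 7, known_features() still returns
the bits this software understands, while fragile_accepts() starts
failing on hardware that is perfectly valid.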
OK, back to the original question: "what should the code do?"
My answer is that it can take either of the options below:
- Be vendor agnostic and stick to the x86 architecture. If CPUID
enumerates a feature, then the feature is available architecturally.
- Follow the AMD spec and ignore the bit, since it is a reserved bit.
(Expecting a reserved bit to be zero when the spec does not explicitly
say so is totally wrong!)
alex.