Hi, This series tries to unify the three cache models currently in X86CPUState: cache_info_cpuid2, cache_info_cpuid4 and cache_info_amd, into a single cache_info.
Fix, clean up, and simplify the current x86 CPU cache model support. Especially, make the cache infomation in CPUID aligns with the vendor's specifications. QEMU x86 supports four vendors, and the impact of this series is as follows: * AMD: No change. * Hygon (mostly follows AMD): No change. - However, I suspect that Hygon should skip the 0x2 and 0x4 leaves just like AMD. But since this cannot be confirmed for me, I just leave everything unchanged. If necessary, we can fix it. * Intel: - Clarify the use of legacy_l2_cache_cpuid2. And for very older named CPUs ("486", "pentium", "pentium2" and "pentium3") that do not support CPUID 0x4, use the cache model like cache_info_cpuid2. - For other CPUs, use the cache model like cache_info_cpuid4. - CPUID 0x2, 0x4 and 0x80000006 use the consistent cache model. - CPUID 0x80000005 is marked reserved as SDM requires. * Zhaoxin (mostly follows Intel): mostly consistent with Intel's changes, except for CPUID 0x80000005, which follows AMD behavior but can correctly use the cache model consistent with CPUID 0x4. Please note that one significant reason Intel requires so many fixes (which also implies such confusion) is that Intel's named CPUs currently do not have specific cache models and instead use the default legacy cache models. This reflects the importance of adding cache models [1] for named CPUs. Philippe already has the patch [2] to remove "legacy-cache" compat property. I initially intended to base upon his work (which could get some simplification). However, I found that this series and [2] can be well decoupled, making it easier to review and apply, so this series now is based on the master branch at 6e1571533fd9. (Next, I will detail the thought process behind the solution. You can skip to the end of cover letter for a concise "Patch Summary") Thanks for your patience and feedback! Background ========== First of all, this the typical CPUIDs (cache related) from an Intel Guest: CPU 0: ... 0x00000002 0x00: eax=0x00000001 ebx=0x00000000 ecx=0x0000004d edx=0x002c307d * X86CPUState.cache_info_cpuid2: L1 data cache: 32K, 8-way, 64 byte lines L1 instruction cache: 32K, 8-way, 64 byte lines L2 cache: 2M, 8-way, 64 byte lines <--- legacy_l2_cache_cpuid2 L3 cache: 16M, 16-way, 64 byte lines) ... 0x00000004 0x00: eax=0x00000121 ebx=0x01c0003f ecx=0x0000003f edx=0x00000001 0x00000004 0x01: eax=0x00000122 ebx=0x01c0003f ecx=0x0000003f edx=0x00000001 0x00000004 0x02: eax=0x00000143 ebx=0x03c0003f ecx=0x00000fff edx=0x00000001 0x00000004 0x03: eax=0x00000163 ebx=0x03c0003f ecx=0x00003fff edx=0x00000006 0x00000004 0x04: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 * X86CPUState.cache_info_cpuid4: L1 data cache: 32K, 8-way, 64 byte lines L1 instruction cache: 32K, 8-way, 64 byte lines L2 cache: 4M, 16-way, 64 byte lines <--- legacy_l2_cache_cpuid4 L3 cache: 16M, 16-way, 64 byte lines) ... 0x80000006 0x00: eax=0x00000000 ebx=0x42004200 ecx=0x02008140 edx=0x00808140 0x80000007 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 * X86CPUState.cache_info_amd: L1 data cache: 64K, 2-way, 64 byte lines <--- legacy_l1d_cache_amd L1 instruction cache: 64K, 2-way, 64 byte lines <--- legacy_l1i_cache_amd L2 cache: 512K, 16-way, 64 byte lines <--- legacy_l2_cache_amd L3 cache: 16M, 16-way, 64 byte lines Note: L1 & L3 fields should be reserved for Intel in these 2 leaves. It's quite surprising that an Intel Guest CPU actually includes three different cache models! The reason, as I mentioned at the beginning, is that Intel named CPUs lack the built-in "named" cache model and can only use the legacy cache model. The issues above are caused by having three legacy cache models. Of course, host/max CPUs will also have these issues. Despite the confusion, fortunately, software that follows the SDM will prefer CPUID 0x4. So, no related bug reports have been observed. But this issue has already been noticed for quite some time, like the many "FIXME" notes left by Eduardo: /*FIXME: CPUID leaf 0x80000005 is inconsistent with leaves 2 & 4 */ /*FIXME: CPUID leaf 0x80000005 is inconsistent with leaves 2 & 4 */ /*FIXME: CPUID leaf 2 descriptor is inconsistent with CPUID leaf 4 */ /*FIXME: CPUID leaf 0x80000006 is inconsistent with leaves 2 & 4 */ Solution ======== The most challenging thing to fix this issue, is how to handle compatibility! Among the legacy cache models, the oldest, legacy_l2_cache_cpuid2, was introduced during the Pentium era (2007, for more details, please refer to the commit message of patch 4). Moreover, after then, QEMU has continuously introduced various compat properties, making any change likely to have widespread effects. But eventually, I realized that the most crucial compat property is "x-vendor-cpuid-only". And, the entire cleanup process can be divided into two steps: 1. Merge cache_info_cpuid2 and cache_info_cpuid4 ------------------------------------------------ These 2 cache models are both used for Intel, but one is used in CPUID 0x2 and another is for 0x4. I introduced the x-consistent-cache compat property and, according to the SDM, reworked the encoding of 0x2, marking 0x2 as unavailiable for cache info. This way, only cache_info_cpuid4 is needed. For the older CPUs without 0x4 ("486", "pentium", "pentium2" and "pentium3"), I add a "named" cache model (based on cache_info_cpuid2) and build it into the definition structures of these old CPU models. 2. Merge cache_info_cpuid4 and cache_info_cpuid_amd --------------------------------------------------- Merging these two cache models requires consideration of the following issues: 1) The final unified cache model is based on the vendor. 2) Compatibility with older machines is needed: - x-vendor-cpuid-only=false for PC v6.0 and older. - x-vendor-cpuid-only=true for PC v6.0 to PC v10.0 - and newer). Therefore, I have the following table to reflect the behavior of historical machines: [Table 1: Cache models used in CPUID leaves for different versioned machines] Diagram: C4 = cache_info_cpuid4, CA = cache_info_cpuid_amd * Intel CPU: | x-vendor-cpuid-only=false | x-vendor-cpuid-only=true || ideal (x-vendor-cpuid-only-2=true) | (PC v6.0 and older) | (PC v6.0 to PC v10.0) || (PC v10.1 ~) --------------------------------------------------------------------------------------------------------- 0x2 | C4 | C4 || C4 | | || --------------------------------------------------------------------------------------------------------- 0x4 | C4 | C4 || C4 | | || --------------------------------------------------------------------------------------------------------- 0x80000005 | CA | CA || 0 (Reserved) | | || [Note: "0" <==> "C4"] --------------------------------------------------------------------------------------------------------- 0x80000006 | CA | CA || C4 (eax=ebx=edx=0) | | || [Note: "0" <==> "C4"] --------------------------------------------------------------------------------------------------------- 0x8000001D | - (Unreached) | - (Unreached) || - (Unreached) | [Note: "-" <==> "CA"] | [Note: "-" <==> "CA"] || [Note: "0" <==> "C4"] * AMD CPU: | x-vendor-cpuid-only=false | x-vendor-cpuid-only=true || ideal (x-vendor-cpuid-only-2=true) | (PC v6.0 and older) | (PC v6.0 to PC v10.0) || (PC v10.1 ~) ---------------------------------------------------------------------------------------------------------- 0x2 | C4 | 0 (Reserved) || CA | | [Note: "0" <==> "C4"] || ---------------------------------------------------------------------------------------------------------- 0x4 | C4 | 0 (Reserved) || CA | | [Note: "0" <==> "C4"] || ---------------------------------------------------------------------------------------------------------- 0x80000005 | CA | CA || CA | | || ---------------------------------------------------------------------------------------------------------- 0x80000006 | CA | CA || CA | | || ---------------------------------------------------------------------------------------------------------- 0x8000001D | CA | CA || CA | | || Our final goal is to select between legacy AMD cache model and legacy Intel cache model based on the vendor. At first glance, this table appears very chaotic, seemingly consisting of various unrelated cases, like a somewhat unsightly monster composed of "different vendors", "different CPUID leaves", "different versioned machines", as well as reserved "0" and unreached "-". But brain teaser! * Reserved: If a leaf is reserved, which means whatever the cache models it selects, it always have all-0 registers! Thus, we can It's valid to consider this leaf as choosing either the Intel cache model or the AMD cache model, because the specific values will be ignored. * Unreached: In practice, it's similar to being reserved, although the spec doesn't explicitly state it as reserved. Similarly, choosing any cache model doesn't affect the encoding of the "Unreached" leaf. With this consideration, (and by combining the "Note" in square brackets within the table,) we can replace the "reserved" and "unreached" cases with the specific cache models noted in the annotations. This reveals the underlying pattern: [Table 2: "Refined" cache models used in CPUID leaves for different versioned machines] * Intel CPU: | x-vendor-cpuid-only=false | x-vendor-cpuid-only=true || ideal (x-vendor-cpuid-only-2=true) | (PC v6.0 and older) | (PC v6.0 to PC v10.0) || (PC v10.1 ~) --------------------------------------------------------------------------------------------------------- 0x2 | C4 | C4 || C4 | | || --------------------------------------------------------------------------------------------------------- 0x4 | C4 | C4 || C4 | | || --------------------------------------------------------------------------------------------------------- 0x80000005 | CA | CA || "C4" | | || --------------------------------------------------------------------------------------------------------- 0x80000006 | CA | CA || "C4" | | || --------------------------------------------------------------------------------------------------------- 0x8000001D | "CA" | "CA" || "C4" | | || * AMD CPU: | x-vendor-cpuid-only=false | x-vendor-cpuid-only=true || ideal (x-vendor-cpuid-only-2=true) | (PC v6.0 and older) | (PC v6.0 to PC v10.0) || (PC v10.1 ~) ---------------------------------------------------------------------------------------------------------- 0x2 | C4 | "C4" || CA | | || ---------------------------------------------------------------------------------------------------------- 0x4 | C4 | "C4" || CA | | || ---------------------------------------------------------------------------------------------------------- 0x80000005 | CA | CA || CA | | || ---------------------------------------------------------------------------------------------------------- 0x80000006 | CA | CA || CA | | || ---------------------------------------------------------------------------------------------------------- 0x8000001D | CA | CA || CA | | || Based on Table 2, where the "reserved"/"unreached" fields have been equivalently replaced, we can see that although x-vendor-cpuid-only (since v6.1) affects the specific CPUID leaf encoding, its essence can be regarded as not changing the underlying cache model choice (cache_info_amd vs. cache_info_cpuid4). Therefore, we can confidently propose this solution: * For v10.1 and future, select legacy cache model based Guest CPU's vendor. - Then we can merge cache_info_cpuid4 and cache_info_amd into a single cache_info, but just initialize cache_info based on vendor. * For v10.0 and older: - Use legacy Intel cache model (original cache_info_cpuid4) by default in CPUID 0x2 and 0x4 leaves. - Use legacy AMD cache model (original cache_info_amd) by default in CPUID 0x80000005, 0x80000006 and 0x8000001D. Patch Summary ============= Patch 01-06: Merge cache_info_cpuid2 and cache_info_cpuid4 Patch 07-16: Merge cache_info_cpuid4 and cache_info_amd Note: patch 11-15 they each provide more specific evidence that selecting a legacy cache model based on the Guest vendor in CPUID 0x2, 0x4, 0x80000005, 0x80000006, and 0x8000001D leaves is both valid and safe, and doesn't break compatibility. Reference ========= [1]: https://lore.kernel.org/qemu-devel/20250423114702.1529340-1-zhao1....@intel.com/ [2]: https://lore.kernel.org/qemu-devel/20250501223522.99772-9-phi...@linaro.org/ Thanks and Best Regards, Zhao --- Zhao Liu (16): i386/cpu: Refine comment of CPUID2CacheDescriptorInfo i386/cpu: Add descriptor 0x49 for CPUID 0x2 encoding i386/cpu: Add default cache model for Intel CPUs with level < 4 i386/cpu: Present same cache model in CPUID 0x2 & 0x4 i386/cpu: Consolidate CPUID 0x4 leaf i386/cpu: Drop CPUID 0x2 specific cache info in X86CPUState i386/cpu: Mark CPUID[0x80000005] as reserved for Intel i386/cpu: Fix CPUID[0x80000006] for Intel CPU i386/cpu: Add legacy_intel_cache_info cache model i386/cpu: Add legacy_amd_cache_info cache model i386/cpu: Select legacy cache model based on vendor in CPUID 0x2 i386/cpu: Select legacy cache model based on vendor in CPUID 0x4 i386/cpu: Select legacy cache model based on vendor in CPUID 0x80000005 i386/cpu: Select legacy cache model based on vendor in CPUID 0x80000006 i386/cpu: Select legacy cache model based on vendor in CPUID 0x8000001D i386/cpu: Use a unified cache_info in X86CPUState hw/i386/pc.c | 5 +- target/i386/cpu.c | 543 +++++++++++++++++++++++++++++----------------- target/i386/cpu.h | 25 ++- 3 files changed, 372 insertions(+), 201 deletions(-) -- 2.34.1