On 6/26/25 4:31 PM, Zhao Liu wrote:


From: Ewan Hai <ewanhai...@zhaoxin.com>

Add the cache model to YongFeng (v3) to better emulate its
environment.

Note: although YongFeng v2 was added after v10.0, it was also backported
to v10.0.2. Therefore, a new version (v3) is needed to avoid conflicts.

The cache model is as follows:

       --- cache 0 ---
       cache type                         = data cache (1)
       cache level                        = 0x1 (1)
       self-initializing cache level      = true
       fully associative cache            = false
       maximum IDs for CPUs sharing cache = 0x0 (0)
       maximum IDs for cores in pkg       = 0x0 (0)
       system coherency line size         = 0x40 (64)
       physical line partitions           = 0x1 (1)
       ways of associativity              = 0x8 (8)
       number of sets                     = 0x40 (64)
       WBINVD/INVD acts on lower caches   = false
       inclusive to lower caches          = false
       complex cache indexing             = false
       number of sets (s)                 = 64
       (size synth)                       = 32768 (32 KB)
       --- cache 1 ---
       cache type                         = instruction cache (2)
       cache level                        = 0x1 (1)
       self-initializing cache level      = true
       fully associative cache            = false
       maximum IDs for CPUs sharing cache = 0x0 (0)
       maximum IDs for cores in pkg       = 0x0 (0)
       system coherency line size         = 0x40 (64)
       physical line partitions           = 0x1 (1)
       ways of associativity              = 0x10 (16)
       number of sets                     = 0x40 (64)
       WBINVD/INVD acts on lower caches   = false
       inclusive to lower caches          = false
       complex cache indexing             = false
       number of sets (s)                 = 64
       (size synth)                       = 65536 (64 KB)
       --- cache 2 ---
       cache type                         = unified cache (3)
       cache level                        = 0x2 (2)
       self-initializing cache level      = true
       fully associative cache            = false
       maximum IDs for CPUs sharing cache = 0x0 (0)
       maximum IDs for cores in pkg       = 0x0 (0)
       system coherency line size         = 0x40 (64)
       physical line partitions           = 0x1 (1)
       ways of associativity              = 0x8 (8)
       number of sets                     = 0x200 (512)
       WBINVD/INVD acts on lower caches   = false
       inclusive to lower caches          = true
       complex cache indexing             = false
       number of sets (s)                 = 512
       (size synth)                       = 262144 (256 KB)
       --- cache 3 ---
       cache type                         = unified cache (3)
       cache level                        = 0x3 (3)
       self-initializing cache level      = true
       fully associative cache            = false
       maximum IDs for CPUs sharing cache = 0x0 (0)
       maximum IDs for cores in pkg       = 0x0 (0)
       system coherency line size         = 0x40 (64)
       physical line partitions           = 0x1 (1)
       ways of associativity              = 0x10 (16)
       number of sets                     = 0x2000 (8192)
       WBINVD/INVD acts on lower caches   = true
       inclusive to lower caches          = true
       complex cache indexing             = false
       number of sets (s)                 = 8192
       (size synth)                       = 8388608 (8 MB)
       --- cache 4 ---
       cache type                         = no more caches (0)
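
Each "(size synth)" value above follows directly from the leaf 4 fields,
size = ways * partitions * line_size * sets; a minimal standalone C sketch
(for illustration only) that cross-checks the four sizes:

       #include <stdio.h>

       int main(void)
       {
           /* ways, partitions, line size, sets for L1D/L1I/L2/L3 above */
           const unsigned long caches[4][4] = {
               {  8, 1, 64,   64 },    /* L1D -> 32768   (32 KB)  */
               { 16, 1, 64,   64 },    /* L1I -> 65536   (64 KB)  */
               {  8, 1, 64,  512 },    /* L2  -> 262144  (256 KB) */
               { 16, 1, 64, 8192 },    /* L3  -> 8388608 (8 MB)   */
           };

           for (int i = 0; i < 4; i++) {
               unsigned long size = caches[i][0] * caches[i][1] *
                                    caches[i][2] * caches[i][3];
               printf("cache %d: %lu bytes (%lu KB)\n", i, size, size / 1024);
           }
           return 0;
       }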

Signed-off-by: Ewan Hai <ewanhai...@zhaoxin.com>
Signed-off-by: Zhao Liu <zhao1....@intel.com>
---
Changes from the original code:
  * Rearrange cache model fields to make them easier to check.
  * Add an explanation of why v3 is needed.
  * Drop the lines_per_tag field for L2 & L3.
---
  target/i386/cpu.c | 104 ++++++++++++++++++++++++++++++++++++++++++++++
  1 file changed, 104 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index a7f2e5dd3fcb..08c84ba90f52 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -3159,6 +3159,105 @@ static const CPUCaches xeon_srf_cache_info = {
      },
  };

+static const CPUCaches yongfeng_cache_info = {
+    .l1d_cache = &(CPUCacheInfo) {
+        /* CPUID 0x4.0x0.EAX */
+        .type = DATA_CACHE,
+        .level = 1,
+        .self_init = true,
+
+        /* CPUID 0x4.0x0.EBX */
+        .line_size = 64,
+        .partitions = 1,
+        .associativity = 8,
+
+        /* CPUID 0x4.0x0.ECX */
+        .sets = 64,
+
+        /* CPUID 0x4.0x0.EDX */
+        .no_invd_sharing = false,
+        .inclusive = false,
+        .complex_indexing = false,
+
+        /* CPUID 0x80000005.ECX */
+        .lines_per_tag = 1,
+        .size = 32 * KiB,
+
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
+    },
+    .l1i_cache = &(CPUCacheInfo) {
+        /* CPUID 0x4.0x1.EAX */
+        .type = INSTRUCTION_CACHE,
+        .level = 1,
+        .self_init = true,
+
+        /* CPUID 0x4.0x1.EBX */
+        .line_size = 64,
+        .partitions = 1,
+        .associativity = 16,
+
+        /* CPUID 0x4.0x1.ECX */
+        .sets = 64,
+
+        /* CPUID 0x4.0x1.EDX */
+        .no_invd_sharing = false,
+        .inclusive = false,
+        .complex_indexing = false,
+
+        /* CPUID 0x80000005.EDX */
+        .lines_per_tag = 1,
+        .size = 64 * KiB,
+
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
+    },
+    .l2_cache = &(CPUCacheInfo) {
+        /* CPUID 0x4.0x2.EAX */
+        .type = UNIFIED_CACHE,
+        .level = 2,
+        .self_init = true,
+
+        /* CPUID 0x4.0x2.EBX */
+        .line_size = 64,
+        .partitions = 1,
+        .associativity = 8,
+
+        /* CPUID 0x4.0x2.ECX */
+        .sets = 512,
+
+        /* CPUID 0x4.0x2.EDX */
+        .no_invd_sharing = false,
+        .inclusive = true,
+        .complex_indexing = false,
+
+        /* CPUID 0x80000006.ECX */
+        .size = 256 * KiB,
+
+        .share_level = CPU_TOPOLOGY_LEVEL_CORE,
+    },
+    .l3_cache = &(CPUCacheInfo) {
+        /* CPUID 0x4.0x3.EAX */
+        .type = UNIFIED_CACHE,
+        .level = 3,
+        .self_init = true,
+
+        /* CPUID 0x4.0x3.EBX */
+        .line_size = 64,
+        .partitions = 1,
+        .associativity = 16,
+
+        /* CPUID 0x4.0x3.ECX */
+        .sets = 8192,
+
+        /* CPUID 0x4.0x3.EDX */
+        .no_invd_sharing = true,
+        .inclusive = true,
+        .complex_indexing = false,
+
+        .size = 8 * MiB,
+        .share_level = CPU_TOPOLOGY_LEVEL_DIE,
+    },
+};
+
  /* The following VMX features are not supported by KVM and are left out in the
   * CPU definitions:
   *
@@ -6438,6 +6537,11 @@ static const X86CPUDefinition builtin_x86_defs[] = {
                      { /* end of list */ }
                  }
              },
+            {
+                .version = 3,
+                .note = "with the cache info",

I realize that my previous use of "cache info" was not precise; "cache model" is more appropriate. Please help me adjust it accordingly. Thank you.

+                .cache_info = &yongfeng_cache_info
+            },
              { /* end of list */ }
          }
      },
--
2.34.1


Hi Zhao,

I tested the patchsets you provided on different hosts, and here are the 
results:

1. On an Intel host with KVM enabled
The CPUID leaves 0x2 and 0x4 reported inside the YongFeng-v3 VM match our expected cache details exactly. However, CPUID leaf 0x80000005 returns all zeros. This is because, when KVM is in use, QEMU uses the host's vendor for the IS_INTEL_CPU(env), IS_ZHAOXIN_CPU(env), and IS_AMD_CPU(env) checks. Given that behavior, a zeroed 0x80000005 leaf in the guest is expected and, to me, acceptable; a quick guest-side check is sketched after point 3 below. What are your thoughts?

2. On a YongFeng host (with or without KVM)
The CPUID leaves 0x2, 0x4, and 0x80000006 inside the VM all return the values we want, and the L1D/L1I cache info in leaf 0x80000005 is also correct.

3. TLB info in leaf 0x80000005
On both Intel and YongFeng hosts, the L1 TLB fields in leaf 0x80000005 remain constant, as we discussed. As you mentioned before, "we can wait and see what maintainers think" about this.
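
For anyone reproducing these checks, leaf 0x80000005 can be read directly
from inside the guest with a few lines of C (a minimal sketch; the cpuid
tool reports the same raw registers in decoded form):

       #include <cpuid.h>
       #include <stdio.h>

       int main(void)
       {
           unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

           /* Leaf 0x80000005: L1 cache and TLB descriptors (vendor-specific layout) */
           if (__get_cpuid(0x80000005, &eax, &ebx, &ecx, &edx)) {
               printf("0x80000005: eax=%08x ebx=%08x ecx=%08x edx=%08x\n",
                      eax, ebx, ecx, edx);
           }
           return 0;
       }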

In summary, both patchsets look good for Zhaoxin support; I don't see any issues so far.

Btw, the YongFeng host also supports leaf 0x1F. Should the YongFeng model turn on "x-force-cpuid-0x1f" by default? I think maybe yes.
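
Even if it is not made the default, users can already opt in explicitly on the
command line, for example (illustrative invocation, assuming the v3 model name
from this series):

       qemu-system-x86_64 -accel kvm -cpu YongFeng-v3,x-force-cpuid-0x1f=on ...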


Best regards,
Ewan


