On Mon, May 19, 2025 at 04:36:26PM +0100, Alireza Sanaee wrote:
> Specifying the cache layout in virtual machines is useful for
> applications and operating systems to fetch accurate information about
> the cache structure and make appropriate adjustments. Enforcing correct
> sharing information can lead to better optimizations. Patches that
> introduced an interface to express caches landed in prior cycles. This
> patchset uses that interface as a foundation. Thus, the device tree and
> the ACPI PPTT table are populated based on user-provided information
> and the CPU topology.


Not sure why this doesn't apply anymore. Can you rebase and repost, please?

> Example:
> 
> 
> +----------------+                            +----------------+
> |    Socket 0    |                            |    Socket 1    |
> |    (L3 Cache)  |                            |    (L3 Cache)  |
> +--------+-------+                            +--------+-------+
>          |                                             |
> +--------+--------+                            +--------+--------+
> |   Cluster 0     |                            |   Cluster 0     |
> |   (L2 Cache)    |                            |   (L2 Cache)    |
> +--------+--------+                            +--------+--------+
>          |                                             |
> +--------+--------+  +--------+--------+    +--------+--------+  +--------+----+
> |   Core 0         | |   Core 1        |    |   Core 0        |  |   Core 1    |
> |   (L1i, L1d)     | |   (L1i, L1d)    |    |   (L1i, L1d)    |  |   (L1i, L1d)|
> +--------+--------+  +--------+--------+    +--------+--------+  +--------+----+
>          |                   |                       |                   |
> +--------+              +--------+              +--------+          +--------+
> |Thread 0|              |Thread 1|              |Thread 1|          |Thread 0|
> +--------+              +--------+              +--------+          +--------+
> |Thread 1|              |Thread 0|              |Thread 0|          |Thread 1|
> +--------+              +--------+              +--------+          +--------+
> 
> 
> The following command will represent the system relying on **ACPI PPTT tables**.
> 
> ./qemu-system-aarch64 \
>  -machine virt,smp-cache.0.cache=l1i,smp-cache.0.topology=core,smp-cache.1.cache=l1d,smp-cache.1.topology=core,smp-cache.2.cache=l2,smp-cache.2.topology=cluster,smp-cache.3.cache=l3,smp-cache.3.topology=socket \
>  -cpu max \
>  -m 2048 \
>  -smp sockets=2,clusters=1,cores=2,threads=2 \
>  -kernel ./Image.gz \
>  -append "console=ttyAMA0 root=/dev/ram rdinit=/init acpi=force" \
>  -initrd rootfs.cpio.gz \
>  -bios ./edk2-aarch64-code.fd \
>  -nographic
> 
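
The long -machine property string above is easy to mistype. A minimal sketch of a helper that assembles it from cache:topology pairs, assuming bash; the function name `build_smp_cache` is hypothetical and not part of QEMU or of this series:

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn "cache:topology" pairs into the comma-separated
# smp-cache.N.cache=...,smp-cache.N.topology=... property list.
build_smp_cache() {
    local i=0 out="" pair
    for pair in "$@"; do
        # ${out:+,} adds a separator only when something was already emitted.
        out+="${out:+,}smp-cache.$i.cache=${pair%%:*},smp-cache.$i.topology=${pair##*:}"
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}
```

It could then be used as `-machine "virt,$(build_smp_cache l1i:core l1d:core l2:cluster l3:socket)"`.
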
> The following command will represent the system relying on **the device tree**.
> 
> ./qemu-system-aarch64 \
>  -machine virt,smp-cache.0.cache=l1i,smp-cache.0.topology=core,smp-cache.1.cache=l1d,smp-cache.1.topology=core,smp-cache.2.cache=l2,smp-cache.2.topology=cluster,smp-cache.3.cache=l3,smp-cache.3.topology=socket \
>  -cpu max \
>  -m 2048 \
>  -smp sockets=2,clusters=1,cores=2,threads=2 \
>  -kernel ./Image.gz \
>  -append "console=ttyAMA0 root=/dev/ram rdinit=/init acpi=off" \
>  -initrd rootfs.cpio.gz \
>  -nographic
> 
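
With either command, one way to check what a Linux guest actually sees is to read the per-CPU cache entries in sysfs. A minimal sketch, assuming a Linux guest with the standard /sys/devices/system/cpu/cpuN/cache/indexM layout; the function name `list_caches` is hypothetical, and the sysfs root is a parameter only so the function can be pointed at a test tree:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one line per cache visible to a CPU, with its
# level, type, and the list of CPUs that share it.
list_caches() {
    local root=${1:-/sys/devices/system/cpu} cpu=${2:-cpu0} idx
    for idx in "$root/$cpu"/cache/index*; do
        # Skip when the glob did not match or the entry is incomplete.
        [ -r "$idx/level" ] || continue
        printf 'L%s %s shared_cpu_list=%s\n' \
            "$(cat "$idx/level")" "$(cat "$idx/type")" \
            "$(cat "$idx/shared_cpu_list")"
    done
}
```

Running `list_caches` with no arguments inside the guest lists the caches of cpu0; `list_caches /sys/devices/system/cpu cpu4` inspects another CPU. The shared_cpu_list column is what should change when the smp-cache topology properties change.
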
> Failure cases:
>     1) There are scenarios where caches exist in the system's registers
>     but are left unspecified by the user. In this case QEMU returns a
>     failure.
> 
>     2) SMT threads cannot share caches, which is not a very common
>     arrangement. More discussion here [1].
> 
> Currently only three levels of caches can be specified from the
> command line. However, increasing that number does not require
> significant changes. Further, this patch assumes that l2 and l3 are
> unified caches and does not allow l(2/3)(i/d). The level terminology is
> thread/core/cluster/socket right now. Hierarchy assumed in this patch:
> Socket level = Cluster level + 1 = Core level + 2 = Thread level + 3
> 
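
The hierarchy relation above can be sketched as a small lookup. Only the relative order (thread < core < cluster < socket, one step apart) comes from the cover letter; the absolute numbers, the function names, and the widening check are assumptions for illustration, not something the series is stated to implement:

```shell
#!/usr/bin/env bash
# Hypothetical numbering: thread=0, so socket = thread + 3 as in the text.
topo_level() {
    case "$1" in
        thread)  echo 0 ;;
        core)    echo 1 ;;
        cluster) echo 2 ;;
        socket)  echo 3 ;;
        *)       echo "unknown topology level: $1" >&2; return 1 ;;
    esac
}

# Assumed sanity check: succeed only if each cache, listed from L1 upward,
# is shared at the same or a wider topology level than the previous one.
check_widening() {
    local prev=-1 cur name
    for name in "$@"; do
        cur=$(topo_level "$name") || return 1
        [ "$cur" -ge "$prev" ] || return 1
        prev=$cur
    done
}
```

For the example configuration, `check_widening core core cluster socket` succeeds, while `check_widening socket core` fails.
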
> TODO:
>   1) Make the code work with arbitrary levels.
>   2) Separate data and instruction caches at L2 and L3.
>   3) Additional cache controls, e.g. the size of L3 should not
>   necessarily match the underlying system, because only some of the
>   associated host CPUs may be bound to this VM.
> 
> [1] https://lore.kernel.org/devicetree-spec/[email protected]/
> 
> Change Log:
>   v10->v11:
>    * Fix some coding style issues.
>    * Rename some variables.
> 
>   v9->v10:
>    * PPTT rev down to 2.
> 
>   v8->v9:
>    * rebase to 10
>    * Fixed a bug in device-tree generation that occurred when caches
>         above level 1 are shared at core level.
>   v7->v8:
>    * rebase: Merge tag 'pull-nbd-2024-08-26' of https://repo.or.cz/qemu/ericb into staging
>    * I mis-included a file in patch #4 and I removed it in this one.
> 
>   v6->v7:
>    * Intel stuff got pulled up, so rebase.
>    * added some discussions on device tree.
> 
>   v5->v6:
>    * Minor bug fix.
>    * rebase based on new Intel patchset.
>      - https://lore.kernel.org/qemu-devel/[email protected]/
> 
>   v4->v5:
>     * Added Reviewed-by tags.
>     * Applied some comments.
> 
>   v3->v4:
>     * Device tree added.
> 
> Depends-on: Building PPTT with root node and identical implementation flag
> Depends-on: Msg-id: [email protected]
> 
> Alireza Sanaee (6):
>   target/arm/tcg: increase cache level for cpu=max
>   arm/virt.c: add cache hierarchy to device tree
>   bios-tables-test: prepare to change ARM ACPI virt PPTT
>   hw/acpi/aml-build.c: add cache hierarchy to pptt table
>   tests/qtest/bios-table-test: testing new ARM ACPI PPTT topology
>   Update the ACPI tables according to the acpi aml_build change, also
>     empty bios-tables-test-allowed-diff.h.
> 
>  hw/acpi/aml-build.c                        | 220 ++++++++++++-
>  hw/arm/virt-acpi-build.c                   |   8 +-
>  hw/arm/virt.c                              | 359 +++++++++++++++++++++
>  hw/cpu/core.c                              |  92 ++++++
>  hw/loongarch/virt-acpi-build.c             |   2 +-
>  include/hw/acpi/aml-build.h                |   4 +-
>  include/hw/arm/virt.h                      |   5 +
>  include/hw/cpu/core.h                      |  27 ++
>  target/arm/tcg/cpu64.c                     |  13 +
>  tests/data/acpi/aarch64/virt/PPTT.topology | Bin 356 -> 540 bytes
>  tests/qtest/bios-tables-test.c             |   4 +
>  11 files changed, 724 insertions(+), 10 deletions(-)
> 
> -- 
> 2.43.0

