+Ard, +Drew

On 11/03/20 13:39, Igor Mammedov wrote:
> On Fri, 30 Oct 2020 10:50:01 +0800
> Ying Fang <fangyi...@huawei.com> wrote:
>
>> Hi,
>>
>> I have a question on UEFI/ACPI tables setup and probing on arm64 platform.
>
> CCing Laszlo,
> who might know how it's implemented.
>
>> Currently on arm64 platform guest can be booted with both fdt and ACPI
>> supported. If ACPI is enabled, [1] says the only defined method for
>> passing ACPI tables to the kernel is via the UEFI system configuration
>> table. So AFAIK, ACPI Should be dependent on UEFI.

That's correct. The ACPI entry point (RSD PTR) on AARCH64 is defined in
terms of UEFI.

>>
>> What's more [2] says UEFI kernel support on the ARM architectures
>> is only available through a *stub*. The stub populates the FDT /chosen
>> node with some UEFI parameters describing the UEFI location info.

Yes.

>>
>> So i dump /sys/firmware/fdt from the guest, it does have something like:
>>
>> /dts-v1/;
>>
>> / {
>>         #size-cells = <0x02>;
>>         #address-cells = <0x02>;
>>
>>         chosen {
>>                 linux,uefi-mmap-desc-ver = <0x01>;
>>                 linux,uefi-mmap-desc-size = <0x30>;
>>                 linux,uefi-mmap-size = <0x810>;
>>                 linux,uefi-mmap-start = <0x04 0x3c0ce018>;
>>                 linux,uefi-system-table = <0x04 0x3f8b0018>;
>>                 bootargs =
>> "BOOT_IMAGE=/vmlinuz-4.19.90-2003.4.0.0036.oe1.aarch64
>> root=/dev/mapper/openeuler-root ro rd.lvm.lv=openeuler/root
>> rd.lvm.lv=openeuler/swap video=VGA-1:640x480-32@60me
>> smmu.bypassdev=0x1000:0x17 smmu.bypassdev=0x1000:0x15
>> crashkernel=1024M,high video=efifb:off video=VGA-1:640x480-32@60me";
>>                 linux,initrd-end = <0x04 0x3a85a5da>;
>>                 linux,initrd-start = <0x04 0x392f2000>;
>>         };
>> };
>>
>> But the question is that I did not see any code adding the uefi
>> in fdt chosen node in *arm_load_dtb* or anywhere else.

That's because the "UEFI stub" is a part of the guest kernel. It wraps
the guest kernel image into a UEFI application binary. For a while, the
guest kernel runs as a UEFI application, stashing some UEFI artifacts in
*a* device tree, and then (after some other heavy lifting) jumping into
the kernel proper.

>> Qemu only maps the OVMF binary file into a pflash device.
>> So I'm really confused on how UEFI information is provided to
>> guest by qemu. Does anybody know of the details about it ?

It's complex, unfortunately.

(1) QEMU always generates a DTB for the guest firmware. This DTB is
placed at the base of the guest RAM.

See the arm_load_dtb() call in virt_machine_done() [hw/arm/virt.c] in
QEMU. I think.

(2) QEMU generates ACPI content, and exposes it via fw_cfg.

See the virt_acpi_setup() call in the same virt_machine_done() function
[hw/arm/virt.c] in QEMU.

(3) The fw_cfg device itself is apparent to the guest firmware via the
DTB from point (1). See the following steps in edk2:

(3a) "ArmVirtPkg/Library/PlatformPeiLib/PlatformPeiLib.c"

This saves the initial DTB (from the base of guest RAM, where it could
be overwritten by whatever) to a dynamically allocated area. This
"stashing" occurs early.

(3b) "ArmVirtPkg/FdtClientDxe/FdtClientDxe.c"

This driver exposes the (dynamically reallocated / copied) DTB via a
custom UEFI protocol to the rest of the firmware. (This happens much
later.) This protocol / driver can be considered the "owner" of the
stashed DTB from (3a).

(3c) "ArmVirtPkg/Library/QemuFwCfgLib/QemuFwCfgLib.c"

This is the fw_cfg device access library, discovering the fw_cfg
registers via the above UEFI protocol. The library is linked into each
firmware module that needs fw_cfg access.

(4) The firmware interprets QEMU's DTB for actual content (parsing
values, configuring hardware, accessing devices).

This occurs in a whole bunch of locations, mostly via consuming the
custom protocol from (3b). Some info that's needed very early is parsed
out of the DTB right in step (3a).

(5) The guest firmware has a dedicated driver that checks whether QEMU
was configured with ACPI enabled or disabled, and publishes that choice
to the rest of the firmware. This is necessary because some firmware
actions / infrastructure parts cannot (must not) proceed until this
decision has been interpreted.

See in edk2:

- ArmVirtPkg/PlatformHasAcpiDtDxe

This driver keys off of the presence of the "etc/table-loader" fw_cfg
file, coming from step (2), using the fw_cfg access library from step
(3c).

If ACPI was enabled on the QEMU cmdline, then the rest of the firmware
is "level-triggered" to proceed with the ACPI infrastructure.
Otherwise, the rest of the firmware is "level-triggered" that DT was
chosen for the OS. ("Level-triggering" means the installation of custom
NULL protocols, which permits drivers dependent on DT vs ACPI to be
dispatched.)

(6) If DT was selected (ACPI was disabled), per step (5), then
FdtClientDxe (introduced under step (3b)) has another job: it forwards
the original stashed DTB (see (3a)) to the guest OS.

This "DTB forwarding" occurs through a particular UEFI config table; the
GUID is B1B621D5-F19C-41A5-830B-D9152C69AAE0 -- known as
DEVICE_TREE_GUID in the kernel ("include/linux/efi.h").

See the OnPlatformHasDeviceTree() function in
"ArmVirtPkg/FdtClientDxe/FdtClientDxe.c", in edk2.

(7) If ACPI was selected instead, according to step (5), then through
the fw_cfg access described in (3c), the guest firmware "blindly"
processes the ACPI payload from QEMU (from step (2)).

This "blind processing" means that the guest firmware runs the "ACPI
linker/loader script" (the "etc/table-loader" fw_cfg file), installing a
number of ACPI tables for the guest OS. The guest firmware does not
interpret the ACPI tables.

"Installing ACPI tables" ultimately means exposing stuff under the
particular UEFI config table that stands for the RSD PTR -- the GUID is
8868E871-E4F1-11D3-BC22-0080C73C8881. (Known as ACPI_20_TABLE_GUID in
Linux, "include/linux/efi.h".)

See the following in edk2:

- OvmfPkg/AcpiPlatformDxe/QemuFwCfgAcpiPlatformDxe.inf

In this case, the guest firmware does not forward QEMU's original DTB to
the guest OS.

(8) Ultimately, from the guest OS's point of view, a UEFI config table
for *either* the RSD PTR (ACPI_20_TABLE_GUID) *or* QEMU's DTB
(DEVICE_TREE_GUID) is going to exist.

(9) (Ard, please correct the below if necessary; thanks.)

The UEFI stub of the guest kernel (which is a UEFI application) uses a
device tree as its main communication channel to the (later-started)
kernel entry point, AIUI.

The UEFI stub basically inverts the importance of the UEFI system table
versus the device tree -- the UEFI stub *converts* the UEFI system table
(the multitude of UEFI config tables) into a device tree. This is my
understanding anyway.

(9a) If ACPI was disabled on the QEMU command line, then the guest
kernel *adopts* the device tree that was forwarded to it in (6), via the
UEFI config table marked with DEVICE_TREE_GUID.

(9b) If ACPI was enabled on the QEMU command line, then the UEFI stub
creates a brand new (empty) device tree (AIUI).

Either way, the UEFI system table is linked *under* the -- adopted or
new -- device tree, through the "chosen" node. And so, if ACPI was
enabled, the ACPI RSD PTR (coming from step (7)) becomes visible to the
kernel proper as well, through the UEFI config table with
ACPI_20_TABLE_GUID.

I believe this is implemented under "drivers/firmware/efi/libstub" in
the kernel tree.

Thanks,
Laszlo
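
For illustration only, the decision flow in steps (5) through (9) can be
modeled as a toy Python sketch. This is not edk2 or kernel code: the
function names (firmware_config_tables, stub_device_tree) and the
dict-based stand-ins for fw_cfg, the config tables and the device tree
are all made up here. The two GUIDs and the "etc/table-loader" file name
come from the text above, and the system table address is taken from the
quoted dump, assuming the usual DT encoding where a two-cell property is
one 64-bit value with the high cell first.

```python
# Toy model of steps (5)-(9); every function and dict layout here is
# hypothetical and only mirrors the prose, not edk2 or Linux code.

ACPI_20_TABLE_GUID = "8868E871-E4F1-11D3-BC22-0080C73C8881"
DEVICE_TREE_GUID = "B1B621D5-F19C-41A5-830B-D9152C69AAE0"


def firmware_config_tables(fw_cfg_files, qemu_dtb):
    """Steps (5)-(8): expose either the RSD PTR or QEMU's DTB, never both."""
    if "etc/table-loader" in fw_cfg_files:
        # ACPI enabled: tables get installed via the linker/loader script.
        return {ACPI_20_TABLE_GUID: "RSD PTR"}
    # ACPI disabled: forward the stashed DTB instead.
    return {DEVICE_TREE_GUID: qemu_dtb}


def stub_device_tree(config_tables, system_table_addr):
    """Step (9): adopt the forwarded DTB or start empty, then fill /chosen."""
    forwarded = config_tables.get(DEVICE_TREE_GUID, {})
    fdt = {node: dict(props) for node, props in forwarded.items()}
    fdt.setdefault("chosen", {})["linux,uefi-system-table"] = system_table_addr
    return fdt


# Address from the quoted dump: cells <0x04 0x3f8b0018>, high cell first.
SYSTEM_TABLE = (0x04 << 32) | 0x3F8B0018

acpi_tables = firmware_config_tables({"etc/table-loader"}, {"cpus": {}})
dt_tables = firmware_config_tables(set(), {"cpus": {}})

acpi_fdt = stub_device_tree(acpi_tables, SYSTEM_TABLE)
dt_fdt = stub_device_tree(dt_tables, SYSTEM_TABLE)

print(sorted(acpi_fdt))  # only "chosen": the stub started from an empty tree
print(sorted(dt_fdt))    # "chosen" plus the nodes from QEMU's DTB
```

In both cases the resulting tree carries the system table address under
"chosen", which matches the dump in the original question: with ACPI
enabled the guest sees little more than the "chosen" node.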