Hi Laszlo/Igor,

I spend some time to debug this further as I was rebasing the nvdimm
hot-add support patches on top of the ongoing pc-dimm hot add ones.

Just to refresh the memory:

https://patchwork.kernel.org/cover/10783589/

" It is observed that hot adding nvdimm will results in guest reboot
failure. EDK2 fails to build the ACPI tables on reboot. Please find
below EDK2 log on Guest reboot after nvdimm hot-add,

ProcessCmdAddChecksum: invalid checksum range in "etc/acpi/tables"
OnRootBridgesConnected: InstallAcpiTables: Protocol Error
"

Please find below,

> -----Original Message-----
> From: Laszlo Ersek [mailto:ler...@redhat.com]
> Sent: 05 March 2019 12:15
> To: Igor Mammedov <imamm...@redhat.com>
> Cc: Shameerali Kolothum Thodi <shameerali.kolothum.th...@huawei.com>;
> Auger Eric <eric.au...@redhat.com>; shannon.zha...@gmail.com;
> peter.mayd...@linaro.org; qemu-devel@nongnu.org; qemu-...@nongnu.org;
> xuwei (O) <xuw...@huawei.com>; Linuxarm <linux...@huawei.com>; Ard
> Biesheuvel <ard.biesheu...@linaro.org>; Leif Lindholm (Linaro address)
> <leif.lindh...@linaro.org>
> Subject: Re: [RFC PATCH 0/4] ARM virt: ACPI memory hotplug support
> 
> On 03/01/19 18:39, Igor Mammedov wrote:
> > On Fri, 1 Mar 2019 14:49:45 +0100
> > Laszlo Ersek <ler...@redhat.com> wrote:
> >
> >> On 02/28/19 15:02, Shameerali Kolothum Thodi wrote:
> >>
> >>> Ah..I missed the fact that, firmware indeed sees an update in the blob len
> here
> >>> (rounded or not) after reboot. So don’t think x86 has the same issue and
> padding
> >>> is not the right solution as Igor explained in his reply.
> >>>
> >>> I will try to debug this further. Any pointers welcome.
> >>
> >> How about this.
> >>
> >> (1) The firmware looks up the fw_cfg file called "etc/table-loader" in
> >> the fw_cfg file directory (identified by constant selector key 0x0019,
> >> FW_CFG_FILE_DIR).
> >>
> >> (2) The directory entry, once found, tells the firmware two things
> >> simultaneously. The selector key, and the size of the blob.
> >>
> >> (3) The firmware selects the selector key from step (2).
> >>
> >> (4) QEMU regenerates the ACPI payload (as a select callback).
> >>
> >> (5) The firmware reads the number of bytes from the fw_cfg blob that it
> >> learned in step (2).
> >>
> >> Here's the problem. As long as QEMU used to perform step (4) only for
> >> the purpose of refreshing PCI resources in the ACPI payload, step (4)
> >> wouldn't *resize* the blob.
> >>
> >> However, if step (4) enlarges the blob, then the byte count that step
> >> (5) uses -- from step (2) -- for reading, is obsolete.
> 
> > I've thought that was a problem with IO based fw_cfg, as reading
> size/content
> > were separates steps and that it was solved by DMA based fw_cfg file read.
> 
> The DMA backend is not relevant for this question, for two reasons:
> 
> (a) The question whether the fw_cfg transfer takes places with port IO
> vs. DMA is hidden from the fw_cfg client code; that code goes through an
> abstract library API.
> 
> (b) While the DMA method indeed lets the firmware specify the details of
> the transfer with one action, the issue is with the number of bytes that
> the firmware requests (that is, not with *how* the firmware requests the
> transfer). The firmware has to know the size of the transfer before it
> can initiate the transfer (regardless of port IO vs. DMA).
> 
> 
> My question is: assume the firmware item in question is selected, and
> the QEMU-side select callback runs (regenerating the ACPI payload). Does
> this action update the blob size in the fw_cfg file directory as well?

I think it doesn’t update the blob size on select callback which is the root
cause of this issue. And the reason looks like, qemu_ram_resize() function
returns without invoking the callback to update the blob size.
 
On boot up, Qemu builds the table and exposes it to guest,
      virt_acpi_setup()
        acpi_add_rom_blob()
          rom_add_blob()
            rom_set_mr()  --> mr is allocated here and ram_block used_length = 
HOST_PAGE_ALIGN(blob size);
            fw_cfg_add_file_callback()
              fw_cfg_add_bytes_callback() --> This uses the blob size passed 
into it.

On select callback path,

virt_acpi_build_update()
   acpi_ram_update()
    memory_region_ram_resize()
      qemu_ram_resize() -->. Here the newsize gets aligned to HOST_PAGE and 
callback is only called used_length != newsize.

https://github.com/qemu/qemu/blob/master/exec.c#L2180

Debug logs:
Initial boot:
##QEMU_DEBUG## rom_add_blob: file etc/acpi/tables size 0x64f7
##QEMU_DEBUG## fw_cfg_add_bytes_callback: key 0x21 len 0x64f7
........
........
###UEFI#### InstallQemuFwCfgTables: "etc/table-loader" has FwCfgItem 0x27 size 
0xD00
##QEMU_DEBUG## virt_acpi_build_update:
##QEMU_DEBUG## acpi_ram_update: size 0x64f7
##QEMU_DEBUG## qemu_ram_resize: idstr /rom@etc/acpi/tables used_length  0x7000 
newsize 0x7000 --> No callback.
.....
######UEFI###### ProcessCmdAllocate: QemuFwCfgFindFile("etc/acpi/tables"): size 
0x64F7 --> UEFI get the actual size, which is fine for now.

Hot-add nvdimms and reboot.

root@ubuntu:/# reboot
.......
........
###UEFI#### InstallQemuFwCfgTables: "etc/table-loader" has FwCfgItem 0x27 size 
0xD00
##QEMU_DEBUG## virt_acpi_build_update:
##QEMU_DEBUG## acpi_ram_update: size 0x6667 --> Size changed.
##QEMU_DEBUG## qemu_ram_resize: idstr /rom@etc/acpi/tables used_length  0x7000 
newsize 0x7000 --> newsize is still aligned to 0x7000 and no callback to update.
......
######UEFI###### ProcessCmdAllocate: QemuFwCfgFindFile("etc/acpi/tables"): size 
0x64F7 -->UEFI still sees the old value and fails.

This can be fixed by,

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c index 
f3bd45675b..79da3fd35d 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -854,6 +854,9 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables 
*tables)
         build_rsdp(tables->rsdp, tables->linker, &rsdp_data);
     }

+    g_array_set_size(tables_blob,
+                    TARGET_PAGE_ALIGN(acpi_data_len(tables_blob)));
+
     /* Cleanup memory that's no longer used. */
     g_array_free(table_offsets, true);
 }

But I am not sure this is the best to way fix this issue (Or I am missing 
something here).

Please let me know.

Thanks,
Shameer

 
> If it does, then I can work around the problem in the firmware. I can
> add a re-lookup to the code after the item selection, in order to get
> the fresh blob size from the fw_cfg file directory. Then we can use that
> size for the actual transfer.
> 
> This won't help old firmware on new QEMU, but at least new firmware on
> old QEMU will not be hurt (the re-fetching of the fw_cfg file directory
> will come with a small performance penalty, but functionally it will be
> a no-op).
> 
> Thanks
> Laszlo

Reply via email to