> -----Original Message----- > From: Shameerali Kolothum Thodi > Sent: 26 August 2022 13:15 > To: 'Laszlo Ersek' <[email protected]>; [email protected]; > [email protected] > Cc: [email protected]; [email protected]; Linuxarm > <[email protected]>; chenxiang (M) <[email protected]>; Ard > Biesheuvel (kernel.org address) <[email protected]>; Gerd Hoffmann > <[email protected]> > Subject: RE: [PATCH] fw_cfg: Don't set callback_opaque NULL in > fw_cfg_modify_bytes_read() > > > > > -----Original Message----- > > From: Laszlo Ersek [mailto:[email protected]] > > Sent: 26 August 2022 13:07 > > To: Shameerali Kolothum Thodi > <[email protected]>; > > [email protected]; [email protected] > > Cc: [email protected]; [email protected]; Linuxarm > > <[email protected]>; chenxiang (M) <[email protected]>; > Ard > > Biesheuvel (kernel.org address) <[email protected]>; Gerd Hoffmann > > <[email protected]> > > Subject: Re: [PATCH] fw_cfg: Don't set callback_opaque NULL in > > fw_cfg_modify_bytes_read() > > > > +Ard +Gerd, one pointer at the bottom > > > > On 08/26/22 13:59, Laszlo Ersek wrote: > > > On 08/25/22 18:18, Shameer Kolothum wrote: > > >> Hi > > >> > > >> On arm/virt platform, Chen Xiang reported a Guest crash while > > >> attempting the below steps, > > >> > > >> 1. Launch the Guest with nvdimm=on > > >> 2. Hot-add a NVDIMM dev > > >> 3. Reboot > > >> 4. Guest boots fine. > > >> 5. Reboot again. > > >> 6. Guest boot fails. > > >> > > >> QEMU_EFI reports the below error: > > >> ProcessCmdAddPointer: invalid pointer value in "etc/acpi/tables" > > >> OnRootBridgesConnected: InstallAcpiTables: Protocol Error > > >> > > >> Debugging shows that on first reboot(after hot-adding NVDIMM), > > >> Qemu updates the etc/table-loader len, > > >> > > >> qemu_ram_resize() > > >> fw_cfg_modify_file() > > >> fw_cfg_modify_bytes_read() > > >> > > >> And in fw_cfg_modify_bytes_read() we set the "callback_opaque" for > > >> the "key" entry to NULL. Because of this, on the second reboot, > > >> virt_acpi_build_update() is called with a NULL "build_state" and > > >> returns without updating the ACPI tables. This seems to be > > >> upsetting the firmware. > > >> > > >> To fix this, don't change the callback_opaque in > > fw_cfg_modify_bytes_read(). > > >> > > >> Reported-by: chenxiang <[email protected]> > > >> Signed-off-by: Shameer Kolothum > > <[email protected]> > > >> --- > > >> I am still not very convinced this is the root cause of the issue. > > >> Though it looks like setting callback_opaque to NULL while updating > > >> the file size is wrong, what puzzles me is that on the second reboot > > >> we don't have any ACPI table size changes and ideally firmware should > > >> see the updated tables from the first reboot itself. > > >> > > >> Please take a look and let me know. > > >> > > >> Thanks, > > >> Shameer > > >> > > >> --- > > >> hw/nvram/fw_cfg.c | 1 - > > >> 1 file changed, 1 deletion(-) > > >> > > >> diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c > > >> index d605f3f45a..dfe8404c01 100644 > > >> --- a/hw/nvram/fw_cfg.c > > >> +++ b/hw/nvram/fw_cfg.c > > >> @@ -728,7 +728,6 @@ static void > > *fw_cfg_modify_bytes_read(FWCfgState *s, uint16_t key, > > >> ptr = s->entries[arch][key].data; > > >> s->entries[arch][key].data = data; > > >> s->entries[arch][key].len = len; > > >> - s->entries[arch][key].callback_opaque = NULL; > > >> s->entries[arch][key].allow_write = false; > > >> > > >> return ptr; > > >> > > > > > > I vaguely recall seeing the same issue report years ago (also in > > > relation to hot-adding NVDIMM). However, I have no capacity to > > > participate in the discussion. Making this remark just for clarity. > > > > The earlier report I've had in mind was from Shameer as well: > > > > > http://mid.mail-archive.com/5FC3163CFD30C246ABAA99954A238FA83F3F > > [email protected] > > Right. That was a slightly different issue though. It was basically ACPI table > size not > getting updated on the first reboot of Guest after we hot-add NVDIMM dev. > The error > from firmware was different in that case, > > ProcessCmdAddChecksum: invalid checksum range in "etc/acpi/tables" > OnRootBridgesConnected: InstallAcpiTables: Protocol Error > > And it was fixed with this series here, > https://patchwork.kernel.org/project/qemu-devel/cover/20200403101827.3 > [email protected]/ > > The current issue only happens on the second reboot of the Guest as > described in > the steps above. >
[+Christian] I just found that a similar issue was reported here sometime back on Q35/Windows setup, https://patchew.org/QEMU/[email protected]/ But there are no further discussions on that thread. Thanks, Shameer
