On Fri, 28 Feb 2025 17:36:08 +0800 Jonathan Cameron <jonathan.came...@huawei.com> wrote:
> On Thu, 27 Feb 2025 17:00:56 +0100
> Mauro Carvalho Chehab <mchehab+hua...@kernel.org> wrote:
>
> > While the HEST layout didn't change, there are some internal
> > changes related to how offsets are calculated and how memory error
> > events are triggered.
> >
> > Update specs to reflect such changes.
> >
> > Signed-off-by: Mauro Carvalho Chehab <mchehab+hua...@kernel.org>
> One minor editorial suggestion. With that or similar tidy up,
> Reviewed-by: Jonathan Cameron <jonathan.came...@huawei.com>

with nit below fixed,
Reviewed-by: Igor Mammedov <imamm...@redhat.com>

>
> > ---
> >  docs/specs/acpi_hest_ghes.rst | 28 +++++++++++++++++-----------
> >  1 file changed, 17 insertions(+), 11 deletions(-)
> >
> > diff --git a/docs/specs/acpi_hest_ghes.rst b/docs/specs/acpi_hest_ghes.rst
> > index c3e9f8d9a702..4311a9536b21 100644
> > --- a/docs/specs/acpi_hest_ghes.rst
> > +++ b/docs/specs/acpi_hest_ghes.rst
> > @@ -89,12 +89,21 @@ Design Details
> >      addresses in the "error_block_address" fields with a pointer to the
> >      respective "Error Status Data Block" in the "etc/hardware_errors" blob.
> >
> > -(8) QEMU defines a third and write-only fw_cfg blob which is called
> > -    "etc/hardware_errors_addr". Through that blob, the firmware can send back
> > -    the guest-side allocation addresses to QEMU. The "etc/hardware_errors_addr"
> > -    blob contains a 8-byte entry. QEMU generates a single WRITE_POINTER command
> > -    for the firmware. The firmware will write back the start address of
> > -    "etc/hardware_errors" blob to the fw_cfg file "etc/hardware_errors_addr".
> > +(8) QEMU defines a third and write-only fw_cfg blob to store the location
> > +    where the error block offsets, read ack registers and CPER records are
> > +    stored.
> > +
> > +    Up to QEMU 9.2, the location was at "etc/hardware_errors_addr", and
> > +    contains a GPA for the beginning of "etc/hardware_errors".
> > +
> > +    Newer versions place the location at "etc/acpi_table_hest_addr",
> > +    pointing to the GPA of the HEST table.
> > +
> > +    Through that such GPA values, the firmware can send back the guest-side
>
> This confuses me.
>
> Via those GPA values...? (maybe?)

it's not GPA here, it should be fwcfg. Maybe something like this
"Using above mentioned 'fwcfg' files,"

>
> > +    allocation addresses to QEMU. They contain a 8-byte entry. QEMU generates
> > +    a single WRITE_POINTER command for the firmware. The firmware will write
> > +    back the start address of either "etc/hardware_errors" or HEST table at
> > +    the corresponding fw_cfg file.
> >
> >  (9) When QEMU gets a SIGBUS from the kernel, QEMU writes CPER into corresponding
> >      "Error Status Data Block", guest memory, and then injects platform specific
> > @@ -105,8 +114,5 @@ Design Details
> >       kernel, on receiving notification, guest APEI driver could read the CPER error
> >       and take appropriate action.
> >
> > -(11) kvm_arch_on_sigbus_vcpu() uses source_id as index in "etc/hardware_errors" to
> > -     find out "Error Status Data Block" entry corresponding to error source. So supported
> > -     source_id values should be assigned here and not be changed afterwards to make sure
> > -     that guest will write error into expected "Error Status Data Block" even if guest was
> > -     migrated to a newer QEMU.
> > +(11) kvm_arch_on_sigbus_vcpu() report RAS errors via a SEA notifications,
> > +     when a SIGBUS event is triggered.
>