On Thu, Oct 01, 2020 at 01:43:52PM +0100, Andrew Cooper wrote:
> On 01/10/2020 13:31, Marek Marczykowski-Górecki wrote:
> > On Thu, Oct 01, 2020 at 01:59:32PM +0200, Jan Beulich wrote:
> >> On 01.10.2020 03:12, Marek Marczykowski-Górecki wrote:
> >>> After patching the previous issue ("x86/S3: Fix Shadow Stack resume
> >>> path") I still encounter issues resume from S3.
> >>> Since I had it working on Xen 4.13 on this particular hardware (Thinkpad
> >>> P52), I bisected it and got this:
> >>>
> >>> commit 4304ff420e51b973ec9eb9dafd64a917dd9c0fb1
> >>> Author: Andrew Cooper <[email protected]>
> >>> Date: Wed Dec 11 20:59:19 2019 +0000
> >>>
> >>> x86/S3: Drop {save,restore}_rest_processor_state() completely
> >>>
> >>> There is no need to save/restore FS/GS/XCR0 state. It will be handled
> >>> suitably on the context switch away from the idle.
> >>>
> >>> The CR4 restoration in restore_rest_processor_state() was actually
> >>> fighting
> >>> later code in enter_state() which tried to keep CR4.MCE clear until
> >>> everything
> >>> was set up. Delete the intermediate restoration, and defer final
> >>> restoration
> >>> until after MCE is reconfigured.
> >>>
> >>> Restoring PAT can be done earlier, and ideally before paging is
> >>> enabled. By
> >>> moving it into the trampoline during the setup for 64bit, the call
> >>> can be
> >>> dropped from cpu_init(). The EFI path boot path doesn't disable
> >>> paging, so
> >>> make the adjustment when switching onto Xen's pagetables.
> >>>
> >>> The only remaing piece of restoration is load_system_tables(), so
> >>> suspend.c
> >>> can be deleted in its entirety.
> >>>
> >>> Signed-off-by: Andrew Cooper <[email protected]>
> >>> Reviewed-by: Jan Beulich <[email protected]>
> >>>
> >>> Parent of this commit suspends and resumes just fine. With this commit
> >>> applied, it (I think) it panics, at least I get reboot after 5s. Sadly, I
> >>> don't have serial console there.
> >>>
> >>> I tried also master and stable-4.14 with this commit reverted (and also
> >>> the other fix applied), but it doesn't work. In this case I get a hang on
> >>> resume (power led still flashing, but fan woke up). There are probably
> >>> some other dependencies.
> >> Since bisection may also point you at some intermediate breakage, which
> >> these last results of yours seem to support, could you check whether
> >> 55f8c389d434 put immediately on top of the above commit makes a difference,
> >> and if so resume bisecting from there?
> > Nope, 4304ff420e51b973ec9eb9dafd64a917dd9c0fb1 with 55f8c389d434 on top
> > it still hangs on resume.
>
> Ok. I'll see about breaking the change apart so we can bisect which
> specific bit of code movement broke things.I've done another bisect on the commit broken up in separate changes (https://xenbits.xen.org/gitweb/?p=people/andrewcoop/xen.git;a=shortlog;h=refs/heads/dbg-s3) and the bad part seems to be this: From dbdb32f8c265295d6af7cd4cd0aa12b6d04a0430 Mon Sep 17 00:00:00 2001 From: Andrew Cooper <[email protected]> Date: Fri, 2 Oct 2020 15:40:22 +0100 Subject: [PATCH 1/1] CR4 --- xen/arch/x86/acpi/power.c | 9 ++++----- xen/arch/x86/acpi/suspend.c | 3 --- 2 files changed, 4 insertions(+), 8 deletions(-) diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c index 6dfd4c7891..0cda362045 100644 --- a/xen/arch/x86/acpi/power.c +++ b/xen/arch/x86/acpi/power.c @@ -195,7 +195,6 @@ static int enter_state(u32 state) unsigned long flags; int error; struct cpu_info *ci; - unsigned long cr4; if ( (state <= ACPI_STATE_S0) || (state > ACPI_S_STATES_MAX) ) return -EINVAL; @@ -270,15 +269,15 @@ static int enter_state(u32 state) system_state = SYS_STATE_resume; - /* Restore CR4 and EFER from cached values. */ - cr4 = read_cr4(); - write_cr4(cr4 & ~X86_CR4_MCE); + /* Restore EFER from cached value. */ write_efer(read_efer()); device_power_up(SAVED_ALL); mcheck_init(&boot_cpu_data, false); - write_cr4(cr4); + + /* Restore CR4 from cached value, now MCE is set up. */ + write_cr4(read_cr4()); printk(XENLOG_INFO "Finishing wakeup from ACPI S%d state.\n", state); diff --git a/xen/arch/x86/acpi/suspend.c b/xen/arch/x86/acpi/suspend.c index 060a9313b6..ca987d9019 100644 --- a/xen/arch/x86/acpi/suspend.c +++ b/xen/arch/x86/acpi/suspend.c @@ -23,7 +23,4 @@ void save_rest_processor_state(void) void restore_rest_processor_state(void) { load_system_tables(); - - /* Restore full CR4 (inc MCE) now that the IDT is in place. */ - write_cr4(mmu_cr4_features); } -- 2.20.1 -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab A: Because it messes up the order in which people normally read text. Q: Why is top-posting such a bad thing?
signature.asc
Description: PGP signature
