Hi Michal, Thanks for the review -- I think there are two separate concerns here (domain state vs. ARM-specific resume context), and it’s easy to conflate them.
On Thu, Jan 15, 2026 at 11:03 AM Orzel, Michal <[email protected]> wrote: > > > > On 12/12/2025 14:18, Mykola Kvach wrote: > > From: Mykola Kvach <[email protected]> > > > > Add support for the PSCI SYSTEM_SUSPEND function in the vPSCI interface, > > allowing guests to request suspend via the PSCI v1.0+ SYSTEM_SUSPEND call > > (both 32-bit and 64-bit variants). > > > > Implementation details: > > - Add SYSTEM_SUSPEND function IDs to PSCI definitions > > - Trap and handle SYSTEM_SUSPEND in vPSCI > > - Allow only non-hardware domains to invoke SYSTEM_SUSPEND; return > > PSCI_NOT_SUPPORTED for the hardware domain to avoid halting the system > > in hwdom_shutdown() via domain_shutdown > > - Require all secondary VCPUs of the calling domain to be offline before > > suspend, as mandated by the PSCI specification > > > > The arch_domain_resume() function is an architecture-specific hook that is > > invoked during domain resume to perform any necessary setup or restoration > > steps required by the platform. arch_domain_resume() stays int to propagate > > errno-style detail into common logging; preserving the integer keeps the > > reason visible and leaves room for future arch-specific failures or richer > > handling. > > > > The new vpsci_vcpu_up_prepare() helper is called on the resume path to set > > up > > the vCPU context (such as entry point, some system regs and context ID) > > before > > resuming a suspended guest. This keeps ARM/vPSCI-specific logic out of > > common > > code and avoids intrusive changes to the generic resume flow. > > > > Usage: > > > > For Linux-based guests, suspend can be initiated with: > > echo mem > /sys/power/state > > or via: > > systemctl suspend > > > > Resuming the guest is performed from control domain using: > > xl resume <domain> > > > > Signed-off-by: Mykola Kvach <[email protected]> > > --- > > Changes in V16: > > - Refactor error handling in domain_resume: move logging to generic code, > > use explicit return code checking. > > - Make context clearing conditional on success in arch_domain_resume. > > - The 'int' return type is retained for arch_domain_resume for consistency > > with other arch hooks and to allow for specific negative error codes. > > --- > > xen/arch/arm/domain.c | 39 +++++++++ > > xen/arch/arm/include/asm/domain.h | 2 + > > xen/arch/arm/include/asm/perfc_defn.h | 1 + > > xen/arch/arm/include/asm/psci.h | 2 + > > xen/arch/arm/include/asm/suspend.h | 27 ++++++ > > xen/arch/arm/include/asm/vpsci.h | 5 +- > > xen/arch/arm/vpsci.c | 116 +++++++++++++++++++++----- > > xen/common/domain.c | 10 +++ > > xen/include/xen/suspend.h | 25 ++++++ > > 9 files changed, 205 insertions(+), 22 deletions(-) > > create mode 100644 xen/arch/arm/include/asm/suspend.h > > create mode 100644 xen/include/xen/suspend.h > > > > diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c > > index 47973f99d9..f903e7d4f0 100644 > > --- a/xen/arch/arm/domain.c > > +++ b/xen/arch/arm/domain.c > > @@ -12,6 +12,8 @@ > > #include <xen/softirq.h> > > #include <xen/wait.h> > > > > +#include <public/sched.h> > > + > > #include <asm/arm64/sve.h> > > #include <asm/cpuerrata.h> > > #include <asm/cpufeature.h> > > @@ -24,10 +26,12 @@ > > #include <asm/platform.h> > > #include <asm/procinfo.h> > > #include <asm/regs.h> > > +#include <asm/suspend.h> > > #include <asm/firmware/sci.h> > > #include <asm/tee/tee.h> > > #include <asm/vfp.h> > > #include <asm/vgic.h> > > +#include <asm/vpsci.h> > > #include <asm/vtimer.h> > > > > #include "vpci.h" > > @@ -851,6 +855,41 @@ void arch_domain_creation_finished(struct domain *d) > > p2m_domain_creation_finished(d); > > } > > > > +int arch_domain_resume(struct domain *d) > > +{ > > + int rc; > > + struct resume_info *ctx = &d->arch.resume_ctx; > > + > > + if ( !d->is_shutting_down || d->shutdown_code != SHUTDOWN_suspend ) > How does this check and returning -EINVAL correspond to... The if ( !d->is_shutting_down || d->shutdown_code != SHUTDOWN_suspend ) guard is meant to validate the domain state for the DOMCTL resumedomain entry point (i.e. reject an xl resume on a domain that isn’t in the "suspend shutdown" state). Suspend requests issued via SCHEDOP_shutdown / SCHEDOP_remote_shutdown with reason SUSPEND still put the domain into is_shutting_down=1 and shutdown_code=SHUTDOWN_suspend, so they do pass this state check. What the comment below was trying to say is different: those hypercall paths don’t go through vPSCI SYSTEM_SUSPEND, so they don’t populate the Arm-specific resume context (notably wake_cpu). In that case ctx->wake_cpu remains NULL and the Arm arch_domain_resume() returns early. > > > + { > > + dprintk(XENLOG_WARNING, > > + "%pd: Invalid domain state for resume: > > is_shutting_down=%u, shutdown_code=%u\n", > > + d, d->is_shutting_down, d->shutdown_code); > > + return -EINVAL; > > + } > > + > > + /* > > + * It is still possible to call domain_shutdown() with a suspend reason > > + * via some hypercalls, such as SCHEDOP_shutdown or > > SCHEDOP_remote_shutdown. > > + * In these cases, the resume context will be empty. > this comment? This patch assumes that we can now resume successfully (i.e. > this > function returns 0 and common domain_resume can continue) only if the shutdown > was with SCHEDOP_shutdown. Anything else will infinitely keep the vCPUS > paused. Separately (and this is why I’m hesitant to make domain_resume() "suspend-only"): domain_resume() is also used by the soft-reset flow. Today soft-reset is effectively x86-only (gated by HAS_SOFT_RESET), but the core plumbing is in common code and is intentionally generic -- the soft-reset calling chain ends up using domain_resume() as a generic helper. If domain_resume() itself starts rejecting anything other than SHUTDOWN_suspend, it would also be a future trap if/when someone enables HAS_SOFT_RESET on Arm. > > Other than that, the patch looks good. > > ~Michal > Best regards, Mykola
