On Wed, Oct 29, 2025 at 08:06:36PM +0000, Teddy Astie wrote:
> Le 29/10/2025 à 19:55, Roger Pau Monné a écrit :
> > On Wed, Oct 29, 2025 at 06:26:14PM +0000, Teddy Astie wrote:
> >> Introduce a new flag to force the x2APIC enabled and preventing a
> >> guest from switching back LAPIC to xAPIC mode.
> >>
> >> The semantics of this mode are based IA32_XAPIC_DISABLE_STATUS
> >> architectural MSR of Intel specification.
> >>
> >> Signed-off-by: Teddy Astie <[email protected]>
> > 
> > Thanks, some initial comments below.
> > 
> >> ---
> >> This feature can be useful for various reasons, starting with SEV as
> >> it is complicated (especially with SEV-ES) to handle MMIO, and legacy
> >> xAPIC is one thing that needs MMIO intercepts (and Linux uses it during
> >> boot unless x2APIC is initially enabled, even if it switches to
> >> x2apic afterward). It could also be interesting to reduce the attack
> >> surface of the hypervisor (by only exposing x2apic to the guest).
> >>
> >> As it can allow to have MMIO-less guest (using PVH), perhaps it can
> >> be enough for avoiding the problematic cases of virtualized INVLPGB
> >> (when we have it).
> >>
> >> In my testing, Linux, FreeBSD and PV-shim works fine with it; OVMF
> >> freezes for some reason, NetBSD doesn't support it (no x2apic support
> >> as Xen guest). HVM BIOS gets stuck at SeaBIOS as it expects booting
> >> with xAPIC.
> >>
> >> On Intel platforms, it would be better to expose the
> >> IA32_XAPIC_DISABLE_STATUS architectural MSR to advertise this to
> >> guest, but it's non-trivial as it needs to be properly exposed
> >> through IA32_ARCH_CAPABILITIES which is currently passed-through.
> >>
> >>   docs/man/xl.cfg.5.pod.in              |  7 +++++++
> >>   tools/libs/light/libxl_types.idl      |  1 +
> >>   tools/libs/light/libxl_x86.c          |  4 ++++
> >>   tools/xl/xl_parse.c                   |  1 +
> >>   xen/arch/x86/domain.c                 |  2 +-
> >>   xen/arch/x86/hvm/hvm.c                |  2 ++
> >>   xen/arch/x86/hvm/vlapic.c             | 23 ++++++++++++++++++++++-
> >>   xen/arch/x86/include/asm/domain.h     |  2 ++
> >>   xen/arch/x86/include/asm/hvm/domain.h |  3 +++
> >>   xen/include/public/arch-x86/xen.h     | 12 +++++++++++-
> >>   10 files changed, 54 insertions(+), 3 deletions(-)
> > 
> > Seeing there are no changes to the ACPI tables exposed to the guest,
> > do we want to start exposing X2APIC MADT entries instead of the plain
> > APIC entries?
> > 
> > The ACPI spec seems to suggest that you can expose APIC entries for
> > APICs below 255, for compatibility reasons.  But given that we would
> > force the guest to use X2APIC mode it would certainly need to
> > understand how to process X2APIC MADT entries anyway.
> > 
> > Not sure it makes much of a difference, but wondering whether OSes
> > expect X2APIC MADT entries if the mode is locked to X2APIC.
> > 
> 
> In all OS I checked, they see x2APIC MADT entries as a different format 
> for LAPIC entries and don't really link it with whether x2APIC is used 
> or not.
> 
> But I think it's safe to assume all OS that supports x2APIC has support 
> for x2APIC MADT entries, which could make ACPI table generation simpler 
> (especially for dealing with LAPIC IDs over 255)
> 
> >>
> >> diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
> >> index ad1553c5e9..01b41d93c0 100644
> >> --- a/docs/man/xl.cfg.5.pod.in
> >> +++ b/docs/man/xl.cfg.5.pod.in
> >> @@ -3198,6 +3198,13 @@ option.
> >>   
> >>   If using this option is necessary to fix an issue, please report a bug.
> >>   
> >> +=item B<force_x2apic=BOOLEAN>
> >> +
> >> +Force the LAPIC in x2APIC mode and prevent the guest from disabling
> >> +it or switching to xAPIC mode.
> >> +
> >> +This option is disabled by default.
> >> +
> >>   =back
> >>   
> >>   =head1 SEE ALSO
> >> diff --git a/tools/libs/light/libxl_types.idl 
> >> b/tools/libs/light/libxl_types.idl
> >> index d64a573ff3..b95278007e 100644
> >> --- a/tools/libs/light/libxl_types.idl
> >> +++ b/tools/libs/light/libxl_types.idl
> >> @@ -738,6 +738,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
> >>                                  ("arm_sci", libxl_arm_sci),
> >>                                 ])),
> >>       ("arch_x86", Struct(None, [("msr_relaxed", libxl_defbool),
> >> +                               ("force_x2apic", libxl_defbool)
> > 
> > This addition needs a new define in libxl.h to signal it's presence,
> > see LIBXL_HAVE_* defines in there.
> > 
> 
> Something like LIBXL_HAVE_FORCE_X2APIC ?

Yes, something like that.  Not sure we want to add X86 somewhere in
there, but X2APIC is already x86-specific so unlikely to have any
meaning for other arches.

> >>                                 ])),
> >>       # Alternate p2m is not bound to any architecture or guest type, as 
> >> it is
> >>       # supported by x86 HVM and ARM support is planned.
> >> diff --git a/tools/libs/light/libxl_x86.c b/tools/libs/light/libxl_x86.c
> >> index 60d4e8661c..2e0205d2a2 100644
> >> --- a/tools/libs/light/libxl_x86.c
> >> +++ b/tools/libs/light/libxl_x86.c
> >> @@ -26,6 +26,9 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
> >>       if (libxl_defbool_val(d_config->b_info.arch_x86.msr_relaxed))
> >>           config->arch.misc_flags |= XEN_X86_MSR_RELAXED;
> >>   
> >> +    if (libxl_defbool_val(d_config->b_info.arch_x86.force_x2apic))
> >> +        config->arch.misc_flags |= XEN_X86_FORCE_X2APIC;
> >> +
> >>       if (libxl_defbool_val(d_config->b_info.trap_unmapped_accesses)) {
> >>               LOG(ERROR, "trap_unmapped_accesses is not supported on 
> >> x86\n");
> >>               return ERROR_FAIL;
> >> @@ -818,6 +821,7 @@ int libxl__arch_domain_build_info_setdefault(libxl__gc 
> >> *gc,
> >>   {
> >>       libxl_defbool_setdefault(&b_info->acpi, true);
> >>       libxl_defbool_setdefault(&b_info->arch_x86.msr_relaxed, false);
> >> +    libxl_defbool_setdefault(&b_info->arch_x86.force_x2apic, false);
> >>       libxl_defbool_setdefault(&b_info->trap_unmapped_accesses, false);
> >>   
> >>       if (b_info->type == LIBXL_DOMAIN_TYPE_HVM) {
> >> diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
> >> index af86d3186d..d84ab7c823 100644
> >> --- a/tools/xl/xl_parse.c
> >> +++ b/tools/xl/xl_parse.c
> >> @@ -3041,6 +3041,7 @@ skip_usbdev:
> >>                       "If it fixes an issue you are having please report 
> >> to "
> >>                       "[email protected].\n");
> >>   
> >> +    xlu_cfg_get_defbool(config, "force_x2apic", 
> >> &b_info->arch_x86.force_x2apic, 0);
> >>       xlu_cfg_get_defbool(config, "vpmu", &b_info->vpmu, 0);
> >>   
> >>       xlu_cfg_destroy(config);
> >> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> >> index 19fd86ce88..02f650a614 100644
> >> --- a/xen/arch/x86/domain.c
> >> +++ b/xen/arch/x86/domain.c
> >> @@ -704,7 +704,7 @@ int arch_sanitise_domain_config(struct 
> >> xen_domctl_createdomain *config)
> >>           return -EINVAL;
> >>       }
> >>   
> >> -    if ( config->arch.misc_flags & ~XEN_X86_MSR_RELAXED )
> >> +    if ( config->arch.misc_flags & ~(XEN_X86_MSR_RELAXED | 
> >> XEN_X86_FORCE_X2APIC) )
> >>       {
> >>           dprintk(XENLOG_INFO, "Invalid arch misc flags %#x\n",
> >>                   config->arch.misc_flags);
> >> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> >> index 0c60faa39d..73cbac0f22 100644
> >> --- a/xen/arch/x86/hvm/hvm.c
> >> +++ b/xen/arch/x86/hvm/hvm.c
> >> @@ -616,6 +616,8 @@ int hvm_domain_initialise(struct domain *d,
> >>       INIT_LIST_HEAD(&d->arch.hvm.mmcfg_regions);
> >>       INIT_LIST_HEAD(&d->arch.hvm.msix_tables);
> >>   
> >> +    d->arch.hvm.force_x2apic = config->arch.misc_flags & 
> >> XEN_X86_FORCE_X2APIC;
> >> +
> >>       rc = create_perdomain_mapping(d, PERDOMAIN_VIRT_START, 0, NULL, 
> >> NULL);
> >>       if ( rc )
> >>           goto fail;
> >> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
> >> index 993e972cd7..ae8df70d2e 100644
> >> --- a/xen/arch/x86/hvm/vlapic.c
> >> +++ b/xen/arch/x86/hvm/vlapic.c
> >> @@ -1116,6 +1116,20 @@ int guest_wrmsr_apic_base(struct vcpu *v, uint64_t 
> >> val)
> >>       if ( !has_vlapic(v->domain) )
> >>           return X86EMUL_EXCEPTION;
> >>   
> >> +    if ( has_force_x2apic(v->domain) )
> >> +    {
> >> +        /*
> >> +        * We implement the same semantics as 
> >> MSR_IA32_XAPIC_DISABLE_STATUS:
> >> +        * LEGACY_XAPIC_DISABLED which rejects any attempt at clearing
> >> +        * IA32_APIC_BASE.EXTD, thus forcing the LAPIC in x2APIC mode.
> >> +        */
> >> +        if ( !(val & APIC_BASE_EXTD) )
> >> +        {
> >> +            gprintk(XENLOG_WARNING, "tried to disable x2APIC while forced 
> >> on\n");
> >> +            return X86EMUL_EXCEPTION;
> >> +        }
> >> +    }
> >> +
> >>       /* Attempting to set reserved bits? */
> >>       if ( val & ~(APIC_BASE_ADDR_MASK | APIC_BASE_ENABLE | APIC_BASE_BSP |
> >>                    (cp->basic.x2apic ? APIC_BASE_EXTD : 0)) )
> >> @@ -1474,7 +1488,14 @@ void vlapic_reset(struct vlapic *vlapic)
> >>       if ( v->vcpu_id == 0 )
> >>           vlapic->hw.apic_base_msr |= APIC_BASE_BSP;
> >>   
> >> -    vlapic_set_reg(vlapic, APIC_ID, (v->vcpu_id * 2) << 24);
> >> +    if ( has_force_x2apic(v->domain) )
> >> +    {
> >> +        vlapic->hw.apic_base_msr |= APIC_BASE_EXTD;
> >> +        set_x2apic_id(vlapic);
> >> +    }
> >> +    else
> >> +        vlapic_set_reg(vlapic, APIC_ID, (v->vcpu_id * 2) << 24);
> >> +
> >>       vlapic_do_init(vlapic);
> >>   }
> >>   
> >> diff --git a/xen/arch/x86/include/asm/domain.h 
> >> b/xen/arch/x86/include/asm/domain.h
> >> index 5df8c78253..771992d156 100644
> >> --- a/xen/arch/x86/include/asm/domain.h
> >> +++ b/xen/arch/x86/include/asm/domain.h
> >> @@ -509,6 +509,8 @@ struct arch_domain
> >>   #define has_pirq(d)        (!!((d)->arch.emulation_flags & 
> >> X86_EMU_USE_PIRQ))
> >>   #define has_vpci(d)        (!!((d)->arch.emulation_flags & X86_EMU_VPCI))
> >>   
> >> +#define has_force_x2apic(d) ((d)->arch.hvm.force_x2apic)
> >> +
> >>   #define gdt_ldt_pt_idx(v) \
> >>         ((v)->vcpu_id >> (PAGETABLE_ORDER - GDT_LDT_VCPU_SHIFT))
> >>   #define pv_gdt_ptes(v) \
> >> diff --git a/xen/arch/x86/include/asm/hvm/domain.h 
> >> b/xen/arch/x86/include/asm/hvm/domain.h
> >> index 333501d5f2..b56fa08b73 100644
> >> --- a/xen/arch/x86/include/asm/hvm/domain.h
> >> +++ b/xen/arch/x86/include/asm/hvm/domain.h
> >> @@ -108,6 +108,9 @@ struct hvm_domain {
> >>       /* Compatibility setting for a bug in x2APIC LDR */
> >>       bool bug_x2apic_ldr_vcpu_id;
> >>   
> >> +    /* LAPIC is forced in x2APIC mode */
> >> +    bool force_x2apic;
> > 
> > This should be a field in the vlapic struct, but seeing this I wonder
> > whether we want to virtualize MSR_IA32_XAPIC_DISABLE_STATUS MSR and
> > set the bit there.  This would also help with migrating the option, as
> > you could then migrate the "locked" status easily by just migrating
> > the contents of the MSR_IA32_XAPIC_DISABLE_STATUS MSR.
> > 
> 
> One issue with MSR_IA32_XAPIC_DISABLE_STATUS is that it is only 
> meaningful on Intel platforms (unless we also virtualize it on AMD ?), 
> and I haven't found a AMD-specific mecanism for exposing it.
> Most operating systems don't try to disable x2apic (unless told to do 
> it) if it is initially enabled ("enabled by firmware").

Yeah, I see the availability of MSR_IA32_XAPIC_DISABLE_STATUS is
exposed in MSR_ARCH_CAPABILITIES, which is only present on Intel
platforms.

I also haven't been able to find a way to expose the APIC is locked to
x2apic mode in any AMD manuals, which is a shame.

For Intel we should expose this when possible in
MSR_IA32_XAPIC_DISABLE_STATUS.  And we need to migrate the selection
in the stream as part of the APIC data.

Thanks, Roger.

Reply via email to