On Tue, 1 Jul 2025 16:01:21 -0400
Konrad Rzeszutek Wilk <konrad.w...@oracle.com> wrote:

> On Tue, Jul 01, 2025 at 03:05:00PM +0200, Igor Mammedov wrote:
> > On Tue, 1 Jul 2025 20:36:43 +0800
> > Zhao Liu <zhao1....@intel.com> wrote:
> >   
> > > On Tue, Jul 01, 2025 at 07:12:44PM +0800, Xiaoyao Li wrote:  
> > > > Date: Tue, 1 Jul 2025 19:12:44 +0800
> > > > From: Xiaoyao Li <xiaoyao...@intel.com>
> > > > > Subject: Re: [PATCH] i386/cpu: ARCH_CAPABILITIES should not be advertised
> > > > >  on AMD
> > > > 
> > > > On 7/1/2025 6:26 PM, Zhao Liu wrote:    
> > > > > > unless it was explicitly requested by the user.    
> > > > > > But this could still break Windows, just like issue #3001, which enables
> > > > > arch-capabilities for EPYC-Genoa. This fact shows that even explicitly
> > > > > turning on arch-capabilities in AMD Guest and utilizing KVM's emulated
> > > > > value would even break something.
> > > > > 
> > > > > So even for named CPUs, arch-capabilities=on doesn't reflect the fact
> > > > > that it is purely emulated, and is (maybe?) harmful.    
> > > > 
> > > > > It is because Windows adds wrong code. So it breaks itself and it's just the
> > > > > regression of Windows.
> > > 
> > > Could you please tell me what the Windows's wrong code is? And what's
> > > wrong when someone is following the hardware spec?  
> > 
> > the reason is that it's reserved on AMD hence software shouldn't even try
> > to use it or make any decisions based on that.
> > 
> > 
> > PS:
> > on contrary, doing such ad-hoc 'cleanups' for the sake of misbehaving
> > guest would actually complicate QEMU for no big reason.  
> 
> The guest is not misbehaving. It is following the spec.

that's not how I read the spec:

"
AMD64 Architecture Programmer’s Manual Volume 3: General-Purpose and System Instructions
24594—Rev. 3.36—March 2024
...
Appendix E Obtaining Processor Information Via the CPUID Instruction
...
All bit positions that are not defined as fields are reserved.
The value of bits within reserved ranges cannot be relied upon to be zero.
Software must mask off all reserved bits in the return value prior to making
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
any value comparisons of represented information.
...
E.3.6 Function 7h—Structured Extended Feature Identifiers
...
The value returned in EDX is undefined and is reserved.
"

what actually happens is the guest side being lazy and blindly following CPUID.
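
To make the masking rule concrete, here is a minimal sketch of what a guest
could do before trusting CPUID.(EAX=7,ECX=0):EDX. The enum, the mask helper
and the function names are made up for illustration (they are not taken from
any guest or from QEMU/KVM); only the bit position (EDX bit 29 enumerating
IA32_ARCH_CAPABILITIES on Intel) and the "EDX is reserved on AMD" rule quoted
above are given facts.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical vendor tag; a real guest would derive it from CPUID leaf 0. */
enum cpu_vendor { VENDOR_INTEL, VENDOR_AMD };

/* CPUID.(EAX=7,ECX=0):EDX bit 29 enumerates IA32_ARCH_CAPABILITIES on Intel. */
#define CPUID_7_0_EDX_ARCH_CAPABILITIES (1u << 29)

/*
 * Bits of CPUID.7.0:EDX that this sketch treats as defined, per vendor.
 * AMD: the whole register is reserved (APM vol. 3, E.3.6), so the mask is 0.
 * Intel: only the single bit this sketch cares about is listed.
 */
static uint32_t cpuid_7_0_edx_defined_mask(enum cpu_vendor vendor)
{
    switch (vendor) {
    case VENDOR_INTEL:
        return CPUID_7_0_EDX_ARCH_CAPABILITIES;
    case VENDOR_AMD:
        return 0;
    }
    return 0;
}

/* Mask off reserved bits first, then test the feature bit. */
static bool guest_has_arch_capabilities(enum cpu_vendor vendor,
                                        uint32_t raw_edx)
{
    uint32_t edx = raw_edx & cpuid_7_0_edx_defined_mask(vendor);

    return (edx & CPUID_7_0_EDX_ARCH_CAPABILITIES) != 0;
}
```

With this, an AMD guest ignores bit 29 no matter what the hypervisor puts in
EDX, while an Intel guest still sees it.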


> > Also
> > KVM does do have plenty of such code, and it's not actively preventing guests from using it.
> > Given that KVM is not welcoming such change, I think QEMU shouldn't do that either.
> 
> Because KVM maintainer does not want to touch the guest ABI. He agrees
> this is a bug.
one can say both guest and hypervisor are to blame:
  the 1st is not masking reserved bits,
  the 2nd provides a 'hybrid' cpu that doesn't exist in the real world,
  but then the 'host' cpu model has never been an exact match for the physical cpu.

what I dislike is ad-hoc fixups in generic code.
if the consensus were to implement an _out of spec_ fixup for an issue already fixed in Windows,
it would be better done in the host cpumodel code.

Or even better, a single KVM opt-in feature
'do_not_advertise_features_not_supported_by_host_cpu',
and then QEMU could use that for disabling all nonsense in one go.
Plus, none of that would break KVM ABI, nor would QEMU have to add fixups for this and that feature.
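
A rough sketch of what such an opt-in could boil down to per CPUID feature
word. Everything here is hypothetical: the function name, the flag and the
idea of a separate "host native" word are only for illustration and are not
an existing KVM or QEMU interface.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical filter for one 32-bit CPUID feature word.
 *
 * kvm_supported: bits KVM would advertise today, including purely
 *                emulated ones.
 * host_native:   bits the physical host CPU itself reports.
 * host_only:     the opt-in; when set, emulated-only bits are dropped.
 */
static uint32_t filter_feature_word(uint32_t kvm_supported,
                                    uint32_t host_native,
                                    bool host_only)
{
    return host_only ? (kvm_supported & host_native) : kvm_supported;
}
```

Without the opt-in the word is passed through unchanged, so existing guests
would keep seeing the same ABI.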

After some time, when old machine types are deprecated/gone, KVM could make it the default and eventually
remove advertising of 'fake' features.

PS:
On the QEMU side we are usually tolerant of such fixups if the issue is not fixable on the guest side,
but that's not the case here.

