On Fri, May 31, 2024 at 10:13:47AM -0700, Chen, Zide wrote:
> Date: Fri, 31 May 2024 10:13:47 -0700
> From: "Chen, Zide" <[email protected]>
> Subject: Re: [PATCH V2 2/3] target/i386: call cpu_exec_realizefn before
> x86_cpu_filter_features
>
> On 5/30/2024 11:30 PM, Zhao Liu wrote:
> > Hi Zide,
> >
> > On Fri, May 24, 2024 at 01:00:16PM -0700, Zide Chen wrote:
> >> Date: Fri, 24 May 2024 13:00:16 -0700
> >> From: Zide Chen <[email protected]>
> >> Subject: [PATCH V2 2/3] target/i386: call cpu_exec_realizefn before
> >> x86_cpu_filter_features
> >> X-Mailer: git-send-email 2.34.1
> >>
> >> cpu_exec_realizefn which calls the accel-specific realizefn may expand
> >> features. e.g., some accel-specific options may require extra features
> >> to be enabled, and it's appropriate to expand these features in accel-
> >> specific realizefn.
> >>
> >> One such example is the cpu-pm option, which may add CPUID_EXT_MONITOR.
> >>
> >> Thus, call cpu_exec_realizefn before x86_cpu_filter_features to ensure
> >> that it won't expose features not supported by the host.
> >>
> >> Fixes: 662175b91ff2 ("i386: reorder call to cpu_exec_realizefn")
> >> Suggested-by: Xiaoyao Li <[email protected]>
> >> Signed-off-by: Zide Chen <[email protected]>
> >> ---
> >> target/i386/cpu.c | 24 ++++++++++++------------
> >> target/i386/kvm/kvm-cpu.c | 1 -
> >> 2 files changed, 12 insertions(+), 13 deletions(-)
> >>
> >> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> >> index bc2dceb647fa..a1c1c785bd2f 100644
> >> --- a/target/i386/cpu.c
> >> +++ b/target/i386/cpu.c
> >> @@ -7604,6 +7604,18 @@ static void x86_cpu_realizefn(DeviceState *dev,
> >> Error **errp)
> >> }
> >> }
> >>
> >> + /*
> >> + * note: the call to the framework needs to happen after feature
> >> expansion,
> >> + * but before the checks/modifications to ucode_rev, mwait, phys_bits.
> >> + * These may be set by the accel-specific code,
> >> + * and the results are subsequently checked / assumed in this
> >> function.
> >> + */
> >> + cpu_exec_realizefn(cs, &local_err);
> >> + if (local_err != NULL) {
> >> + error_propagate(errp, local_err);
> >> + return;
> >> + }
> >> +
> >> x86_cpu_filter_features(cpu, cpu->check_cpuid || cpu->enforce_cpuid);
> >
> > For your case, which sets cpu-pm=on via overcommit, then
> > x86_cpu_filter_features() will complain that mwait is not supported.
> >
> > Such warning is not necessary, because the purpose of overcommit (from
> > code) is only to support mwait when possible, not to commit to support
> > mwait in Guest.
> >
> > Additionally, I understand x86_cpu_filter_features() is primarily
> > intended to filter features configured by the user,
>
> Yes, that's why this patches intends to let x86_cpu_filter_features()
> filter out the MWAIT bit which is set from the overcommit option.

Hmm, but in fact x86_cpu_filter_features() has already checked the MWAIT
bit set by "-overcommit cpu-pm=on". ;-)

(Please correct me if I'm wrong) Revisiting what cpu-pm did to MWAIT:
* First, it sets the MWAIT bit in x86_cpu_expand_features():

  x86_cpu_expand_features()
    -> x86_cpu_get_supported_feature_word()
      -> kvm_arch_get_supported_cpuid()

  This MWAIT bit is based on the Host's MWAIT capability. Enabling MWAIT
  this way is fine for the subsequent x86_cpu_filter_features(), and
  x86_cpu_filter_features() is working correctly here!

* Then, MWAIT is set a second time in host_cpu_enable_cpu_pm(), regardless
  of either the Host's support or the result of the previous MWAIT
  enablement. This is the root cause of your issue.

Therefore, we should make cpu-pm honor its first MWAIT enablement result
instead of repeatedly and unconditionally setting the MWAIT bit again in
host_cpu_enable_cpu_pm().

Additionally, I think this code in x86_cpu_realizefn():

    cpu->mwait.ecx |= CPUID_MWAIT_EMX | CPUID_MWAIT_IBE;

has a similar issue, because it should also check the MWAIT feature bit.

Further, it may be possible to remove cpu->mwait: just check the MWAIT
bit in leaf 5 of cpu_x86_cpuid(), and if MWAIT is present, use the host's
mwait info plus CPUID_MWAIT_EMX | CPUID_MWAIT_IBE.

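
Roughly, the leaf 5 idea could look like the standalone sketch below.
This is only an illustration, not a patch: struct toy_leaf5 and
toy_cpuid_leaf5() are hypothetical names, and returning all zeros when
MWAIT is absent is my assumption. The bit values are CPUID.01H:ECX bit 3
for MONITOR and CPUID.05H:ECX bits 0 and 1 for EMX and IBE:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CPUID_EXT_MONITOR (1U << 3)  /* CPUID.01H:ECX bit 3 */
#define CPUID_MWAIT_EMX   (1U << 0)  /* CPUID.05H:ECX bit 0 */
#define CPUID_MWAIT_IBE   (1U << 1)  /* CPUID.05H:ECX bit 1 */

/* Hypothetical container for the four leaf 5 output registers. */
struct toy_leaf5 {
    uint32_t eax, ebx, ecx, edx;
};

/*
 * Compute leaf 5 on demand from the guest's MONITOR feature bit and the
 * Host's leaf 5 values, instead of caching them in a cpu->mwait field.
 */
void toy_cpuid_leaf5(uint32_t guest_1_ecx,
                     const struct toy_leaf5 *host,
                     struct toy_leaf5 *out)
{
    assert(host != NULL && out != NULL);
    memset(out, 0, sizeof(*out));

    if (!(guest_1_ecx & CPUID_EXT_MONITOR)) {
        return;  /* Assumption: report an empty leaf when MWAIT is absent. */
    }

    *out = *host;                                   /* Host's mwait info */
    out->ecx |= CPUID_MWAIT_EMX | CPUID_MWAIT_IBE;  /* plus EMX and IBE  */
}
```

This keeps the MWAIT feature bit as the single source of truth, so the
EMX/IBE flags can never be exposed without MWAIT itself.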
> > and the changes of
> > CPUID after x86_cpu_filter_features() should by default be regarded like
> > "QEMU knows what it is doing".
>
> Sure, we can add feature bits after x86_cpu_filter_features(), but I
> think moving cpu_exec_realizefn() before x86_cpu_filter_features() is
> more generic, and actually this is what QEMU did before commit 662175b91ff2.
>
> - Less redundant code. Specifically, no need to call
> x86_cpu_get_supported_feature_word() again.
> - Potentially there could be other features could be added from the
> accel-specific realizefn, kvm_cpu_realizefn() for example. And these
> features need to be checked against the host availability.

Mainly, I don't think this reordering is a direct fix for the problem (as
I analyzed above). Also, in your case x86_cpu_filter_features() will
print a WARNING when QEMU boots, which I don't think is cpu-pm's intention.