On Tue, Apr 27, 2021 at 04:09:48PM +0800, Like Xu wrote: > The last branch recording (LBR) is a performance monitor unit (PMU) > feature on Intel processors that records a running trace of the most > recent branches taken by the processor in the LBR stack. The QEMU > could configure whether it's enabled or not for each guest via CLI. > > The LBR feature would be enabled on the guest if: > - the KVM is enabled and the PMU is enabled and, > - the msr-based-feature IA32_PERF_CAPABILITIES is supporterd on KVM and, > - the supported returned value for lbr_fmt from this msr is not zero and, > - the requested guest vcpu model does support FEAT_1_ECX.CPUID_EXT_PDCM, > - the configured lbr-fmt value is the same as the host lbr_fmt value > OR use the QEMU option "-cpu host,migratable=no".
I don't understand why "migratable" matters here. "migratable" is just a convenience property to get better defaults when using "-cpu host". I don't know why it would change the lbr-fmt validation rules. > > Signed-off-by: Like Xu <[email protected]> > --- A changelog explaining what you changed since v1 would have been useful here. > target/i386/cpu.c | 34 ++++++++++++++++++++++++++++++++++ > target/i386/cpu.h | 10 ++++++++++ > target/i386/kvm/kvm.c | 10 ++++++++-- > 3 files changed, 52 insertions(+), 2 deletions(-) > > diff --git a/target/i386/cpu.c b/target/i386/cpu.c > index ad99cad0e7..9c8e54aa6f 100644 > --- a/target/i386/cpu.c > +++ b/target/i386/cpu.c > @@ -6623,6 +6623,10 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool > verbose) > } > > for (w = 0; w < FEATURE_WORDS; w++) { > + if (w == FEAT_PERF_CAPABILITIES) { > + continue; > + } > + Why exactly is this necessary? I expected to be completely OK to call mark_unavailable_features() multiple times for the same FeatureWord. If there's a reason why this is necessary, I suggest adding a comment explaining why. > uint64_t host_feat = > x86_cpu_get_supported_feature_word(w, false); > uint64_t requested_features = env->features[w]; > @@ -6630,6 +6634,27 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool > verbose) > mark_unavailable_features(cpu, w, unavailable_features, prefix); > } > > + uint64_t host_perf_cap = > + x86_cpu_get_supported_feature_word(FEAT_PERF_CAPABILITIES, false); > + if (!cpu->lbr_fmt && !cpu->migratable) { > + cpu->lbr_fmt = host_perf_cap & PERF_CAP_LBR_FMT; "migratable=no" is not a request to override explicit user settings. This is why we have the ~env->user_features masking inside x86_cpu_expand_features() when initializing env->features[]. In either case, I don't understand why you need the lines above. "migratable=no" should already trigger the x86_cpu_get_supported_feature_word() loop inside x86_cpu_expand_features(), and it should initialize env->features[FEAT_PERF_CAPABILITIES] with the host value. Isn't that code working for you? > + if (cpu->lbr_fmt) { > + info_report("vPMU: The value of lbr-fmt has been adjusted " > + "to 0x%lx and guest LBR is enabled.", > + host_perf_cap & PERF_CAP_LBR_FMT); >From your other message: (I'm assuming your examples are for a lbr-fmt=5 host) > "-cpu host,migratable=no" --> "Enable guest LBR and show warning" Enabling guest LBR in this case is 100% OK, isn't it? I don't think you need to show a warning. > "-cpu host,migratable=no,lbr-fmt=0" --> "Enable guest LBR and show warning" Why? In this case, we should do what the user asked for whenever possible, and the user is explicitly asking lbr-fmt to be 0. > "-cpu host,migratable=no,lbr-fmt=5" --> "Enable guest LBR" Looks OK. > "-cpu host,migratable=no,lbr-fmt=6" --> "Disable guest LBR and show warning" Makes sense to me[1]. > + } > + } else { > + uint64_t requested_lbr_fmt = cpu->lbr_fmt & PERF_CAP_LBR_FMT; > + if (requested_lbr_fmt && kvm_enabled()) { >From your other message: > "-cpu host,lbr-fmt=0" --> "Disable guest LBR" Makes sense to me. I understand this as a confirmation that it's OK to have a guest/host mismatch if guest LBR=0. > "-cpu host,lbr-fmt=5" --> "Enable guest LBR" Makes sense to me. > "-cpu host,lbr-fmt=6" --> "Disable guest LBR and show warning" Makes sense to me[1]. [1] As long as "show warning" becomes "fatal error" if enforce=1. mark_unavailable_features() should make sure this happens. Or maybe we should make this an error? It would be even better. The example code below makes it an error. > + if (requested_lbr_fmt != (host_perf_cap & PERF_CAP_LBR_FMT)) { > + cpu->lbr_fmt = 0; > + warn_report("vPMU: The supported lbr-fmt value on the host " > + "is 0x%lx and guest LBR is disabled.", > + host_perf_cap & PERF_CAP_LBR_FMT); > + } > + } > + } > + > if ((env->features[FEAT_7_0_EBX] & CPUID_7_0_EBX_INTEL_PT) && > kvm_enabled()) { > KVMState *s = CPU(cpu)->kvm_state; > @@ -6734,6 +6759,14 @@ static void x86_cpu_realizefn(DeviceState *dev, Error > **errp) > } > } > > + if (cpu->lbr_fmt) { > + if (!cpu->enable_pmu) { > + error_setg(errp, "LBR is unsupported since guest PMU is > disabled."); > + return; > + } > + env->features[FEAT_PERF_CAPABILITIES] |= cpu->lbr_fmt; This doesn't seem right, as we should still do what the user asked for if "lbr-fmt=0" is used. You need to differentiate "lbr-fmt=0" from "lbr-fmt not set" somehow. I suggest initializing lbr_fmt to 0xFF by default, so you can override env->features[FEAT_PERF_CAPABILITIES] when lbr_fmt != 0xFF (even if lbr_fmt=0). Something like this: #define LBR_FMT_UNSET 0xff ... DEFINE_PROP_UINT8("lbr-fmt", X86CPU, lbr_fmt, LBR_FMT_UNSET) ... void x86_cpu_realizefn(...) { ... if (cpu->lbr_fmt != LBR_FMT_UNSET) { if ((cpu->lbr_fmt & LBR_FMT_FMT) != cpu->lbr_fmt) { error_setg(errp, "invalid lbr-fmt" ...); return; } env->features[FEAT_PERF_CAPABILITIES] &= ~PERF_CAP_LBR_FMT; env->features[FEAT_PERF_CAPABILITIES] |= cpu->lbr_fmt; } /* If lbr_fmt == LBR_FMT_UNSET, the default value of env->features[] * will be used as is (and it may depend on the "migratable" flag) */ ... /* * We can always validate env->features[FEAT_PERF_CAPABILITIES], * no matter how it was initialized: */ uint64_t requested_lbr_fmt = env->features[FEAT_PERF_CAPABILITIES] & PERF_CAP_LBR_FMT; if (requested_lbr_fmt && kvm_enabled()) { /* Maybe this code will work out of the box for all * accelerators, but checking kvm_enabled() is safer. */ uint64_t host_perf_cap = x86_cpu_get_supported_feature_word(FEAT_PERF_CAPABILITIES, false); uint64_t host_lbr_fmt = host_perf_cap & PERF_CAP_LBR_FMT; if (!cpu->enable_pmu) { error_setg(errp, "LBR is unsupported without pmu=on"); return; } if (requested_lbr_fmt != host_lbr_fmt)) { /* An error is better than a warning */ error_setg(errp, "lbr-fmt mismatch" ...); /* probably a good idea to include requested_lbr_fmt * and host_lbr_fmt in the error message */ return; } } ... } > + } > + > /* mwait extended info: needed for Core compatibility */ > /* We always wake on interrupt even if host does not have the capability > */ > cpu->mwait.ecx |= CPUID_MWAIT_EMX | CPUID_MWAIT_IBE; > @@ -7300,6 +7333,7 @@ static Property x86_cpu_properties[] = { > #endif > DEFINE_PROP_INT32("node-id", X86CPU, node_id, CPU_UNSET_NUMA_NODE_ID), > DEFINE_PROP_BOOL("pmu", X86CPU, enable_pmu, false), > + DEFINE_PROP_UINT8("lbr-fmt", X86CPU, lbr_fmt, 0), > > DEFINE_PROP_UINT32("hv-spinlocks", X86CPU, hyperv_spinlock_attempts, > HYPERV_SPINLOCK_NEVER_NOTIFY), > diff --git a/target/i386/cpu.h b/target/i386/cpu.h > index 570f916878..b12c879fc4 100644 > --- a/target/i386/cpu.h > +++ b/target/i386/cpu.h > @@ -354,6 +354,7 @@ typedef enum X86Seg { > #define ARCH_CAP_TSX_CTRL_MSR (1<<7) > > #define MSR_IA32_PERF_CAPABILITIES 0x345 > +#define PERF_CAP_LBR_FMT 0x3f > > #define MSR_IA32_TSX_CTRL 0x122 > #define MSR_IA32_TSCDEADLINE 0x6e0 > @@ -1726,6 +1727,15 @@ struct X86CPU { > */ > bool enable_pmu; > > + /* > + * Configure LBR_FMT bits on IA32_PERF_CAPABILITIES MSR. > + * This can't be enabled by default yet because it doesn't have > + * ABI stability guarantees, as it is only allowed to pass all > + * LBR_FMT bits returned by kvm_arch_get_supported_msr_feature() > + * (that depends on host CPU and kernel capabilities) to the guest. > + */ > + uint8_t lbr_fmt; > + > /* LMCE support can be enabled/disabled via cpu option 'lmce=on/off'. It > is > * disabled by default to avoid breaking migration between QEMU with > * different LMCE configurations. > diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c > index 7fe9f52710..aa926984ae 100644 > --- a/target/i386/kvm/kvm.c > +++ b/target/i386/kvm/kvm.c > @@ -2732,8 +2732,14 @@ static void kvm_msr_entry_add_perf(X86CPU *cpu, > FeatureWordArray f) > MSR_IA32_PERF_CAPABILITIES); > > if (kvm_perf_cap) { > - kvm_msr_entry_add(cpu, MSR_IA32_PERF_CAPABILITIES, > - kvm_perf_cap & f[FEAT_PERF_CAPABILITIES]); > + kvm_perf_cap = (cpu->migratable) ? > + (kvm_perf_cap & f[FEAT_PERF_CAPABILITIES]) : kvm_perf_cap; I don't understand why you are checking cpu->migratable here. The CPU code should ensure f[FEAT_PERF_CAPABILITIES] is initialized correctly before calling kvm_arch_init_vcpu(). > + > + if (!cpu->lbr_fmt) { > + kvm_perf_cap &= ~PERF_CAP_LBR_FMT; > + } Likewise: this should be done by the CPU initialization code before kvm_arch_init_vcpu() gets called. The existing code looks weird here: what's the purpose of the kvm_arch_get_supported_msr_feature() call in this function? env->features[] is supposed to reflect what the guest actually sees. x86_cpu_realizefn()/x86_cpu_filter_features() is supposed to ensure that before calling kvm_arch_init_vcpu(). If there's a mismatch between env->features and what the guest sees, we have a problem. If you want to be 100% sure, maybe you can add an assert() here. But if the function is receiving invalid input it's too late to fix the value. > + > + kvm_msr_entry_add(cpu, MSR_IA32_PERF_CAPABILITIES, kvm_perf_cap); > } > } > > -- > 2.30.2 > -- Eduardo
