Hi, Thomas, On Wed, Sep 9, 2020 at 3:20 PM Thomas Huth <th...@redhat.com> wrote: > > On 09/09/2020 04.57, chen huacai wrote: > > Hi, all, > > > > On Wed, Sep 9, 2020 at 1:25 AM Thomas Huth <th...@redhat.com> wrote: > >> > >> On 24/08/2020 10.11, Huacai Chen wrote: > >>> MIPS has two types of KVM: TE & VZ, and TE is the default type. Now, > >>> libvirt uses a null-machine to detect the kvm capability. In the MIPS > >>> case, it will return "KVM not supported" on a VZ platform by default. > >>> So, add the kvm_type() hook to the null-machine. > >>> > >>> This seems not a very good solution, but I cannot do it better now. > >> > >> This is still ugly. Why do the other architectures do not have the > >> same problem? Let's see... in kvm-all.c, we have: > >> > >> int type = 0; > >> [...] > >> kvm_type = qemu_opt_get(qemu_get_machine_opts(), "kvm-type"); > >> if (mc->kvm_type) { > >> type = mc->kvm_type(ms, kvm_type); > >> } else if (kvm_type) { > >> ret = -EINVAL; > >> fprintf(stderr, "Invalid argument kvm-type=%s\n", kvm_type); > >> goto err; > >> } > >> > >> do { > >> ret = kvm_ioctl(s, KVM_CREATE_VM, type); > >> } while (ret == -EINTR); > >> > >> Thus the KVM_CREATE_VM ioctl is likely called with type = 0 in this > >> case (i.e. when libvirt probes with the "null"-machine). > >> > >> Now let's have a look at the kernel. The "type" parameter is passed > >> there to the architecture specific function kvm_arch_init_vm(). > >> For powerpc, this looks like: > >> > >> if (type == 0) { > >> if (kvmppc_hv_ops) > >> kvm_ops = kvmppc_hv_ops; > >> else > >> kvm_ops = kvmppc_pr_ops; > >> if (!kvm_ops) > >> goto err_out; > >> } else if (type == KVM_VM_PPC_HV) { > >> if (!kvmppc_hv_ops) > >> goto err_out; > >> kvm_ops = kvmppc_hv_ops; > >> } else if (type == KVM_VM_PPC_PR) { > >> if (!kvmppc_pr_ops) > >> goto err_out; > >> kvm_ops = kvmppc_pr_ops; > >> } else > >> goto err_out; > >> > >> That means for type == 0, it automatically detects the best > >> kvm-type. > >> > >> For mips, this function looks like this: > >> > >> switch (type) { > >> #ifdef CONFIG_KVM_MIPS_VZ > >> case KVM_VM_MIPS_VZ: > >> #else > >> case KVM_VM_MIPS_TE: > >> #endif > >> break; > >> default: > >> /* Unsupported KVM type */ > >> return -EINVAL; > >> }; > >> > >> That means, for type == 0, it returns -EINVAL here! > >> > >> Looking at the API docu in Documentation/virt/kvm/api.rst > >> the description of the type parameter is quite sparse, but it > >> says: > >> > >> "You probably want to use 0 as machine type." > >> > >> So I think this is a bug in the implementation of KVM in the > >> mips kernel code. The kvm_arch_init_vm() in the mips code should > >> do the same as on powerpc, and use the best available KVM type > >> there instead of returning EINVAL. Once that is fixed there, > >> you don't need this patch here for QEMU anymore. > > Yes, PPC use a good method, because it can use 0 as "automatic" > > #define KVM_VM_PPC_HV 1 > > #define KVM_VM_PPC_PR 2 > > > > Unfortunately, MIPS cannot do like this because it define 0 as "TE": > > #define KVM_VM_MIPS_TE 0 > > #define KVM_VM_MIPS_VZ 1 > > > > So, it cannot be solved in kernel side, unless changing the definition > > of TE/VZ, but I think changing their definition is also unacceptable. > > Ouch, ok, now I understood the problem. That sounds like a really bad > decision on the kernel side. > > But I think you could at least try to get it fixed on the kernel side: > Propose a patch to rename KVM_VM_MIPS_TE to KVM_VM_MIPS_DEFAULT and use > 2 as new value for TE. The code that handles the default 0 case should > then prefer TE over VZ, so that old userspace applications that provide > 0 will continue to work as they did before, so I hope that the change is > acceptable by the kernel folks. You should add a remark to the patch > description that 0 is the established default value for probing in the > QEMU/libvirt stack and that your patch is required on the kernel side to > avoid ugly hacks in QEMU userspace code. > > If they still reject your patch, I guess we have to bite the bullet and > add your patch here... OK, let me try.
Huacai > > Thanks, > Thomas >