On 9/30/2025 2:00 AM, Duan, Zhenzhong wrote:
-----Original Message-----
From: Steven Sistare <[email protected]>
Subject: Re: [PATCH v2 6/6] accel/kvm: Fix SIGSEGV when execute
"query-balloon" after CPR transfer

On 9/28/2025 4:54 AM, Zhenzhong Duan wrote:
After CPR transfer, source QEMU closes kvm fd and sets kvm_state to
NULL,
"query-balloon" will check kvm_state->sync_mmu and trigger NULL pointer
reference.

We don't need to NULL kvm_state as all states in kvm_state aren't released
actually. Just closing kvm fd is enough so we could still query states
through "query_*" qmp command.

IMO this does not make sense.  Much of the state in kvm_state was derived
from ioctl's on the descriptors, and closing them invalidates it.  Asking
historical questions about what used to be makes no sense.

You also have your valid point.


Clearing kvm_state and setting kvm_allowed=false would be a safer fix.

Setting kvm_allowed=false causes kvm_enabled() to return false which should
prevent kvm_state from being dereferenced anywhere:

    #define kvm_enabled()           (kvm_allowed)

  Eg for the balloon:

    static bool have_balloon(Error **errp)
    {
        if (kvm_enabled() && !kvm_has_sync_mmu()) {

- Steve

But clearing kvm_state will make some qmp commands which dereferencing
kvm_state trigger SIGSEGV. We can expect failure on those qmp, but SIGSEGV
is worse.

E.g., {"execute": "query-cpu-definitions"}, below error print with v2 but 
SIGSEGV with v1:

KVM get MSR (index=0x10a) feature failed, Bad file descriptor


I also see inconsistence on "query-balloon" result with v1 patch, before 
cpr-transfer:

{"error": {"class": "DeviceNotActive", "desc": "No balloon device has been 
activated"}}

after transfer:

{"error": {"class": "KVMMissingCap", "desc": "Using KVM without synchronous MMU, 
balloon unavailable"}}

It's confusing there is no synchronous MMU support but we do have it.

Thanks
Zhenzhong


Reply via email to