>-----Original Message-----
>From: Steven Sistare <[email protected]>
>Subject: Re: [PATCH v2 6/6] accel/kvm: Fix SIGSEGV when execute
>"query-balloon" after CPR transfer
>
>On 10/8/2025 6:22 AM, Duan, Zhenzhong wrote:
>>
>>
>>> -----Original Message-----
>>> From: Steven Sistare <[email protected]>
>>> Subject: Re: [PATCH v2 6/6] accel/kvm: Fix SIGSEGV when execute
>>> "query-balloon" after CPR transfer
>>>
>>> On 9/30/2025 2:00 AM, Duan, Zhenzhong wrote:
>>>>> -----Original Message-----
>>>>> From: Steven Sistare <[email protected]>
>>>>> Subject: Re: [PATCH v2 6/6] accel/kvm: Fix SIGSEGV when execute
>>>>> "query-balloon" after CPR transfer
>>>>>
>>>>> On 9/28/2025 4:54 AM, Zhenzhong Duan wrote:
>>>>>> After CPR transfer, source QEMU closes kvm fd and sets kvm_state to
>>>>> NULL,
>>>>>> "query-balloon" will check kvm_state->sync_mmu and trigger NULL
>>> pointer
>>>>>> reference.
>>>>>>
>>>>>> We don't need to NULL kvm_state as all states in kvm_state aren't
>>> released
>>>>>> actually. Just closing kvm fd is enough so we could still query states
>>>>>> through "query_*" qmp command.
>>>>>
>>>>> IMO this does not make sense.  Much of the state in kvm_state was
>>> derived
>>>> >from ioctl's on the descriptors, and closing them invalidates it.  Asking
>>>>> historical questions about what used to be makes no sense.
>>>>
>>>> You also have your valid point.
>>>>
>>>>>
>>>>> Clearing kvm_state and setting kvm_allowed=false would be a safer fix.
>>>
>>> Setting kvm_allowed=false causes kvm_enabled() to return false which
>should
>>> prevent kvm_state from being dereferenced anywhere:
>>>
>>>      #define kvm_enabled()           (kvm_allowed)
>>>
>>>    Eg for the balloon:
>>>
>>>      static bool have_balloon(Error **errp)
>>>      {
>>>          if (kvm_enabled() && !kvm_has_sync_mmu()) {
>>
>> OK, will do, clearing kvm_allowed is a good idea.
>>
>> Currently there are two qmp commands "query-balloon" and
>"query-cpu-definitions"
>> causing SIGSEGV after CPR-transfer, clearing kvm_allowed helps for both.
>> But clearing both kvm_allowed and kvm_state cause SIGSEGV on
>"query-cpu-definitions".
>>
>> I'll send a patch to clearing only kvm_allowed if you are fine with it.
>
>I still don't love the idea.  kvm_state is no longer valid.

OK, what about another idea to not call vfio_cpr_kvm_close_notifier() for 
cpr-transfer, like below:

--- a/hw/vfio/cpr.c
+++ b/hw/vfio/cpr.c
@@ -197,7 +197,7 @@ void vfio_cpr_add_kvm_notifier(void)
     if (!kvm_close_notifier.notify) {
         migration_add_notifier_modes(&kvm_close_notifier,
                                      vfio_cpr_kvm_close_notifier,
-                                     MIG_MODE_CPR_TRANSFER, MIG_MODE_CPR_EXEC,
+                                     MIG_MODE_CPR_EXEC,
                                      -1);
     }
 }

The close_kvm_after_cpr is only for cpr-exec, with this change, we are in same 
situation as live migration and can run all qmp commands same as after live 
migration.

>
>It sounds like "query-cpu-definitions" is missing a check for kvm_enabled().

Yes.

Thanks
Zhenzhong

Reply via email to