Re: [PATCH 1/3] vhost_task: KVM: Don't wake KVM x86's recovery thread if vhost task was killed

Michael S. Tsirkin Tue, 26 Aug 2025 07:58:55 -0700

On Tue, Aug 26, 2025 at 07:03:33AM -0700, Sean Christopherson wrote:
> On Tue, Aug 26, 2025, Michael S. Tsirkin wrote:
> > On Mon, Aug 25, 2025 at 05:40:09PM -0700, Sean Christopherson wrote:
> > > Provide an API in vhost task instead of forcing KVM to solve the problem,
> > > as KVM would literally just add an equivalent to VHOST_TASK_FLAGS_KILLED,
> > > along with a new lock to protect said flag.  In general, forcing simple
> > > usage of vhost task to care about signals _and_ take non-trivial action to
> > > do the right thing isn't developer friendly, and is likely to lead to
> > > similar bugs in the future.
> > > 
> > > Debugged-by: Sebastian Andrzej Siewior <[email protected]>
> > > Link: https://lore.kernel.org/all/[email protected]
> > > Link: https://lore.kernel.org/all/[email protected]
> > > Suggested-by: Sebastian Andrzej Siewior <[email protected]>
> > > Fixes: d96c77bd4eeb ("KVM: x86: switch hugepage recovery thread to vhost_task")
> > > Cc: [email protected]
> > > Signed-off-by: Sean Christopherson <[email protected]>
> > 
> > OK but I dislike the API.
> 
> FWIW, I don't love it either.
> 
> > Default APIs should be safe. So vhost_task_wake_safe should be vhost_task_wake
> > 
> > This also reduces the changes to kvm.
> > 
> > 
> > It does not look like we need the "unsafe" variant, so pls drop it.
> 
> vhost_vq_work_queue() calls
> 
>   vhost_worker_queue()
>   |
>   -> worker->ops->wakeup(worker)
>      |
>      -> vhost_task_wakeup()
>         |
>         -> vhost_task_wake()
> 
> while holding RCU and so can't sleep.
> 
>       rcu_read_lock();
>       worker = rcu_dereference(vq->worker);
>       if (worker) {
>               queued = true;
>               vhost_worker_queue(worker, work);
>       }
>       rcu_read_unlock();
> 
> And the call from __vhost_worker_flush() is done while holding a vhost_worker.mutex.
> That's probably ok?  But there are many paths that lead to __vhost_worker_flush(),
> which makes it difficult to audit all flows.  So even if there is an easy change
> for the RCU conflict, I wouldn't be comfortable adding a mutex_lock() to so many
> flows in a patch that needs to go to stable@.
> 
> > If we do need it, it should be called __vhost_task_wake.
> 
> I initially had that, but didn't like that vhost_task_wake() wouldn't call
> __vhost_task_wake(), i.e. wouldn't follow the semi-standard pattern of the
> no-underscores function being a wrapper for the double-underscores function.


Eh. That's not really a standard. The standard is that __ is an unsafe
variant.
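
To illustrate the two readings of the convention, here is a minimal
userspace sketch, not kernel code. struct counter, counter_inc() and
__counter_inc() are hypothetical names invented for this example, and
pthread mutexes stand in for kernel locking. In this common case the two
readings coincide: the __ function is the "unsafe" one that relies on
the caller holding the lock, and the plain function is the safe wrapper.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical example type; nothing here exists in vhost_task. */
struct counter {
	pthread_mutex_t lock;
	bool lock_held;		/* debugging aid: true while lock is held */
	int value;
};

/* "Unsafe" variant: the caller must already hold c->lock. */
static int __counter_inc(struct counter *c)
{
	assert(c->lock_held);
	return ++c->value;
}

/* Default variant: takes the lock itself, then calls the __ variant. */
static int counter_inc(struct counter *c)
{
	int ret;

	pthread_mutex_lock(&c->lock);
	c->lock_held = true;
	ret = __counter_inc(c);
	c->lock_held = false;
	pthread_mutex_unlock(&c->lock);
	return ret;
}
```

The naming question in this thread comes up because the proposed
vhost_task_wake() would not be a wrapper around __vhost_task_wake(), so
only the "unsafe variant" reading applies there.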

> I'm definitely not opposed to that though (or any other naming options).  Sans
> comments, this was my other idea for names:
> 
> 
> static void ____vhost_task_wake(struct vhost_task *vtsk)

That's way too many __. Just vhost_task_wake_up_process will do.

> {
>       wake_up_process(vtsk->task);
> }



Pls add docs explaining the usage of __vhost_task_wake
and vhost_task_wake respectively.

> void __vhost_task_wake(struct vhost_task *vtsk)
> {
>       WARN_ON_ONCE(!vtsk->handle_sigkill);
> 
>       if (WARN_ON_ONCE(test_bit(VHOST_TASK_FLAGS_KILLED, &vtsk->flags)))
>               return;

Add comments here please explaining why we warn.

>       ____vhost_task_wake(vtsk);
> }
> EXPORT_SYMBOL_GPL(__vhost_task_wake);



> void vhost_task_wake(struct vhost_task *vtsk)
> {
>       guard(mutex)(&vtsk->exit_mutex);
> 
>       if (WARN_ON_ONCE(test_bit(VHOST_TASK_FLAGS_STOP, &vtsk->flags)))

Add comments here please explaining why we warn.

>               return;
> 
>       if (test_bit(VHOST_TASK_FLAGS_KILLED, &vtsk->flags))
>               return;
> 
>       ____vhost_task_wake(vtsk);
> }
> EXPORT_SYMBOL_GPL(vhost_task_wake);
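
As a rough idea of the kind of usage docs meant here, below is a
userspace model of the two variants. It is a sketch only: every model_*
name is hypothetical, plain bools stand in for the VHOST_TASK_FLAGS_*
bits, a pthread mutex stands in for exit_mutex, and it assumes the exit
path sets the killed flag while holding that mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-in for struct vhost_task; all names are hypothetical. */
struct model_task {
	pthread_mutex_t exit_mutex;	/* serializes wake against exit */
	bool stopped;			/* models VHOST_TASK_FLAGS_STOP */
	bool killed;			/* models VHOST_TASK_FLAGS_KILLED */
	int wakeups;			/* number of wakes actually delivered */
};

/* Models wake_up_process(): must never run once the task is gone. */
static void model_wake_up_process(struct model_task *t)
{
	assert(!t->killed);
	t->wakeups++;
}

/*
 * Lockless variant, for callers that cannot sleep (e.g. under RCU).
 * The caller is responsible for ensuring the task cannot be killed
 * concurrently. Returns true if the task was woken.
 */
static bool __model_task_wake(struct model_task *t)
{
	if (t->killed)
		return false;
	model_wake_up_process(t);
	return true;
}

/*
 * Default variant: may sleep. Takes exit_mutex so the killed check and
 * the wake cannot race with the exit path. Returns true if woken.
 */
static bool model_task_wake(struct model_task *t)
{
	bool woken = false;

	pthread_mutex_lock(&t->exit_mutex);
	if (!t->stopped && !t->killed) {
		model_wake_up_process(t);
		woken = true;
	}
	pthread_mutex_unlock(&t->exit_mutex);
	return woken;
}

/* Models the exit path: mark the task killed under the mutex. */
static void model_task_kill(struct model_task *t)
{
	pthread_mutex_lock(&t->exit_mutex);
	t->killed = true;
	pthread_mutex_unlock(&t->exit_mutex);
}
```

The model returns a bool purely so the behavior is observable in
userspace; the functions quoted above return void and WARN instead.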
