On Fri, Aug 30, 2024 at 10:02:40AM -0300, Fabiano Rosas wrote:
> >>> @@ -397,20 +404,16 @@ bool multifd_send(MultiFDSendData **send_data)
> >>>
> >>> p = &multifd_send_state->params[i];
> >>> /*
> >>> - * Lockless read to p->pending_job is safe, because only multifd
> >>> - * sender thread can clear it.
> >>> + * Lockless RMW on p->pending_job_preparing is safe, because
> >>> only multifd
> >>> + * sender thread can clear it after it had seen p->pending_job
> >>> being set.
> >>> + *
> >>> + * Pairs with qatomic_store_release() in multifd_send_thread().
> >>> */
> >>> - if (qatomic_read(&p->pending_job) == false) {
> >>> + if (qatomic_cmpxchg(&p->pending_job_preparing, false, true) ==
> >>> false) {
> >>
> >> What's the motivation for this change? It would be better to have it in
> >> a separate patch with a proper justification.
> >
> > The original RFC patch set used dedicated device state multifd channels.
> >
> > Peter and other people wanted this functionality removed, however this
> > caused
> > a performance (downtime) regression.
> >
> > One of the things that seemed to help mitigate this regression was making
> > the multifd channel selection more fair via this change.
> >
> > But I can split out it to a separate commit in the next patch set version
> > and
> > then see what performance improvement it currently brings.
>
> Yes, better to have it separate if anything for documentation of the
> rationale.
And when drafting that patch, please add a comment explaining the field.
Currently it's missing:
/*
* The sender thread has work to do if either of below boolean is set.
*
* @pending_job: a job is pending
* @pending_sync: a sync request is pending
*
* For both of these fields, they're only set by the requesters, and
* cleared by the multifd sender threads.
*/
bool pending_job;
bool pending_job_preparing;
bool pending_sync;
--
Peter Xu