On Thu, Nov 29, 2018 at 3:27 AM Paolo Abeni <pab...@redhat.com> wrote: > > Hi, > > Thank you for the update! > > On Wed, 2018-11-28 at 18:50 -0500, Willem de Bruijn wrote: > > I did revert to the basic implementation using an extra ref > > for the function call, similar to TCP, as you suggested. > > > > On top of that as a separate optimization patch I have a > > variant that uses refcnt zero by replacing refcount_inc with > > refcount_set(.., refcount_read(..) + 1). Not very pretty. > > If the skb/uarg is not shared (no other threads can touch the refcnt) > before ip*_append_data() completes, how about something like the > following (incremental diff on top of patch 1/2, untested, uncompiled, > just to give the idea): > > The basic idea is using the same schema currently used for wmem > accounting: do the book-keeping inside the loop and set the atomic > reference counter only once at the end of the loop. > > WDYT?
Thanks for the suggestion. I think that a variant of this might be the best option indeed. The common case that we care about (ip_make_skb without fragmentation) only builds one skb, so we won't necessarily amortize many atomic ops. I also really would like to avoid the atomic op and branch in the non-error return path. But with uarg_refs we can indeed elide the refcount_inc on the first skb_zcopy_get and detect at skb_zerocopy_put_abort if the extra reference went unused and it has to put the ref itself. Let me code that up. Thanks!