On Mon, Nov 26, 2018 at 11:32 AM Paolo Abeni <pab...@redhat.com> wrote: > > Hi, > > Sorry for the long delay... > > On Mon, 2018-11-26 at 10:29 -0500, Willem de Bruijn wrote: > > @@ -1109,6 +1128,7 @@ static int __ip_append_data(struct sock *sk, > > error_efault: > > err = -EFAULT; > > error: > > + sock_zerocopy_put_abort(uarg); > > cork->length -= length; > > IP_INC_STATS(sock_net(sk), IPSTATS_MIB_OUTDISCARDS); > > refcount_add(wmem_alloc_delta, &sk->sk_wmem_alloc); > > Out of sheer ignorance on my side, don't we have a bad reference > accounting if e.g.: > > - uarg is attached to multiple skbs, each holding a ref, > - there is a failure on 'getfrag()' > > Such failure will release 2 references (1 kfree_skb(), and another in > the above sock_zerocopy_put_abort(), as the count is still positive).
Thanks Paolo. Indeed, I had not anticipated the case of partial failure, where more than one skb is allocated (or converted to zerocopy) and only the last one is freed on error inside __ip_append_data. The callers of this function do flush the queue of the other skbs on error, but only after the call to sock_zerocopy_put_abort. sock_zerocopy_put_abort depends on total rollback to revert the sk_zckey increment and suppress the completion notification (which must not happen on return with error). I don't immediately have a fix. Need to think about this some more..