> -----Original Message-----
> From: netdev-ow...@vger.kernel.org [mailto:netdev-
> ow...@vger.kernel.org] On Behalf Of David Miller
> Sent: Wednesday, February 14, 2018 21:27
> To: Jon Maloy <jon.ma...@ericsson.com>
> Cc: netdev@vger.kernel.org; Mohan Krishna Ghanta Krishnamurthy
> <mohan.krishna.ghanta.krishnamur...@ericsson.com>; Tung Quang Nguyen
> <tung.q.ngu...@dektech.com.au>; Hoang Huu Le
> <hoang.h...@dektech.com.au>; Canh Duc Luu
> <canh.d....@dektech.com.au>; Ying Xue <ying....@windriver.com>; tipc-
> discuss...@lists.sourceforge.net
> Subject: Re: [net-next 1/1] tipc: avoid unnecessary copying of bundled
> messages
> 
> From: Jon Maloy <jon.ma...@ericsson.com>
> Date: Wed, 14 Feb 2018 13:50:31 +0100
> 
> > diff --git a/net/tipc/msg.c b/net/tipc/msg.c index 4e1c6f6..a368fa8
> > 100644
> > --- a/net/tipc/msg.c
> > +++ b/net/tipc/msg.c
> > @@ -434,6 +434,9 @@ bool tipc_msg_extract(struct sk_buff *skb, struct
> sk_buff **iskb, int *pos)
> >     skb_pull(*iskb, offset);
> >     imsz = msg_size(buf_msg(*iskb));
> >     skb_trim(*iskb, imsz);
> > +
> > +   /* Scale extracted buffer's truesize to avoid double accounting */
> > +   (*iskb)->truesize = SKB_TRUESIZE(imsz);
> >     if (unlikely(!tipc_msg_validate(iskb)))
> >             goto none;
> >     *pos += align(imsz);
> 
> As Eric said, you have to be really careful here.
> 
> If you clone a 10K SKB 10 times, you really have to account for the full
> truesize 10 times.
> 
> That is unless you explicitly trim off frags in the new clone, then adjust the
> truesize by explicitly decreasing it by the amount of memory backing the
> frags you trimmed off completely (not partially).

The buffers we are cloning are linearized 1 MTU incoming buffers. There are no 
fragments. 
Each clone normally points to only a tiny fraction of the data area of the base 
buffer.
I don't claim that copying always is bad, but in this case it happens in the 
majority of cases, and as I see it completely unnecessarily.

There is actually some under accounting, however, since we now only count the 
data area of the base buffer (== the sum of the data area of the clones) plus 
the overhead of the clones.
A more accurate calculation, taking into account even the overhead of the base 
buffer, would look  like this:
(*iskb)->truesize =  SKB_TRUSIZE(imsz) + (skb->truesize - skb->len)  / 
msg_msgcnt(msg);

I.e.,  we calculate the overhead of the base buffer and divide it equally among 
the clones.
Now I really can't see we are missing anything.

BR
///jon

> 
> Finally, you can only do this on an SKB that has never entered a socket SKB
> queue, otherwise you corrupt memory accounting.

Reply via email to