> -----Original Message----- > From: netdev-ow...@vger.kernel.org [mailto:netdev- > ow...@vger.kernel.org] On Behalf Of David Miller > Sent: Wednesday, February 14, 2018 21:27 > To: Jon Maloy <jon.ma...@ericsson.com> > Cc: netdev@vger.kernel.org; Mohan Krishna Ghanta Krishnamurthy > <mohan.krishna.ghanta.krishnamur...@ericsson.com>; Tung Quang Nguyen > <tung.q.ngu...@dektech.com.au>; Hoang Huu Le > <hoang.h...@dektech.com.au>; Canh Duc Luu > <canh.d....@dektech.com.au>; Ying Xue <ying....@windriver.com>; tipc- > discuss...@lists.sourceforge.net > Subject: Re: [net-next 1/1] tipc: avoid unnecessary copying of bundled > messages > > From: Jon Maloy <jon.ma...@ericsson.com> > Date: Wed, 14 Feb 2018 13:50:31 +0100 > > > diff --git a/net/tipc/msg.c b/net/tipc/msg.c index 4e1c6f6..a368fa8 > > 100644 > > --- a/net/tipc/msg.c > > +++ b/net/tipc/msg.c > > @@ -434,6 +434,9 @@ bool tipc_msg_extract(struct sk_buff *skb, struct > sk_buff **iskb, int *pos) > > skb_pull(*iskb, offset); > > imsz = msg_size(buf_msg(*iskb)); > > skb_trim(*iskb, imsz); > > + > > + /* Scale extracted buffer's truesize to avoid double accounting */ > > + (*iskb)->truesize = SKB_TRUESIZE(imsz); > > if (unlikely(!tipc_msg_validate(iskb))) > > goto none; > > *pos += align(imsz); > > As Eric said, you have to be really careful here. > > If you clone a 10K SKB 10 times, you really have to account for the full > truesize 10 times. > > That is unless you explicitly trim off frags in the new clone, then adjust the > truesize by explicitly decreasing it by the amount of memory backing the > frags you trimmed off completely (not partially).
The buffers we are cloning are linearized 1 MTU incoming buffers. There are no fragments. Each clone normally points to only a tiny fraction of the data area of the base buffer. I don't claim that copying always is bad, but in this case it happens in the majority of cases, and as I see it completely unnecessarily. There is actually some under accounting, however, since we now only count the data area of the base buffer (== the sum of the data area of the clones) plus the overhead of the clones. A more accurate calculation, taking into account even the overhead of the base buffer, would look like this: (*iskb)->truesize = SKB_TRUSIZE(imsz) + (skb->truesize - skb->len) / msg_msgcnt(msg); I.e., we calculate the overhead of the base buffer and divide it equally among the clones. Now I really can't see we are missing anything. BR ///jon > > Finally, you can only do this on an SKB that has never entered a socket SKB > queue, otherwise you corrupt memory accounting.