On 12/10/2018 06:49 AM, Florian Westphal wrote:
> The (out-of-tree) Multipath-TCP implementation needs a significant amount
> of extra space in the skb control buffer.
Which skbs? Input or output path?
>
> Increasing skb->cb[] size in mainline is a non-starter for memory and
> performance reasons (e.g. an increase in cb size also moves several
> frequently-accessed fields to other cache lines).
>
> One approach that might work for MPTCP is to extend skb_shared_info instead
> of sk_buff. However, this comes with other drawbacks, e.g. it either
> needs special skb allocation to make sure there is enough space for such
> 'extended shinfo' at the end of data buffer (which would make this only
> useable for the MPTCP tx path) or such a change would increase size of
> skb_shared_info.
>
> This work adds an extension infrastructure for sk_buff:
> 1. extension memory is released when the sk_buff is free'd.
> 2. data is shared after cloning an skb.
>
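To make the two properties listed above concrete, here is a rough userspace
model. It is not the code from the patch: every name in it is made up for
illustration, the 64-byte payload is arbitrary, and C11 atomics stand in for
whatever refcount primitive the kernel code would use. The extension carries
a reference count, a clone takes a reference instead of copying, and the last
free releases the memory.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical extension area; the payload size is arbitrary. */
struct toy_ext {
	atomic_int refcnt;
	char data[64];
};

/* Toy stand-in for struct sk_buff: only the extension pointer is modeled. */
struct ext_skb {
	struct toy_ext *ext;	/* NULL when no extension is attached */
};

/* Attach a fresh, zeroed extension; assumes none is attached yet. */
static int toy_ext_add(struct ext_skb *skb)
{
	struct toy_ext *ext = calloc(1, sizeof(*ext));

	if (!ext)
		return -1;
	atomic_init(&ext->refcnt, 1);
	skb->ext = ext;
	return 0;
}

/* Clone: share the extension and take one more reference (atomic inc). */
static struct ext_skb *ext_skb_clone(const struct ext_skb *skb)
{
	struct ext_skb *n = malloc(sizeof(*n));

	if (!n)
		return NULL;
	n->ext = skb->ext;
	if (n->ext)
		atomic_fetch_add(&n->ext->refcnt, 1);
	return n;
}

/* Free: drop one reference (atomic dec); the last holder frees the extension. */
static void ext_skb_free(struct ext_skb *skb)
{
	if (!skb)
		return;
	/* atomic_fetch_sub() returns the value before the decrement */
	if (skb->ext && atomic_fetch_sub(&skb->ext->refcnt, 1) == 1)
		free(skb->ext);
	free(skb);
}

/* Walk through the lifecycle: add, clone, free original, free clone. */
int ext_demo(void)
{
	struct ext_skb *skb = calloc(1, sizeof(*skb));
	struct ext_skb *clone;

	assert(skb && toy_ext_add(skb) == 0);
	skb->ext->data[0] = 42;

	clone = ext_skb_clone(skb);
	assert(clone && clone->ext == skb->ext);	/* shared, not copied */
	assert(atomic_load(&clone->ext->refcnt) == 2);

	ext_skb_free(skb);				/* extension survives */
	assert(atomic_load(&clone->ext->refcnt) == 1);
	assert(clone->ext->data[0] == 42);

	ext_skb_free(clone);				/* last reference: released */
	return 0;
}
```

In this model every clone and every free of an skb carrying an extension
costs one atomic operation on the shared counter.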
This seems to add atomic increments and decrements all over the place,
plus code bloat, for one very specific reason:
skb->cb[] is too small.
We do not want to increase skb->cb[] for two reasons, the first one being the
killer.
1) we clear it at skb allocation, and copy it at skb cloning.
2) extra memory cost.
Why can't we have another field, skb->cb2[], that is not cleared or copied by
the skb functions at all?
Each layer using skb->cb2[] would be fully responsible for managing it.
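
A rough userspace model of the idea, under stated assumptions: the struct is
a toy and not the real sk_buff layout, CB2_SIZE is a placeholder, and the
"layer" helpers and their state are hypothetical. The alloc and clone paths
touch cb[] only; cb2[] is left entirely to its owner.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CB_SIZE  48	/* size of skb->cb[] in mainline */
#define CB2_SIZE 32	/* placeholder, no real size has been proposed */

/* Toy stand-in for struct sk_buff. */
struct cb2_skb {
	char cb[CB_SIZE];	/* cleared at alloc, copied at clone */
	char cb2[CB2_SIZE];	/* never touched by the generic paths */
};

static struct cb2_skb *cb2_skb_alloc(void)
{
	struct cb2_skb *skb = malloc(sizeof(*skb));

	if (!skb)
		return NULL;
	memset(skb->cb, 0, sizeof(skb->cb));	/* cb2 is deliberately skipped */
	return skb;
}

static struct cb2_skb *cb2_skb_clone(const struct cb2_skb *skb)
{
	struct cb2_skb *n = malloc(sizeof(*n));

	if (!n)
		return NULL;
	memcpy(n->cb, skb->cb, sizeof(n->cb));	/* cb2 is deliberately skipped */
	return n;
}

/* Hypothetical per-layer private state kept in cb2[]. */
struct layer_state {
	unsigned int seq;
	unsigned int flags;
};

_Static_assert(sizeof(struct layer_state) <= CB2_SIZE, "state must fit in cb2");

/* The owning layer initializes cb2[] itself before any use. */
static void layer_init(struct cb2_skb *skb, unsigned int seq)
{
	struct layer_state st = { .seq = seq, .flags = 0 };

	memcpy(skb->cb2, &st, sizeof(st));
}

static unsigned int layer_seq(const struct cb2_skb *skb)
{
	struct layer_state st;

	memcpy(&st, skb->cb2, sizeof(st));
	return st.seq;
}

int cb2_demo(void)
{
	struct cb2_skb *skb = cb2_skb_alloc();
	struct cb2_skb *clone;

	assert(skb && skb->cb[0] == 0);		/* cb[] arrives zeroed */
	skb->cb[0] = 1;
	layer_init(skb, 7);

	clone = cb2_skb_clone(skb);
	assert(clone && clone->cb[0] == 1);	/* cb[] was copied */

	/* cb2[] was not copied: the layer carries its state over by hand. */
	layer_init(clone, layer_seq(skb));
	assert(layer_seq(clone) == 7);

	free(clone);
	free(skb);
	return 0;
}
```

The point of the sketch is that the generic alloc/clone cost does not grow
with CB2_SIZE. The price is that a freshly allocated or cloned skb holds
garbage in cb2[] until its owner writes it.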
I do not know what MPTCP's needs are, but I doubt that adding XX bytes to the
skb would have a serious memory cost for any MPTCP user.