On 02/25/2019 02:10 AM, Daniel Borkmann wrote:
> My understanding is that before doing any writes into skb, we should make
> sure the data area is private to us (and offset in linear data). In tc BPF
> (ingress, egress) we use bpf_try_make_writable() helper for this, others
> like act_{pedit,skbmod} or ovs have similar logic before writing into skb,
> note that in all these cases it's mostly about generic writes, so location
> could also be L4, for example.
>
> The difference between the above helper and the net/sched/sch_*.c instances
> could be that in the qdisc case INET_ECN_set_ce() is only called on egress,
> and that there may be a convention that qdiscs specifically may mangle
> the packet, whereas the helper could be called on ingress and egress and
> confuse other subsystems, since they would either not see the original
> packet or race and see a partially updated (invalid) one.
>
> Eric, do you have a chance to clarify? Perhaps it would then make sense to
> disallow the helper in the cgroup ingress path.
Good observations Daniel, thanks for bringing this up.
skb_ensure_writable() seems like a big hammer for the case where we only
change a few bits in the IP header.
TCP cloned packets certainly can have their headers mangled, so maybe
we need something based on skb_header_cloned() instead of skb_cloned().
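
To illustrate the distinction, here is a minimal userspace sketch, not kernel
code. It assumes the usual dataref layout (low 16 bits count all references to
the data area, high 16 bits count payload-only references). All toy_* names
are hypothetical and exist only for this illustration. For a TCP-style clone
whose original has released its header, the skb_cloned()-style check reports
sharing while the skb_header_cloned()-style check does not:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; assumed layout: low 16 bits = all refs, high 16 = payload-only refs. */
#define TOY_DATAREF_SHIFT 16
#define TOY_DATAREF_MASK  ((1u << TOY_DATAREF_SHIFT) - 1)

struct toy_shinfo { uint32_t dataref; };

struct toy_skb {
	bool cloned;               /* data area may be shared with another skb */
	bool nohdr;                /* this skb promised not to touch the header */
	struct toy_shinfo *shinfo; /* shared between an skb and its clones */
};

/* Models skb_cloned(): true if anyone else references the data area. */
static inline bool toy_skb_cloned(const struct toy_skb *skb)
{
	return skb->cloned && (skb->shinfo->dataref & TOY_DATAREF_MASK) != 1;
}

/* Models skb_header_cloned(): true only if another skb may still use the header. */
static inline bool toy_skb_header_cloned(const struct toy_skb *skb)
{
	uint32_t d;

	if (!skb->cloned)
		return false;
	d = skb->shinfo->dataref;
	return (d & TOY_DATAREF_MASK) - (d >> TOY_DATAREF_SHIFT) != 1;
}

/* Models a header release: the owner keeps only a payload reference. */
static inline void toy_header_release(struct toy_skb *skb)
{
	skb->nohdr = true;
	skb->shinfo->dataref += 1u << TOY_DATAREF_SHIFT;
}

/* Models a clone: a new skb sharing the same data area. */
static inline struct toy_skb toy_clone(struct toy_skb *skb)
{
	struct toy_skb c = { .cloned = true, .nohdr = false, .shinfo = skb->shinfo };

	skb->cloned = true;
	skb->shinfo->dataref += 1;
	return c;
}
```

With a check of this shape, a cloned TCP packet could have its IP header
modified in place without a copy, while a plain clone (no header release on
the original) would still be treated as shared.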