----- m...@mojatatu.com wrote: > Liran Alon <liran.a...@oracle.com> writes: > > > [...] > > >> Overall I think it might be nice to not need scrubbing skb in such > >> cases, > >> although my concern would be that this has potential to break > >> existing > >> setups when they would expect mark being zero on other veth peer > in > >> any > >> case since it's the behavior for a long time already. The safer > >> option > >> would be to have some sort of explicit opt-in e.g. on link creation > to > >> let > >> the skb->mark pass through unscrubbed. This would definitely be a > >> useful > >> option e.g. when mark is set in the netns facing veth via > >> clsact/egress > >> on xmit and when the container is unprivileged anyway. > >> > >> Thanks, > >> Daniel > > > > I see your point in regards to backwards comparability. > > However, not scrubbing skb when it cross netns via some kernel > functions compared to > > others is basically a bug which could easily break with a little bit > of more refactoring. > > Therefore, it seems a bit weird to me to from now on, we will force > > every user on link creation to consider that once there was a bug > leading > > to this weird behavior on specific netdevs. > > Thus, I suggest to maybe control this via a global /proc/sys/net > file instead. > > One valid use case could be preserving a source namespace nsid in > skb->mark when a packet crosses netns.
Before and after this commit, veth peers that crosses netns zero skb->mark. Therefore, what you suggest is a new feature unrelated to the issue fixed by this commit. I still think that default behavior should be to zero skb->mark only when skb cross netdevs in different netns. And for backwards comparability, we can consider adding a /proc/sys/net/core file to let dev_forward_skb() to always scrub packet regardless if it crosses netdevs or not.