This series adds VLAN awareness to bpf_fib_lookup() in both directions. BPF_FIB_LOOKUP_VLAN resolves a VLAN egress to its underlying real device plus the VLAN tag (XDP programs need this because VLAN devices have no XDP xmit), and BPF_FIB_LOOKUP_VLAN_INPUT runs the lookup as if a tagged frame had arrived on the matching VLAN subinterface, for iif policy routing and VRF table selection.
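
As a rough illustration of how an XDP caller might act on an egress lookup done with BPF_FIB_LOOKUP_VLAN, the sketch below models the decision in plain C. Every sketch_* name and enum value is a hypothetical placeholder rather than a UAPI definition. How the tag is reported in the lookup result is an assumption made for the example, and passing the frame to the stack on the VLAN failure code described below is only one possible policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder result codes for this sketch only; the real values are
 * defined in include/uapi/linux/bpf.h and are not reproduced here. */
enum sketch_fib_ret {
	SKETCH_RET_SUCCESS,
	SKETCH_RET_OTHER,        /* any pre-existing non-success code */
	SKETCH_RET_VLAN_FAILURE, /* stands in for BPF_FIB_LKUP_RET_VLAN_FAILURE */
};

/* Hypothetical view of what the program reads back after the lookup. */
struct sketch_lookup {
	enum sketch_fib_ret ret;
	int ifindex;       /* real device on success, input ifindex on VLAN failure */
	bool has_vlan;     /* assumed: egress was a VLAN device that got resolved */
	uint16_t vlan_tci; /* assumed: tag to push before transmitting */
};

enum sketch_verdict {
	SKETCH_PASS_TO_STACK,
	SKETCH_REDIRECT,
};

struct sketch_decision {
	enum sketch_verdict verdict;
	int ifindex;       /* redirect target, meaningful only for SKETCH_REDIRECT */
	bool push_tag;     /* push vlan_tci onto the frame before redirecting */
	uint16_t vlan_tci;
};

/* Redirect only on success; any failure, including the VLAN failure code,
 * hands the frame to the regular stack instead of guessing an egress. */
static struct sketch_decision sketch_decide(const struct sketch_lookup *res)
{
	struct sketch_decision d = { SKETCH_PASS_TO_STACK, 0, false, 0 };

	assert(res != NULL);

	if (res->ret != SKETCH_RET_SUCCESS)
		return d;

	d.verdict = SKETCH_REDIRECT;
	d.ifindex = res->ifindex;
	d.push_tag = res->has_vlan;
	d.vlan_tci = res->has_vlan ? res->vlan_tci : 0;
	return d;
}
```

In a real program the redirect arm would push the tag onto the frame and then redirect to the returned ifindex, while the pass arm lets the stack handle the VLAN device itself.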

The independent l3mdev/VRF flow-init fix, patch 1 in v1 and v2, was split out and merged to bpf separately.

An unreducible VLAN egress (a QinQ egress, or a parent in another namespace) returns BPF_FIB_LKUP_RET_VLAN_FAILURE rather than a best-effort SUCCESS, so an XDP program cannot mistake it for a physical egress and silently blackhole the frame at xdp_do_flush(). The code is appended after BPF_FIB_LKUP_RET_NO_SRC_ADDR (nothing renumbered, tools/ mirror updated) and is returned only when BPF_FIB_LOOKUP_VLAN is set, so no existing caller can observe it. On that failure params->ifindex is left at the input; a program that wants the VLAN device's own ifindex re-issues without the flag.

Changes v4 -> v5 (Toke's review, https://lore.kernel.org/bpf/[email protected]/):
- Patch 1: BPF_FIB_LOOKUP_VLAN only makes sense for XDP, which cannot redirect to a VLAN device; a tc program can redirect to the VLAN device directly. So bpf_skb_fib_lookup() now rejects the flag with -EINVAL, and the fwd_dev out-parameter added in v4 is dropped: with the flag gone from the skb path there is no swap to preserve, so the deferred mtu check returns to the original dev_get_by_index_rcu(net, params->ifindex). The VLAN_FAILURE rewind moves into bpf_fib_set_fwd_params() via an input ifindex parameter, so each lookup ends in a plain "return bpf_fib_set_fwd_params(...)". The early params->ifindex = dev->ifindex that NO_NEIGH and NO_SRC_ADDR report stays where d1c362e1dd68a ("bpf: Always return target ifindex in bpf_fib_lookup") put it. Dropping fwd_dev also removes the i386 W=1 unused-variable warning the kernel test robot reported, since net is used again.
- Patch 2: no code change; add Toke's Reviewed-by.
- Patch 3: the BPF_FIB_LOOKUP_VLAN cases assert the tc helper returns -EINVAL and check the egress result on the XDP path, including dmac and (for tot_len cases) the route mtu_result; the cross-netns egress case runs through bpf_xdp_fib_lookup(); the obsolete skb-mtu-after-swap arm is dropped.

Changes v3 -> v4:
- Patch 1: return BPF_FIB_LKUP_RET_VLAN_FAILURE for an unreducible VLAN egress, leaving params->ifindex at the input, per Toke's v3 review.
- Patch 3: QinQ-egress and cross-namespace-egress arms expect VLAN_FAILURE; an escape-hatch arm re-issues without the flag; and a live-frames arm asserts a reducible egress is delivered and a QinQ egress is passed to the stack.

Taking the tag as lookup input follows the approach David Ahern suggested in the 2021 fwmark discussion:
https://lore.kernel.org/bpf/[email protected]/

v4: https://lore.kernel.org/all/[email protected]/
v3: https://lore.kernel.org/all/[email protected]/
v2: https://lore.kernel.org/all/[email protected]/
v1: https://lore.kernel.org/all/[email protected]/

Avinash Duduskar (3):
  bpf: Add BPF_FIB_LOOKUP_VLAN flag to bpf_fib_lookup() helper
  bpf: Add BPF_FIB_LOOKUP_VLAN_INPUT flag to bpf_fib_lookup() helper
  selftests/bpf: Add bpf_fib_lookup() VLAN flag tests

 include/uapi/linux/bpf.h                      |  50 +-
 net/core/filter.c                             |  97 ++-
 tools/include/uapi/linux/bpf.h                |  50 +-
 .../selftests/bpf/prog_tests/fib_lookup.c     | 717 +++++++++++++++++-
 .../testing/selftests/bpf/progs/fib_lookup.c  |  36 +
 5 files changed, 936 insertions(+), 14 deletions(-)

base-commit: a975094bf98ca97be9146f9d3b5681a6f9cf5ce3
-- 
2.54.0

