On Mon, May 11, 2020 at 08:52 PM CEST, Jakub Sitnicki wrote:
> Add a new program type BPF_PROG_TYPE_SK_LOOKUP and a dedicated attach type
> called BPF_SK_LOOKUP. The new program kind is to be invoked by the
> transport layer when looking up a socket for a received packet.
>
> When called, an SK_LOOKUP program can select the socket that will receive
> the packet. This serves as a mechanism to overcome the limits of what the
> bind() API can express. Two use-cases driving this work are:
>
> (1) steer packets destined to an IP range, on a fixed port, to a socket
>
>     192.0.2.0/24, port 80 -> NGINX socket
>
> (2) steer packets destined to an IP address, on any port, to a socket
>
>     198.51.100.1, any port -> L7 proxy socket
>
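To make the two use-cases above concrete, here is a minimal user-space
sketch of the matching rules only. It is not code from the patch: the
steer() function, the ipv4() helper, and the steer_target labels are
hypothetical names made up for illustration, and addresses are assumed to
be in host byte order. A real SK_LOOKUP program would read the local
address and port from its context and select a socket from a map instead
of returning a label.

```c
#include <stdint.h>

/* Hypothetical labels standing in for sockets held in a map. */
enum steer_target { STEER_NONE, STEER_NGINX, STEER_L7_PROXY };

/* Build a host-byte-order IPv4 address from its dotted-quad parts. */
static uint32_t ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	       ((uint32_t)c << 8) | (uint32_t)d;
}

/* Decide where a packet goes based on its local (destination) address. */
static enum steer_target steer(uint32_t local_ip, uint16_t local_port)
{
	/* (1) 192.0.2.0/24, port 80 -> NGINX socket */
	if ((local_ip & 0xffffff00u) == ipv4(192, 0, 2, 0) && local_port == 80)
		return STEER_NGINX;

	/* (2) 198.51.100.1, any port -> L7 proxy socket */
	if (local_ip == ipv4(198, 51, 100, 1))
		return STEER_L7_PROXY;

	/* No rule matched; the regular socket lookup would apply. */
	return STEER_NONE;
}
```

With these rules, 192.0.2.77 port 80 maps to STEER_NGINX, 198.51.100.1
port 12345 maps to STEER_L7_PROXY, and 192.0.2.77 port 443 matches
nothing.
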
> In its run-time context, the program receives information about the packet
> that triggered the socket lookup, namely the IP version, L4 protocol
> identifier, and address 4-tuple. The context can be further extended to
> include the ingress interface identifier.
>
> To select a socket, the BPF program fetches it from a map holding socket
> references, like SOCKMAP or SOCKHASH, and calls the bpf_sk_assign(ctx, sk,
> ...) helper to record the selection. The transport layer then uses the
> selected socket as the result of the socket lookup.
>
> This patch only enables the user to attach an SK_LOOKUP program to a
> network namespace. Subsequent patches hook it up to run on the local
> delivery path in the ipv4 and ipv6 stacks.
>
> Suggested-by: Marek Majkowski <[email protected]>
> Reviewed-by: Lorenz Bauer <[email protected]>
> Signed-off-by: Jakub Sitnicki <[email protected]>
> ---
>
> Notes:
>     v2:
>     - Make bpf_sk_assign reject sockets that don't use RCU freeing.
>       Update bpf_sk_assign docs accordingly. (Martin)
>     - Change bpf_sk_assign proto to take PTR_TO_SOCKET as argument. (Martin)
>     - Fix broken build when CONFIG_INET is not selected. (Martin)
>     - Rename bpf_sk_lookup{} src_/dst_* fields to remote_/local_*. (Martin)

I forgot to call out one more change in v2 to this patch:

    - Enforce BPF_SK_LOOKUP attach point on load & attach. (Martin)

[...]