On Thu, 18 Oct 2018 at 22:07, Martin Lau <ka...@fb.com> wrote: > > On Thu, Oct 18, 2018 at 04:52:40PM -0700, Joe Stringer wrote: > > On Thu, 18 Oct 2018 at 14:20, Daniel Borkmann <dan...@iogearbox.net> wrote: > > > > > > On 10/18/2018 11:06 PM, Joe Stringer wrote: > > > > On Thu, 18 Oct 2018 at 11:54, Nitin Hande <nitin.ha...@gmail.com> wrote: > > > [...] > > > >> Open Issue > > > >> * The underlying code relies on presence of an skb to find out the > > > >> right sk for the case of REUSEPORT socket option. Since there is > > > >> no skb available at XDP hookpoint, the helper function will return > > > >> the first available sk based off the 5 tuple hash. If the desire > > > >> is to return a particular sk matching reuseport_cb function, please > > > >> suggest way to tackle it, which can be addressed in a future commit. > > > > > > >> Signed-off-by: Nitin Hande <nitin.ha...@gmail.com> > > > > > > > > Thanks Nitin, LGTM overall. > > > > > > > > The REUSEPORT thing suggests that the usage of this helper from XDP > > > > layer may lead to a different socket being selected vs. the equivalent > > > > call at TC hook, or other places where the selection may occur. This > > > > could be a bit counter-intuitive. > > > > > > > > One thought I had to work around this was to introduce a flag, > > > > something like BPF_F_FIND_REUSEPORT_SK_BY_HASH. This flag would > > > > effectively communicate in the API that the bpf_sk_lookup_xxx() > > > > functions will only select a REUSEPORT socket based on the hash and > > > > not by, for example BPF_PROG_TYPE_SK_REUSEPORT programs. The absence > > > > of the flag would support finding REUSEPORT sockets by other > > > > mechanisms (which would be allowed for now from TC hooks but would be > > > > disallowed from XDP, since there's no specific plan to support this). > > > > > > Hmm, given skb is NULL here the only way to lookup the socket in such > > > scenario is based on hash, that is, inet_ehashfn() / inet6_ehashfn(), > > > perhaps alternative is to pass this hash in from XDP itself to the > > > helper so it could be custom selector. Do you have a specific use case > > > on this for XDP (just curious)? > > > > I don't have a use case for SO_REUSEPORT introspection from XDP, so > > I'm primarily thinking from the perspective of making the behaviour > > clear in the API in a way that leaves open the possibility for a > > reasonable implementation in future. From that perspective, my main > > concern is that it may surprise some BPF writers that the same > > "bpf_sk_lookup_tcp()" call (with identical parameters) may have > > different behaviour at TC vs. XDP layers, as the BPF selection of > > sockets is respected at TC but not at XDP. > > > > FWIW we're already out of parameters for the actual call, so if we > > wanted to allow passing a hash in, we'd need to either dedicate half > > the 'flags' field for this configurable hash, or consider adding the > > new hash parameter to 'struct bpf_sock_tuple'. > > > > +Martin for any thoughts on SO_REUSEPORT and XDP here. > The XDP/TC prog has read access to the sk fields through > 'struct bpf_sock'? > > A quick thought... > Considering all sk in the same reuse->socks[] share > many things (e.g. family,type,protocol,ip,port..etc are the same), > I wonder returning which particular sk from reuse->socks[] will > matter too much since most of the fields from 'struct bpf_sock' will > be the same. Some of fields in 'struct bpf_sock' could be different > though, like priority? Hence, another possibility is to limit the > accessible fields for the XDP prog. Only allow accessing the fields > that must be the same among the sk in the same reuse->socks[].
This sounds pretty reasonable to me.