On Sun, Aug 3, 2025 at 10:54 AM Jason Xing <[email protected]> wrote: > > On Sun, Aug 3, 2025 at 9:59 AM Menglong Dong <[email protected]> wrote: > > > > On Sat, Aug 2, 2025 at 9:06 PM Jason Xing <[email protected]> wrote: > > > > > > Hi Menglong, > > > > > > On Sat, Aug 2, 2025 at 5:28 PM Menglong Dong <[email protected]> > > > wrote: > > > > > > > > For now, the tcp socket lookup will terminate if the socket is reuse > > > > port > > > > in inet_lhash2_lookup(), which makes the socket is not the best match. > > > > > > > > For example, we have socket1 and socket2 both listen on "0.0.0.0:1234", > > > > but socket1 bind on "eth0". We create socket1 first, and then socket2. > > > > Then, all connections will goto socket2, which is not expected, as > > > > socket1 > > > > has higher priority. > > > > > > > > This can cause unexpected behavior if TCP MD5 keys is used, as described > > > > in Documentation/networking/vrf.rst -> Applications. > > > > > > > > Therefor, we lookup the best matched socket first, and then do the reuse > > > > > > s/Therefor/Therefore > > > > > > > port logic. This can increase some overhead if there are many reuse port > > > > socket :/ > > > > > > > > Fixes: c125e80b8868 ("soreuseport: fast reuseport TCP socket selection") > > > > Signed-off-by: Menglong Dong <[email protected]> > > > > --- > > > > v3: > > > > * use the approach in V1 > > > > * add the Fixes tag > > > > --- > > > > net/ipv4/inet_hashtables.c | 13 +++++++------ > > > > net/ipv6/inet6_hashtables.c | 13 +++++++------ > > > > 2 files changed, 14 insertions(+), 12 deletions(-) > > > > > > > > diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c > > > > index ceeeec9b7290..51751337f394 100644 > > > > --- a/net/ipv4/inet_hashtables.c > > > > +++ b/net/ipv4/inet_hashtables.c > > > > @@ -389,17 +389,18 @@ static struct sock *inet_lhash2_lookup(const > > > > struct net *net, > > > > sk_nulls_for_each_rcu(sk, node, &ilb2->nulls_head) { > > > > score = compute_score(sk, net, hnum, daddr, dif, sdif); > > > > if (score > hiscore) { > > > > - result = inet_lookup_reuseport(net, sk, skb, > > > > doff, > > > > - saddr, sport, > > > > daddr, hnum, inet_ehashfn); > > > > - if (result) > > > > - return result; > > > > - > > > > result = sk; > > > > hiscore = score; > > > > } > > > > } > > > > > > > > - return result; > > > > + if (!result) > > > > + return NULL; > > > > + > > > > + sk = inet_lookup_reuseport(net, result, skb, doff, > > > > + saddr, sport, daddr, hnum, > > > > inet_ehashfn); > > > > + > > > > + return sk ? sk : result; > > > > } > > > > > > IMHO, I don't see it as a bugfix. So can you elaborate on what the exact > > > side effect you're faced with is when the algorithm finally prefers > > > socket2 (without > > > this patch)? > > > > Hi, Jason. The case is that the user has several NIC, > > and there are some sockets that are binded to them, > > who listen on TCP port 6666. And a global socket doesn't > > bind any NIC and listens on TCP port 6666. > > > > In theory, the connection request from the NIC will goto > > the listen socket that is binded on it. When on socket > > is binded on the NIC, it goto the global socket. However, > > the connection request always goto the global socket, which > > is not expected. > > > > What's worse is that when TCP MD5 is used on the socket, > > the connection will fail :/ > > I'm trying to picture what the usage can be in the userland as you > pointed out in the MD5 case. As to the client side, it seems weird > since it cannot detect and know the priority of the other side where a > few sockets listen on the same address and port.
For the server side, it needs to add all the clients information with the TCP_MD5SIG option. For socket1 that binded on the eth0, it will add all the client addresses that come from eth0 to the socket1 with TCP_MD5SIG. So the server knows the clients. And in my use case, the TCP MD5 + VRF is used. The details are a little fuzzy for me, which I need to do some recalling :/ > > I'm not saying the priority problem doesn't exist, just not knowing > how severe the case could be. It doesn't look that bad at least until > now. Only the selection policy itself matters more to the server side > than to the client side. > > Thanks, > Jason

