On Sun, Aug 3, 2025 at 10:54 AM Jason Xing <[email protected]> wrote:
>
> On Sun, Aug 3, 2025 at 9:59 AM Menglong Dong <[email protected]> wrote:
> >
> > On Sat, Aug 2, 2025 at 9:06 PM Jason Xing <[email protected]> wrote:
> > >
> > > Hi Menglong,
> > >
> > > On Sat, Aug 2, 2025 at 5:28 PM Menglong Dong <[email protected]> 
> > > wrote:
> > > >
> > > > For now, the tcp socket lookup will terminate if the socket is reuse 
> > > > port
> > > > in inet_lhash2_lookup(), which makes the socket is not the best match.
> > > >
> > > > For example, we have socket1 and socket2 both listen on "0.0.0.0:1234",
> > > > but socket1 bind on "eth0". We create socket1 first, and then socket2.
> > > > Then, all connections will goto socket2, which is not expected, as 
> > > > socket1
> > > > has higher priority.
> > > >
> > > > This can cause unexpected behavior if TCP MD5 keys is used, as described
> > > > in Documentation/networking/vrf.rst -> Applications.
> > > >
> > > > Therefor, we lookup the best matched socket first, and then do the reuse
> > >
> > > s/Therefor/Therefore
> > >
> > > > port logic. This can increase some overhead if there are many reuse port
> > > > socket :/
> > > >
> > > > Fixes: c125e80b8868 ("soreuseport: fast reuseport TCP socket selection")
> > > > Signed-off-by: Menglong Dong <[email protected]>
> > > > ---
> > > > v3:
> > > > * use the approach in V1
> > > > * add the Fixes tag
> > > > ---
> > > >  net/ipv4/inet_hashtables.c  | 13 +++++++------
> > > >  net/ipv6/inet6_hashtables.c | 13 +++++++------
> > > >  2 files changed, 14 insertions(+), 12 deletions(-)
> > > >
> > > > diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
> > > > index ceeeec9b7290..51751337f394 100644
> > > > --- a/net/ipv4/inet_hashtables.c
> > > > +++ b/net/ipv4/inet_hashtables.c
> > > > @@ -389,17 +389,18 @@ static struct sock *inet_lhash2_lookup(const 
> > > > struct net *net,
> > > >         sk_nulls_for_each_rcu(sk, node, &ilb2->nulls_head) {
> > > >                 score = compute_score(sk, net, hnum, daddr, dif, sdif);
> > > >                 if (score > hiscore) {
> > > > -                       result = inet_lookup_reuseport(net, sk, skb, 
> > > > doff,
> > > > -                                                      saddr, sport, 
> > > > daddr, hnum, inet_ehashfn);
> > > > -                       if (result)
> > > > -                               return result;
> > > > -
> > > >                         result = sk;
> > > >                         hiscore = score;
> > > >                 }
> > > >         }
> > > >
> > > > -       return result;
> > > > +       if (!result)
> > > > +               return NULL;
> > > > +
> > > > +       sk = inet_lookup_reuseport(net, result, skb, doff,
> > > > +                                  saddr, sport, daddr, hnum, 
> > > > inet_ehashfn);
> > > > +
> > > > +       return sk ? sk : result;
> > > >  }
> > >
> > > IMHO, I don't see it as a bugfix. So can you elaborate on what the exact
> > > side effect you're faced with is when the algorithm finally prefers
> > > socket2 (without
> > > this patch)?
> >
> > Hi, Jason. The case is that the user has several NIC,
> > and there are some sockets that are binded to them,
> > who listen on TCP port 6666. And a global socket doesn't
> > bind any NIC and listens on TCP port 6666.
> >
> > In theory, the connection request from the NIC will goto
> > the listen socket that is binded on it. When on socket
> > is binded on the NIC, it goto the global socket. However,
> > the connection request always goto the global socket, which
> > is not expected.
> >
> > What's worse is that when TCP MD5 is used on the socket,
> > the connection will fail :/
>
> I'm trying to picture what the usage can be in the userland as you
> pointed out in the MD5 case. As to the client side, it seems weird
> since it cannot detect and know the priority of the other side where a
> few sockets listen on the same address and port.

For the server side, it needs to add all the clients information
with the TCP_MD5SIG option. For socket1 that binded on the
eth0, it will add all the client addresses that come from eth0 to the
socket1 with TCP_MD5SIG. So the server knows the clients.

And in my use case, the TCP MD5 + VRF is used. The details are
a little fuzzy for me, which I need to do some recalling :/

>
> I'm not saying the priority problem doesn't exist, just not knowing
> how severe the case could be. It doesn't look that bad at least until
> now. Only the selection policy itself matters more to the server side
> than to the client side.
>
> Thanks,
> Jason

Reply via email to