On 6/22/2020 6:53 PM, Eric Dumazet wrote:


On 6/21/20 7:09 AM, Tariq Toukan wrote:
sock_orphan() call to sk_set_socket() implies clearing the sock TX queue.
This might cause unexpected out-of-order transmit, as outstanding packets
can pick a different TX queue and bypass the ones already queued.
This is undesired in general. More specifically, it breaks the in-order
scheduling property guarantee for device-offloaded TLS sockets.

Introduce a function variation __sk_set_socket() that does not clear
the TX queue, and call it from sock_orphan().
All other callers of sk_set_socket() do not operate on an active socket,
so they do not need this change.

Fixes: e022f0b4a03f ("net: Introduce sk_tx_queue_mapping")
Signed-off-by: Tariq Toukan <tar...@mellanox.com>
Reviewed-by: Boris Pismenny <bor...@mellanox.com>
---
  include/net/sock.h | 9 +++++++--
  1 file changed, 7 insertions(+), 2 deletions(-)

Please queue for -stable.

diff --git a/include/net/sock.h b/include/net/sock.h
index c53cc42b5ab9..23e43f3d79f0 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1846,10 +1846,15 @@ static inline int sk_rx_queue_get(const struct sock *sk)
  }
  #endif
+static inline void __sk_set_socket(struct sock *sk, struct socket *sock)
+{
+       sk->sk_socket = sock;
+}
+
  static inline void sk_set_socket(struct sock *sk, struct socket *sock)
  {
        sk_tx_queue_clear(sk);
-       sk->sk_socket = sock;
+       __sk_set_socket(sk, sock);
  }

Hmm...

I think we should have a single sk_set_socket() call, and remove
the sk_tx_queue_clear() from it.

sk_tx_queue_clear() should be put where it is needed, instead of being hidden
in sk_set_socket()


Thanks Eric, sounds good to me, I will respin, just have some questions below.

diff --git a/include/net/sock.h b/include/net/sock.h
index 
c53cc42b5ab92d0062519e60435b85c75564a967..3428619faae4340485b200f49d9cce4fb09086b3
 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1848,7 +1848,6 @@ static inline int sk_rx_queue_get(const struct sock *sk)
static inline void sk_set_socket(struct sock *sk, struct socket *sock)
  {
-       sk_tx_queue_clear(sk);
         sk->sk_socket = sock;
  }
diff --git a/net/core/sock.c b/net/core/sock.c
index 
6c4acf1f0220b1f925ebcfaa847632ec0dbe0b9b..134de0d37f77ba781b2b3341c94a97a1b2d57a2d
 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1767,6 +1767,7 @@ struct sock *sk_alloc(struct net *net, int family, gfp_t 
priority,
                 cgroup_sk_alloc(&sk->sk_cgrp_data);
                 sock_update_classid(&sk->sk_cgrp_data);
                 sock_update_netprioidx(&sk->sk_cgrp_data);
+               sk_tx_queue_clear(sk);

Why add it here?
I don't see a call to sk_set_socket().

         }
return sk;
@@ -1990,6 +1991,7 @@ struct sock *sk_clone_lock(const struct sock *sk, const 
gfp_t priority)
                  */
                 sk_refcnt_debug_inc(newsk);
                 sk_set_socket(newsk, NULL);
+               sk_tx_queue_clear(newsk);
                 RCU_INIT_POINTER(newsk->sk_wq, NULL);
if (newsk->sk_prot->sockets_allocated)


So in addition to sock_orphan(), now sk_tx_queue_clear() won't be called also from:
1. sock_graft(): Looks fine to me.
2. sock_init_data(): I think we should add an explicit call to sk_tx_queue_clear() here. The one for RX sk_rx_queue_clear() is already there.
3. mptcp_sock_graft(): Looks fine to me.

Regards,
Tariq

Reply via email to