On 5/1/2016 4:10 PM, Sowmini Varadhan wrote:
An arbitration scheme for duelling SYNs is implemented as part of
commit 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an
outgoing socket in rds_tcp_accept_one()") which ensures that both nodes
involved will arrive at the same arbitration decision. However, this
needs to be synchronized with an outgoing SYN to be generated by
rds_tcp_conn_connect(). This commit achieves the synchronization
through the t_conn_lock mutex in struct rds_tcp_connection.
The rds_conn_state is checked in rds_tcp_conn_connect() after acquiring
the t_conn_lock mutex. A SYN is sent out only if the RDS connection is
not already UP (an UP would indicate that rds_tcp_accept_one() has
completed 3WH, so no SYN needs to be generated).
Similarly, the rds_conn_state is checked in rds_tcp_accept_one() after
acquiring the t_conn_lock mutex. The only acceptable states (to
allow continuation of the arbitration logic) are UP (i.e., outgoing SYN
was SYN-ACKed by peer after it sent us the SYN) or CONNECTING (we sent
outgoing SYN before we saw incoming SYN).
Signed-off-by: Sowmini Varadhan <sowmini.varad...@oracle.com>
---
net/rds/tcp.c | 1 +
net/rds/tcp.h | 4 ++++
net/rds/tcp_connect.c | 8 ++++++++
net/rds/tcp_listen.c | 30 ++++++++++++++++++++----------
4 files changed, 33 insertions(+), 10 deletions(-)
[...]
diff --git a/net/rds/tcp_connect.c b/net/rds/tcp_connect.c
index 5cb1687..49a3fcf 100644
--- a/net/rds/tcp_connect.c
+++ b/net/rds/tcp_connect.c
@@ -78,7 +78,14 @@ int rds_tcp_conn_connect(struct rds_connection *conn)
struct socket *sock = NULL;
struct sockaddr_in src, dest;
int ret;
+ struct rds_tcp_connection *tc = conn->c_transport_data;
+
+ mutex_lock(&tc->t_conn_lock);
+ if (rds_conn_up(conn)) {
+ mutex_unlock(&tc->t_conn_lock);
+ return 0;
+ }
ret = sock_create_kern(rds_conn_net(conn), PF_INET,
SOCK_STREAM, IPPROTO_TCP, &sock);
if (ret < 0)
@@ -120,6 +127,7 @@ int rds_tcp_conn_connect(struct rds_connection *conn)
}
out:
+ mutex_unlock(&tc->t_conn_lock);
Just wondering whether the spin_lock() would better here considering
entry into rds_tcp_conn_connect() & rds_tcp_accept_one() might be
from softirq context. Ignore it if its not applicable.
if (sock)
sock_release(sock);
return ret;
diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c
index 0896187..cc8496f 100644
--- a/net/rds/tcp_listen.c
+++ b/net/rds/tcp_listen.c
@@ -76,7 +76,9 @@ int rds_tcp_accept_one(struct socket *sock)
struct rds_connection *conn;
int ret;
struct inet_sock *inet;
- struct rds_tcp_connection *rs_tcp;
+ struct rds_tcp_connection *rs_tcp = NULL;
+ int conn_state;
+ struct sock *nsk;
ret = sock_create_kern(sock_net(sock->sk), sock->sk->sk_family,
sock->sk->sk_type, sock->sk->sk_protocol,
@@ -116,6 +118,10 @@ int rds_tcp_accept_one(struct socket *sock)
*/
rs_tcp = (struct rds_tcp_connection *)conn->c_transport_data;
rds_conn_transition(conn, RDS_CONN_DOWN, RDS_CONN_CONNECTING);
Like patch 1/2, probably we can leverage return value of above.
+ conn_state = rds_conn_state(conn);
+ if (conn_state != RDS_CONN_CONNECTING && conn_state != RDS_CONN_UP)
You probably don't need the local 'conn_state' and below should work.
if (!rds_conn_connecting(conn) && !rds_conn_up(conn))
Regards,
Santosh