On Wed, Nov 23, 2005 at 07:01:59PM +0100, Olaf Kirch wrote:
> We've seen this previously, and submitted a fix to netfilter which
> supposedly went into mainline at some point. It seems to be gone
> from 2.6.14 though.

And here are the two patches that came out of the netfilter discussion
at that time.

Olaf
-- 
Olaf Kirch        |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax

Subject: Keep netfilter from dropping ACK probes for half-open connections
From: Jozsef Kadlecsik
Patch-mainline: submitted by Jozsef Kadlecsik
References: SUSE46818

Mounting NFS file systems after a (warm) reboot could take a long time if
firewalling and connection tracking were enabled.

The reason is that the NFS client tends to use the same ports (800 and
counting down). Now on reboot, the server would still have a TCB for an
existing TCP connection client:800 -> server:2049. The client sends a SYN
from port 800 to server:2049, which elicits an ACK from the server. The
firewall on the client drops the ACK because (from its point of view) the
connection is still in half-open state, and it expects to see a SYNACK.

The client will eventually time out after several minutes.

The following patch corrects this by accepting ACKs on half-open
connections as well.

Acked-By: Olaf Kirch <[EMAIL PROTECTED]>

Index: linux-2.6.8/include/linux/netfilter_ipv4/ip_conntrack_tcp.h
===================================================================
--- linux-2.6.8.orig/include/linux/netfilter_ipv4/ip_conntrack_tcp.h
+++ linux-2.6.8/include/linux/netfilter_ipv4/ip_conntrack_tcp.h
@@ -18,7 +18,7 @@ enum tcp_conntrack {
 };
 
 /* Window scaling is advertised by the sender */
-#define IP_CT_TCP_STATE_FLAG_WINDOW_SCALE	0x01
+#define IP_CT_TCP_FLAG_WINDOW_SCALE		0x01
 
 /* SACK is permitted by the sender */
 #define IP_CT_TCP_FLAG_SACK_PERM		0x02
Index: linux-2.6.8/net/ipv4/netfilter/ip_conntrack_proto_tcp.c
===================================================================
--- linux-2.6.8.orig/net/ipv4/netfilter/ip_conntrack_proto_tcp.c
+++ linux-2.6.8/net/ipv4/netfilter/ip_conntrack_proto_tcp.c
@@ -273,9 +273,9 @@ static enum tcp_conntrack tcp_conntracks
  *	sCL -> sCL
  */
 /* 	     sNO, sSS, sSR, sES, sFW, sCW, sLA, sTW, sCL, sLI	*/
-/*ack*/	   { sIV, sIV, sIV, sES, sCW, sCW, sTW, sTW, sCL, sIV },
+/*ack*/	   { sIV, sIG, sIV, sES, sCW, sCW, sTW, sTW, sCL, sIV },
 /*
- *	sSS -> sIV	ACK is invalid: we haven't seen a SYN/ACK yet.
+ *	sSS -> sIG	Might be a half-open connection.
  *	sSR -> sIV	Simultaneous open.
  *	sES -> sES	:-)
  *	sFW -> sCW	Normal close request answered by ACK.
@@ -436,7 +436,7 @@ static void tcp_options(const struct sk_
 					state->td_scale = 14;
 				}
 				state->flags |=
-					IP_CT_TCP_STATE_FLAG_WINDOW_SCALE;
+					IP_CT_TCP_FLAG_WINDOW_SCALE;
 			}
 			ptr += opsize - 2;
 			length -= opsize;
@@ -552,8 +552,8 @@ static int tcp_in_window(struct ip_ct_tc
 			 * Both sides must send the Window Scale option
 			 * to enable window scaling in either direction.
 			 */
-			if (!(sender->flags & IP_CT_TCP_STATE_FLAG_WINDOW_SCALE
-			      && receiver->flags & IP_CT_TCP_STATE_FLAG_WINDOW_SCALE))
+			if (!(sender->flags & IP_CT_TCP_FLAG_WINDOW_SCALE
+			      && receiver->flags & IP_CT_TCP_FLAG_WINDOW_SCALE))
 				sender->td_scale =
 				receiver->td_scale = 0;
 		} else {
@@ -566,9 +566,11 @@ static int tcp_in_window(struct ip_ct_tc
 			sender->td_maxwin = (win == 0 ? 1 : win);
 			sender->td_maxend = end + sender->td_maxwin;
 		}
-	} else if (state->state == TCP_CONNTRACK_SYN_SENT
-		   && dir == IP_CT_DIR_ORIGINAL
-		   && after(end, sender->td_end)) {
+	} else if (((state->state == TCP_CONNTRACK_SYN_SENT
+		     && dir == IP_CT_DIR_ORIGINAL)
+		    || (state->state == TCP_CONNTRACK_SYN_RECV
+			&& dir == IP_CT_DIR_REPLY))
+		   && after(end, sender->td_end)) {
 		/*
 		 * RFC 793: "if a TCP is reinitialized ... then it need
 		 * not wait at all; it must only be sure to use sequence
@@ -685,7 +687,7 @@ static int tcp_in_window(struct ip_ct_tc
 			"ip_ct_tcp: %s ",
 			before(end, sender->td_maxend + 1) ?
 			after(seq, sender->td_end - receiver->td_maxwin - 1) ?
-			before(ack, receiver->td_end + 1) ?
+			before(sack, receiver->td_end + 1) ?
 			after(ack, receiver->td_end - MAXACKWINDOW(sender)) ? "BUG"
 			: "ACK is under the lower bound (possibly overly delayed ACK)"
 			: "ACK is over the upper bound (ACKed data has never seen yet)"
@@ -846,7 +848,9 @@ static int tcp_packet(struct ip_conntrac
 
 	switch (new_state) {
 	case TCP_CONNTRACK_IGNORE:
-		/* Either SYN in ORIGINAL, or SYN/ACK in REPLY direction. */
+		/* Either SYN in ORIGINAL
+		 * or SYN/ACK in REPLY
+		 * or ACK in REPLY direction (half-open connection). */
 		if (index == TCP_SYNACK_SET
 		    && conntrack->proto.tcp.last_index == TCP_SYN_SET
 		    && conntrack->proto.tcp.last_dir != dir
@@ -875,7 +879,7 @@ static int tcp_packet(struct ip_conntrac
 		WRITE_UNLOCK(&tcp_lock);
 		if (LOG_INVALID(IPPROTO_TCP))
 			nf_log_packet(PF_INET, 0, skb, NULL, NULL,
-				  "ip_ct_tcp: invalid SYN (ignored) ");
+				  "ip_ct_tcp: invalid packet ignored ");
 		return NF_ACCEPT;
 	case TCP_CONNTRACK_MAX:
 		/* Invalid packet */
@@ -900,11 +904,12 @@ static int tcp_packet(struct ip_conntrac
 		break;
 	case TCP_CONNTRACK_CLOSE:
 		if (index == TCP_RST_SET
-		    && test_bit(IPS_SEEN_REPLY_BIT, &conntrack->status)
-		    && conntrack->proto.tcp.last_index <= TCP_SYNACK_SET
+		    && ((test_bit(IPS_SEEN_REPLY_BIT, &conntrack->status)
+		         && conntrack->proto.tcp.last_index <= TCP_SYNACK_SET)
+		        || conntrack->proto.tcp.last_index == TCP_ACK_SET)
 		    && after(ntohl(th->ack_seq),
 			     conntrack->proto.tcp.last_seq)) {
-			/* Ignore RST closing down invalid SYN
+			/* Ignore RST closing down invalid SYN or ACK
 			   we had let trough. */
 			WRITE_UNLOCK(&tcp_lock);
 			if (LOG_INVALID(IPPROTO_TCP))
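
For readers who want to see the effect of the one-entry table change in isolation, here is a minimal stand-alone sketch in C. It is not kernel code: the enum values, the verdict names and the classify_ack() helper are made up for illustration, and only the two table rows are copied from the hunk above. Judging by the comment added in tcp_packet(), the row applies to ACKs seen in the REPLY direction. The sketch only shows that an ACK arriving in state sSS used to map to sIV (invalid) and now maps to sIG, which tcp_packet() lets through with NF_ACCEPT.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative model of the states named in the table header of the patch.
 * The numeric values are assumptions of this sketch, not the kernel's. */
enum ct_state { sNO, sSS, sSR, sES, sFW, sCW, sLA, sTW, sCL, sLI, sIV, sIG };

#define N_REAL_STATES 10	/* sNO .. sLI index the table row */

/* ACK row before the patch (copied from the "-" line). */
static const enum ct_state ack_row_old[N_REAL_STATES] =
	{ sIV, sIV, sIV, sES, sCW, sCW, sTW, sTW, sCL, sIV };

/* ACK row after the patch (copied from the "+" line). */
static const enum ct_state ack_row_new[N_REAL_STATES] =
	{ sIV, sIG, sIV, sES, sCW, sCW, sTW, sTW, sCL, sIV };

/* Hypothetical verdicts, used only by this sketch. */
enum verdict { VERDICT_INVALID, VERDICT_IGNORE, VERDICT_TRACK };

/* Hypothetical helper: look up the next state for an ACK seen in state
 * `cur` and reduce it to one of the three verdicts above. */
static enum verdict classify_ack(const enum ct_state *row, enum ct_state cur)
{
	enum ct_state next;

	if ((size_t)cur >= N_REAL_STATES)
		return VERDICT_INVALID;	/* sIV/sIG are not table indices */

	next = row[cur];
	if (next == sIV)
		return VERDICT_INVALID;	/* packet is marked invalid */
	if (next == sIG)
		return VERDICT_IGNORE;	/* accepted, state left unchanged */
	return VERDICT_TRACK;		/* normal state transition */
}
```

With the old row, classify_ack(ack_row_old, sSS) yields VERDICT_INVALID, which is the case a firewall rule set would typically drop; with the new row it yields VERDICT_IGNORE, so the server's ACK reaches the client's TCP stack.
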
Subject: Keep netfilter from ignoring all RST if the previous packet was an ACK
From: Martin Josefsson
Patch-mainline: 2.6.11-rc1
References: SUSE50484

This is an incremental fix to netfilter-tcp-rst-ack-fix (#46818).

The change was that an RST is ignored if the previous packet was an ACK.
This happens all the time. I know it was intended as a fix for the
SYN - ACK probe - RST sequence but it breaks normal usage. The problem is
that connections that end with RST never get their state changed and are
left in ESTABLISHED state with a large timeout.

The patch below adds a check for
!test_bit(IPS_ASSURED_BIT, &conntrack->status), so your change will only
be active for unassured connections.

Acked-By: Karsten Keil <[EMAIL PROTECTED]>

--- linux-2.6.8-24.11/net/ipv4/netfilter/ip_conntrack_proto_tcp.c	2004-12-22 15:47:33.000000000 +0100
+++ linux/net/ipv4/netfilter/ip_conntrack_proto_tcp.c	2005-02-04 00:49:32.128760279 +0100
@@ -906,7 +906,8 @@
 		if (index == TCP_RST_SET
 		    && ((test_bit(IPS_SEEN_REPLY_BIT, &conntrack->status)
 		         && conntrack->proto.tcp.last_index <= TCP_SYNACK_SET)
-		        || conntrack->proto.tcp.last_index == TCP_ACK_SET)
+		        || (!test_bit(IPS_ASSURED_BIT, &conntrack->status)
+		            && conntrack->proto.tcp.last_index == TCP_ACK_SET))
 		    && after(ntohl(th->ack_seq),
 			     conntrack->proto.tcp.last_seq)) {
 			/* Ignore RST closing down invalid SYN or ACK
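
To make the difference between the two versions of the RST check easier to see, here is a small stand-alone sketch in C. It is a simplification, not the kernel code: the enum, the two helper functions and their boolean parameters are hypothetical stand-ins for the conntrack status bits and last_index, the enum ordering is an assumption chosen so that the `<= SYNACK` comparison behaves as in the patch, and the `index == TCP_RST_SET` and `after(...)` sequence-number conditions are left out because both versions share them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the packet-type index kept in last_index.
 * The ordering (SYN and SYN/ACK first) is an assumption of this sketch. */
enum pkt_index { IDX_SYN, IDX_SYNACK, IDX_FIN, IDX_ACK, IDX_RST, IDX_NONE };

/* Condition as introduced by the first patch: an RST is ignored whenever
 * the previous packet was an ACK, regardless of the connection status. */
static bool rst_ignored_v1(bool seen_reply, bool assured, enum pkt_index last)
{
	(void)assured;	/* not consulted by the first version */
	return (seen_reply && last <= IDX_SYNACK) || last == IDX_ACK;
}

/* Condition after the incremental fix: the "previous packet was an ACK"
 * case only applies while the connection is not yet assured. */
static bool rst_ignored_v2(bool seen_reply, bool assured, enum pkt_index last)
{
	return (seen_reply && last <= IDX_SYNACK)
	       || (!assured && last == IDX_ACK);
}
```

For an assured connection whose last packet was an ACK, rst_ignored_v1() returns true (the RST is ignored and the entry stays in ESTABLISHED), while rst_ignored_v2() returns false, so the RST closes the entry. For an unassured connection, such as the SYN - ACK probe - RST sequence, both versions return true.
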