On Tue, Nov 14, 2017 at 11:13:10AM -0800, Tom Herbert wrote: > On Tue, Nov 14, 2017 at 10:24 AM, Shaohua Li <s...@kernel.org> wrote: > > On Wed, Nov 08, 2017 at 09:44:51AM -0800, Tom Herbert wrote: > >> On Fri, Aug 18, 2017 at 3:27 PM, David Miller <da...@davemloft.net> wrote: > >> > From: Martin KaFai Lau <ka...@fb.com> > >> > Date: Fri, 18 Aug 2017 13:51:36 -0700 > >> > > >> >> It seems like that middle box specifically drops TCP_RST if it > >> >> does not know anything about this flow. Since the flowlabel of the > >> >> TCP_RST > >> >> (sent in TW state) is always different, it always lands to a different > >> >> middle > >> >> box. All of these TCP_RST cannot be delivered. > >> > > >> > This really is illegal behavior. The flow label is not a flow _KEY_ > >> > by any definition whatsoever. > >> > > >> > Flow labels are an optimization, not a determinant for flow matching > >> > particularly for proper TCP state processing. > >> > > >> > I'd rather you invest all of this energy getting that vendor to fix > >> > their kit. > >> > > >> We're now seeing several router vendors recommending people to not use > >> flow labels for ECMP hashing. This is precisely because when a flow > >> label changes, network devices that maintain state (firewalls, NAT, > >> load balancers) can't deal with packets being rerouted so connections > >> are dropped. Unfortunately, the need for packets of a flow to always > >> follow the same path has become an implicit requirement that I think > >> we need follow at least as the default behavior. > >> > >> Martin: is there any change you could resurrect these patches? In > >> order to solve the general problem of making routing consistent, I > >> believe we want to keep sk_tx_hash consistent for the connection from > >> which a consistent flow label can be derived. To avoid the overhead of > >> a hash field in sk_common, maybe we could initially set a connection > >> hash to a five-tuple hash for a flow instead of a random value? So in > >> TW state the consistent hash can be computed on the fly. > > > > Hi Tom, > > Do we really need to use the five-tupe hash? There are several places using > > current random hash, which looks more lightweight. To fix issue, we only > > need > > to make sure reset packet include the correct flowlabel. Like what my > > previous > > patch did, we can set tw->tw_flowlabel in tcp_time_wait based on txhash and > > use > > it reset packet. In this way we can use the random hash and not add extra > > field > > in sock. > > > Shaohua, > > But that patch discards the full txhash in TW. So it's not just a > problem with the flow label. sk_tx_hash can also be used for route > selection in ECMP, port selection we're doing tunneling, etc. The > general solution should maintains tx_hash or be able to reconstruct it > in any state, flow label fix is a point solution.
Hi Tom, do you want to keep sk_rethink_txhash() then? If we changed the hash to random number, we can't reconstruct it for sure. Thanks, Shaohua