On Tue, Nov 14, 2017 at 10:24 AM, Shaohua Li <s...@kernel.org> wrote:
> On Wed, Nov 08, 2017 at 09:44:51AM -0800, Tom Herbert wrote:
>> On Fri, Aug 18, 2017 at 3:27 PM, David Miller <da...@davemloft.net> wrote:
>> > From: Martin KaFai Lau <ka...@fb.com>
>> > Date: Fri, 18 Aug 2017 13:51:36 -0700
>> >
>> >> It seems like that middle box specifically drops TCP_RST if it
>> >> does not know anything about this flow.  Since the flowlabel of the 
>> >> TCP_RST
>> >> (sent in TW state) is always different, it always lands to a different 
>> >> middle
>> >> box.  All of these TCP_RST cannot be delivered.
>> >
>> > This really is illegal behavior.  The flow label is not a flow _KEY_
>> > by any definition whatsoever.
>> >
>> > Flow labels are an optimization, not a determinant for flow matching
>> > particularly for proper TCP state processing.
>> >
>> > I'd rather you invest all of this energy getting that vendor to fix
>> > their kit.
>> >
>> We're now seeing several router vendors recommending people to not use
>> flow labels for ECMP hashing. This is precisely because when a flow
>> label changes, network devices that maintain state (firewalls, NAT,
>> load balancers) can't deal with packets being rerouted so connections
>> are dropped. Unfortunately, the need for packets of a flow to always
>> follow the same path has become an implicit requirement that I think
>> we need follow at least as the default behavior.
>>
>> Martin: is there any change you could resurrect these patches? In
>> order to solve the general problem of making routing consistent, I
>> believe we want to keep sk_tx_hash consistent for the connection from
>> which a consistent flow label can be derived. To avoid the overhead of
>> a hash field in sk_common, maybe we could initially set a connection
>> hash to a five-tuple hash for a flow instead of a random value? So in
>> TW state the consistent hash can be computed on the fly.
>
> Hi Tom,
> Do we really need to use the five-tupe hash? There are several places using
> current random hash, which looks more lightweight. To fix issue, we only need
> to make sure reset packet include the correct flowlabel. Like what my previous
> patch did, we can set tw->tw_flowlabel in tcp_time_wait based on txhash and 
> use
> it reset packet. In this way we can use the random hash and not add extra 
> field
> in sock.
>
Shaohua,

But that patch discards the full txhash in TW. So it's not just a
problem with the flow label. sk_tx_hash can also be used for route
selection in ECMP, port selection we're doing tunneling, etc. The
general solution should maintains tx_hash or be able to reconstruct it
in any state, flow label fix is a point solution.

Thanks,
Tom

> Thanks,
> Shaohua

Reply via email to