On 22/02/2019 14:50, wenxu wrote: > On 2019/2/22 下午6:26, Nikolay Aleksandrov wrote: >> On 22/02/2019 11:20, we...@ucloud.cn wrote: >>> From: wenxu <we...@ucloud.cn> >>> >>> Current fib_multipath_hash_policy can make hash based on the L3 or >>> L4. But it only work on the outer IP. So a specific tunnel always >>> has the same hash value. But a specific tunnel may contain so many >>> inner connections. However there is no good way for tunnel packet. >>> A specific tunnel route based on the percpu dst_cache. It will not >>> lookup route table for each packet. >>> >>> This patch provide a based cpu id hash policy. The different >>> connection run on different cpu and there will be different hash >>> value for percpu dst_cache. >>> >>> Signed-off-by: wenxu <we...@ucloud.cn> >>> --- >>> net/ipv4/route.c | 6 ++++++ >>> net/ipv4/sysctl_net_ipv4.c | 2 +- >>> 2 files changed, 7 insertions(+), 1 deletion(-) >>> >> Hi, >> When we had the same issue in the bonding, we added a new mode which used >> the flow dissector to get the inner headers and hash on them. >> I believe that is even easier nowadays, but I think people wanted to >> use different hash algorithms as well and last we discussed this we >> were talking about yet another bpf use to generate the hash. >> Now we got bpf_set_hash() which can be used to achieve this from various >> places, if you use that with hash_policy=1 (L4) then skb->hash will >> be used and you can achieve any hashing that you desire. >> >> Cheers, >> Nik >> >> > Yes, we can set the skb->hash. But ip tunnel lookup the route table without > skb. > > The most important for performance issue most tunnel send work with dst_cache. > > The dst_cache is percpu cache. So the number of dst_entry of a specific > tunnel is the cpu cores . > > So It's more better hash based on outer and cpu id. > >
I was just hoping for a more generic solution, anyway if this would go forward please add documentation for the new option in Documentation/networking/ip-sysctl.txt there's a description of fib_multipath_hash_policy. Also how is this supposed to work if skb is present & skb->hash is set ? I believe it will be ignored, so it really only works for lookups without an skb (or without a generated hash). A thought (unchecked/untested, may not make much sense, feel free to ignore): Why don't you make it mix flowi4_mark in with the hash and add some tunnel option which mixes the CPU in that field ? That is the new '2' mode will mix flowi4_mark with the hash and if some specific tunnel option is set, the tunnel code will mix smp_current_id() with fl4's mark before calling ip_route_output(), this sounds more useful to me as users will be able to affect the hash calculation via the mark from other places in different ways. Cheers, Nik