On 6/11/20 6:32 PM, Yi Yang (杨燚)-云服务集团 wrote:
> David, thank you so much for confirming it can't, I did read your cumulus
> document before, resilient hashing is ok for next hop remove, but it still
> has the same issue there if add new next hop. I know most of kernel code in
> Cumulus Linux has been in upstream kernel, I'm wondering why you didn't push
> resilient hashing to upstream kernel.
>
> I think consistent hashing is must-have for a commercial load balancing
> solution, otherwise it is basically nonsense, do you Cumulus Linux have
> consistent hashing solution?
>
> Is "- replacing nexthop entries as LB's come and go" the stuff
> https://docs.cumulusnetworks.com/cumulus-linux/Layer-3/Equal-Cost-Multipath-Load-Sharing-Hardware-ECMP/#resilient-hashing
> is showing? It can't ensure the flow is distributed to the right backend
> server if a new next hop is added.

I do not believe it is a problem to be solved in the kernel.

If you follow the *intent* of the Cumulus document: what is the maximum number of load balancers you expect to have? 16? 32? 64? Define an ECMP route with that number of nexthops and fill in the weighting that meets your needs. When an LB is added or removed, you decide what the new set of paths is that maintains N-total paths with the distribution that meets your needs.

I just sent patches for active-backup nexthops that allow an automatic fallback when one is removed, to address the redistribution problem, but it still requires userspace to decide what the active-backup pairs are as well as the maximum number of paths.
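
To make the userspace side concrete, here is a minimal sketch of one way to keep N total paths while an LB comes or goes. It is an illustration only, not anything from the kernel or from Cumulus: the function names (`build_slots`, `remove_lb`, `add_lb`), the round-robin initial fill, and the "take from the most loaded LB" policy are all assumptions. The list it produces stands for the N nexthop entries of the ECMP route; how that list is pushed to the kernel is left out.

```python
from collections import Counter


def build_slots(lbs, n_slots):
    """Fill a fixed-size table of n_slots paths round-robin over the LBs."""
    if not lbs:
        raise ValueError("need at least one load balancer")
    return [lbs[i % len(lbs)] for i in range(n_slots)]


def remove_lb(slots, dead):
    """Reassign only the slots owned by `dead`; all other slots stay put."""
    counts = Counter(s for s in slots if s != dead)
    if not counts:
        raise ValueError("no surviving load balancers")
    out = list(slots)
    for i, owner in enumerate(out):
        if owner == dead:
            # Hypothetical policy: hand the slot to the least-loaded survivor.
            target = min(counts, key=lambda lb: (counts[lb], lb))
            out[i] = target
            counts[target] += 1
    return out


def add_lb(slots, new):
    """Give `new` an equal share by taking slots from the most-loaded LBs.

    Flows hashed to the slots that change owner do move to the new LB;
    every other slot is untouched.
    """
    counts = Counter(slots)
    if new in counts:
        raise ValueError("load balancer already present")
    share = len(slots) // (len(counts) + 1)
    out = list(slots)
    for _ in range(share):
        donor = max(counts, key=lambda lb: (counts[lb], lb))
        out[out.index(donor)] = new
        counts[donor] -= 1
    return out


if __name__ == "__main__":
    table = build_slots(["lb1", "lb2", "lb3", "lb4"], 16)
    print(remove_lb(table, "lb2"))
    print(add_lb(table, "lb5"))
```

The table size never changes, so the hash-to-slot mapping stays stable and only the slots that change owner see their flows move. Removal is fully contained to the dead LB's slots. Addition still disturbs the flows in the reassigned slots, which is the case that needs a policy decision (or connection tracking on the LBs) in userspace.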