On 31/01/17 19:21, Stephen Hemminger wrote:
> On Tue, 31 Jan 2017 19:09:09 +0100
> Nikolay Aleksandrov <niko...@cumulusnetworks.com> wrote:
>
>> On 31/01/17 17:41, Nikolay Aleksandrov wrote:
>>>>
>>>> I agree with the first 3 patches, but not the last one.
>>>> Changing the API just for a performance hack is not necessary. Instead make
>>>> the algorithm smarter and use per-cpu values.
>>>>
>>>
>>> Thanks for the feedback, I would very much prefer any of the other two approaches
>>> I tried (per-cpu pool and per-cpu for each fdb), from the two the second one -
>>> per-cpu for each fdb is much simpler, so would it be acceptable to do per-cpu allocation
>>> for each fdb ?
>>>
>>
>> Okay, after some more testing the version with per-cpu per-fdb allocations, at 300 000 fdb entries
>> I got 120 failed per-cpu allocs which seems okay. I'll wait a little more and will repost the series
>> with per-cpu allocations and without the RFC tag.
>>
>> Thanks,
>> Nik
>>
>
> You could also use a mark/sweep algorithm (rather than recording updated).
> It turns out that clearing is fast (can be unlocked).
> The timer workqueue can mark all fdb entries (during scan), then in forward
> function clear the bit if it is set. This would turn writes into reads.

The wq doesn't have a strict next call; it floats depending on the soonest
expiry. This can cause issues, as we don't know when we last reset the bit,
and using the scan interval as the resolution will result in big offsets when
purging entries.

>
> To keep the API for last used, just change the resolution to be scan interval.
>

With the default 300 second resolution? People will be angry. :-)
Also, this has to happen for both "updated" and "used"; they're both causing
trouble. In fact "used" is much worse than "updated", because it's written to
by everyone who transmits to that fdb.

Actually, to start we can do something much simpler: just always update "used"
at most once per 1/10 of ageing_time, for example. The default case would give
us an update every 30 seconds if the fdb is actually used, or we can cap it at
10 seconds.
The "updated" field we move to its own cache line, and with proper config (bind
ports to CPUs) it will be fine.

What do you think?
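
To make the "used" proposal concrete, here is a minimal userspace sketch, not the actual bridge code. The struct, the function names and the tick unit are all made up for illustration; in the kernel the timestamps would be jiffies, and the comparison would more likely go through a helper such as time_after().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an fdb entry; not the kernel's struct. */
struct fdb_sketch {
	uint64_t used;	/* time of the last recorded use, in ticks */
};

/*
 * Minimum spacing between two writes to "used": 1/10 of ageing_time,
 * optionally capped. A cap of 0 means "no cap".
 */
static uint64_t used_interval(uint64_t ageing_time, uint64_t cap)
{
	uint64_t interval = ageing_time / 10;

	if (cap && interval > cap)
		interval = cap;
	return interval;
}

/*
 * Called on every transmit to this fdb. The common case is a read of
 * fdb->used only; the store happens at most once per interval, so the
 * cache line is dirtied far less often.
 * Returns true if "used" was actually written.
 */
static bool fdb_touch_used(struct fdb_sketch *fdb, uint64_t now,
			   uint64_t ageing_time, uint64_t cap)
{
	if (now - fdb->used < used_interval(ageing_time, cap))
		return false;

	fdb->used = now;
	return true;
}
```

With ticks in milliseconds and the default ageing_time of 300 seconds, used_interval(300000, 0) gives 30000 (one write per 30 seconds at most), and passing a cap of 10000 brings that down to 10 seconds. The price is that the reported "used" value can lag the real last use by up to one interval.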