On 03/02/17 23:24, Stephen Hemminger wrote: > On Fri, 3 Feb 2017 19:34:19 +0100 > Nikolay Aleksandrov <niko...@cumulusnetworks.com> wrote: > >> On 03/02/17 19:28, Stephen Hemminger wrote: >>> On Fri, 3 Feb 2017 09:30:37 +0100 >>> Nikolay Aleksandrov <niko...@cumulusnetworks.com> wrote: >>> >>>> On 03/02/17 03:47, David Miller wrote: >>>>> From: Nikolay Aleksandrov <niko...@cumulusnetworks.com> >>>>> Date: Tue, 31 Jan 2017 16:31:58 +0100 >>>>> >>>>>> @@ -197,7 +197,8 @@ int br_handle_frame_finish(struct net *net, struct >>>>>> sock *sk, struct sk_buff *skb >>>>>> if (dst->is_local) >>>>>> return br_pass_frame_up(skb); >>>>>> >>>>>> - dst->used = jiffies; >>>>>> + if (br->used_enabled) >>>>>> + dst->used = jiffies; >>>>> >>>>> Have you tried: >>>>> >>>>> if (dst->used != jiffies) >>>>> dst->used = jiffies; >>>>> >>>>> If that isn't effective, you can tweak the test to decrease the >>>>> granularity of the value. Basically, if dst->used is within >>>>> 1 HZ of jiffies, don't do the write. >>>>> >>>>> I suspect this might help a lot, and not require a new bridging >>>>> option. >>>>> >>>> >>>> Yes, I actually have a patch titled "used granularity". :-) I've tested >>>> with different >>>> values and it does help but it either needs to be paired with another >>>> similar test for >>>> the "updated" field (since they share a write-heavy cache line) or they >>>> need to be >>>> in separate cache lines to avoid that dst's source port from causing the >>>> load HitM for >>>> all who check the value. >>>> >>>> I'll run some more tests and probably go this way for now. >>>> >>>> Thanks, >>>> Nik >>>> >>> >>> Since used doesn't need HZ granularity, it reports values in clock_t >>> resolution so >>> storing (and doing cmp and set would mean that it would only be 100 HZ >>> >> >> Yes, exactly what I'm currently testing. Will post the new set soon. >> Since HZ can be different a generic way to obtain the granularity for >> both should be clock_t_to_jiffies(1) if I'm not missing something. >> >> > > USER_HZ is set by userspace ABI to 100 hz. HZ is configurable when kernel is > built. >
Yes, the point I was trying to make is that we want to take the number of jiffies we can skip by converting 1 clock_t to X jiffies because the user-space granularity is clock_t and HZ can change, thus clock_t_to_jiffies(1) should give us the number of updates we can skip for "used" and "updated". By "both" I meant "used" and "updated" fields, not HZ and USER_HZ.