> On Aug 26, 2015, at 9:57 PM, roopa <ro...@cumulusnetworks.com> wrote:
> 
> On 8/26/15, 4:33 AM, Nikolay Aleksandrov wrote:
>>> On Aug 25, 2015, at 11:06 PM, David Miller <da...@davemloft.net> wrote:
>>> 
>>> From: Nikolay Aleksandrov <niko...@cumulusnetworks.com>
>>> Date: Tue, 25 Aug 2015 22:28:16 -0700
>>> 
>>>> Certainly, that should be done and I will look into it, but the
>>>> essence of this patch is a bit different. The problem here is not
>>>> the size of the fdb entries, it’s more the number of them - having
>>>> 96000 entries (even if they were 1 byte ones) is just way too much
>>>> especially when the fdb hash size is small and static. We could work
>>>> on making it dynamic though, but still these type of local entries
>>>> per vlan per port can easily be avoided with this option.
>>> 96000 bits can be stored in 12k.  Get where I'm going with this?
>>> 
>>> Look at the problem sideways.
>> Oh okay, I misunderstood your previous comment. I’ll look into that.
>> 
> I just wanted to add the other problems we have had with keeping these macs 
> (mostly from userspace POV):
> - add/del netlink notification storms
> - and large netlink dumps
> 
> In addition to in-kernel optimizations, will be nice to have a solution that 
> reduces the burden on userspace. That will need a newer netlink dump format 
> for fdbs. Considering all the changes needed, Nikolays patch seems less 
> intrusive.

Right, we need to take these into account as well. I’ll continue the discussion 
on this (or restart it) because
I looked into using a bitmap for the local entries only and while it fixes the 
scalability issue, it presents
a few new ones which are mostly related to the fact that these entries now 
exist only without a vlan
and if a new mac comes along which matches one of these but is in a vlan, the 
entry will get created
in br_fdb_update() unless we add a second lookup, but that will slow down the 
learning path.
Also this change requires an update of every fdb function that uses the vid as 
a key (every fdb function?!)
because now we can have the mac in two places instead of one which is a pretty 
big churn with lots
of conditionals all over the place and I don’t like it. Adding this complexity 
for the local addresses only
seems like an overkill, so I think to drop this issue for now.
This patch (that works around the initial problem) also has these issues.
Note that one way to take care of this in a more straight-forward way would be 
to have each entry
with some sort of a bitmap (like Vlad has tried earlier) and then we can 
combine the paths so most
of these issues disappear, but that will not be easy as was already commented 
earlier. I’ve looked
briefly into doing this with rhashtable so we can keep the memory footprint for 
each entry relatively
small but it still affects the performance and we can have thousands of resizes 
happening. 

On the notification side if we can fix that, we can actually delete the 96000 
entries without creating a
huge notification storm and do a user-land workaround of the original issue, so 
I’ll look into that next.

Any comments or ideas are very welcome.

Thank you,
 Nik

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to