Re: desync: scheduling fib reload

Robert Blacquiere Mon, 30 Oct 2017 05:02:19 -0700

Hi Theo,

On Sun, Oct 29, 2017 at 11:45:54AM -0600, Theo de Raadt wrote:
> 
> Yes, on the route socket.  It is unreasonable for the kernel to
> maintain an infinite number of route change messages, so about 9 years
> ago we developed this scheme of marking the situation for userland to
> handle.  Such a mechanism didn't exist before, because noone had run
> into the concern before -- people weren't turning *BSD systems into
> full-table/high-churn routing systems before our daemons came along.


Thanks for explaining. 
> 
> > We have changed default sysctl settings for: 
> > kern.maxcluster=24576 
> > net.inet.ip.ifq.maxlen=4096
> > net.inet6.ip6.ifq.maxlen=1024
> > 
> > as from netstat -m  we ran out of 2048 mbufs at defaults. 
> 
> Come on, think for a second.  See "ip" and "ip6"?  That doesn't grow
> the queue on the routing socket.  If anything it probably makes
> your situation worse.

The ip and ip6 queue lengths were the first things I changed, to reduce
drops on the interfaces. That worked; we now have no dropped traffic. And
yes, I know that does not help with the OSPF issue.

> 
> As for growing the size of the route socket buffer -- it is unclear
> whether that won't make the situation worse.  When a desync is
> detected in userland, you will already have read and consumed the full
> queue -- which now has a gap in it, and requires a fresh restart.  So
> you are promising to do MORE wasteful work before recovering.
> 
> Anyways, there are two circumstances where it happens: route buffer limits,
> or temporary mbuf shortage.  I think you've hit the latter.
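
To check my understanding of the point about buffer size, here is a toy model in Python. It is only a sketch and not how the kernel implements it: the RouteQueue class, the DESYNC marker and work_before_desync() are hypothetical names, and I am assuming the reader only sees the desync marker after draining whatever was queued before the overflow.

```python
from collections import deque

# Hypothetical marker standing in for the kernel's desync notification.
DESYNC = "DESYNC"


class RouteQueue:
    """Toy bounded queue: drops on overflow and remembers that it did."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
        self.overflowed = False

    def push(self, msg):
        """Queue msg, or drop it and record the overflow if the queue is full."""
        if len(self.items) >= self.capacity:
            self.overflowed = True  # the message is lost; only the fact is kept
            return False
        self.items.append(msg)
        return True

    def pop(self):
        """Return the next message, DESYNC once after an overflow, or None."""
        if self.items:
            return self.items.popleft()
        if self.overflowed:
            self.overflowed = False
            return DESYNC
        return None


def work_before_desync(capacity, burst):
    """Push `burst` messages, then read until empty or DESYNC.

    Returns (messages consumed, whether a full reload is needed).
    """
    q = RouteQueue(capacity)
    for i in range(burst):
        q.push(i)
    consumed = 0
    while True:
        msg = q.pop()
        if msg is None:
            return consumed, False
        if msg == DESYNC:
            return consumed, True
        consumed += 1
```

In this model a burst of 10 messages against a capacity of 4 makes the reader process 4 messages and then reload anyway; with a capacity of 8 it processes 8 messages before the same reload. Only a buffer large enough for the whole burst avoids the reload.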

How can I fix this temporary mbuf shortage? I have been searching for a
way to detect it. This is the current netstat -m output:

$ netstat -m       
956 mbufs in use:
        933 mbufs allocated to data
        14 mbufs allocated to packet headers
        9 mbufs allocated to socket names and addresses
930/13264/24576 mbuf 2048 byte clusters in use (current/peak/max)
0/8/24576 mbuf 4096 byte clusters in use (current/peak/max)
0/8/24576 mbuf 8192 byte clusters in use (current/peak/max)
0/14/24584 mbuf 9216 byte clusters in use (current/peak/max)
0/10/24580 mbuf 12288 byte clusters in use (current/peak/max)
0/8/24576 mbuf 16384 byte clusters in use (current/peak/max)
0/8/24576 mbuf 65536 byte clusters in use (current/peak/max)
3768 Kbytes allocated to network (55% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines
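
To compare peak against max per pool without reading the numbers by eye, a small Python sketch like the following could parse that output. It assumes the line format shown above, which may differ between releases; summarize() and SAMPLE are names made up for this example, and SAMPLE only holds a few lines copied from the output above.

```python
import re

# Matches lines such as "930/13264/24576 mbuf 2048 byte clusters in use ...".
CLUSTER_RE = re.compile(
    r"^\s*(\d+)/(\d+)/(\d+) mbuf (\d+) byte clusters in use", re.MULTILINE
)
# Matches the three counters printed at the end of the output.
COUNTER_RE = re.compile(
    r"^\s*(\d+) (requests for memory denied"
    r"|requests for memory delayed"
    r"|calls to protocol drain routines)",
    re.MULTILINE,
)

# A few lines copied from the netstat -m output in this mail.
SAMPLE = """\
930/13264/24576 mbuf 2048 byte clusters in use (current/peak/max)
0/14/24584 mbuf 9216 byte clusters in use (current/peak/max)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines
"""


def summarize(text):
    """Return (pools, counters) parsed from netstat -m style text.

    pools maps cluster size in bytes to current/peak/max and peak as a
    percentage of max; counters maps each counter label to its value.
    """
    pools = {}
    for cur, peak, limit, size in CLUSTER_RE.findall(text):
        pools[int(size)] = {
            "current": int(cur),
            "peak": int(peak),
            "max": int(limit),
            "peak_pct": 100.0 * int(peak) / int(limit),
        }
    counters = {name: int(count) for count, name in COUNTER_RE.findall(text)}
    return pools, counters


if __name__ == "__main__":
    pools, counters = summarize(SAMPLE)
    for size, p in sorted(pools.items()):
        print(f"{size:>6} bytes: peak {p['peak']}/{p['max']} ({p['peak_pct']:.1f}%)")
    for name, count in counters.items():
        print(f"{count} {name}")
```

On a live system the text could instead be captured from the netstat -m command, for example with subprocess.run(); I have not included that here so the sketch also runs on a machine without that output format.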


We hit the maximum for 2048-byte mbuf clusters at the defaults, so I
bumped kern.maxcluster.

Does anybody know how to attack this issue? I have been searching for the
correct way to debug this potential mbuf shortage, but apparently I went
the wrong way about fixing it.

Regards

Robert
