On 2018-06-05, Stuart Longland <[email protected]> wrote:
> On 05/06/18 06:46, Stuart Henderson wrote:
>> On 2018-06-04, Stuart Longland <[email protected]> wrote:
>>> My thinking, since the problem has disappeared, is that the sheer number
>>> of clients was overwhelming the router, and as a result, it didn't have
>>> enough buffer space to handle the number of separate hosts requesting
>>> the time from it.
>>
>> Oh! It might have been PF state table exhaustion. By default a maximum
>> of 10000 states are allowed (can be overridden with a different value in
>> pf.conf).
>>
>> Has it been rebooted since the last time you saw the problem? If not,
>> pfctl -si might still have some clues in the counters.
>
> Unfortunately yes, a few times. Is there a maximum limit on the number
> of states? I later found that option and bumped it to 40000, but I'm
> not certain on what the maximum is.
>
> I'm guessing it'll be a function how big a "state" is, and how much
> memory I'm willing to dedicate to `pf`. This machine isn't doing much
> else but routing, so I can afford to throw quite a bit of memory (and
> CPU) at it.

You can check how much memory a state uses: see the pfstate line in
"systat pool" (you can cursor-down through the list). That at least gives
you an idea of what fits in system memory. I think there are some limits
to kernel memory use beyond the actual amount of memory in the system, but
I'm not sure of the details (and I think it varies between machine arches).

It might not make sense to keep state for this NTP traffic anyway. You
could try this near the top of the ruleset:

pass in quick proto udp to self port ntp no state
pass out quick proto udp from self port ntp no state

This would be a CPU vs memory tradeoff. (For each packet it's doing a
state lookup first anyway, then passes through the ruleset, so "no state"
would increase CPU use a bit; putting it near the top and using "quick"
keeps this as low as possible.)
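
For reference, a minimal sketch of how the state limit is set in pf.conf.
The 40000 is only the figure you mentioned, not a recommendation, and I'm
assuming the ruleset lives in the default /etc/pf.conf:

    # raise the hard limit on state table entries (example value only)
    set limit states 40000

After reloading with "pfctl -f /etc/pf.conf", "pfctl -sm" should show the
limits in effect and "pfctl -si" the current number of state table entries.
For a rough upper bound on memory, multiply the pfstate item size from
"systat pool" by the limit.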

