On Tuesday, April 6, 2021 10:04 AM, Remi Gacogne wrote:
> On 4/6/21 4:18 PM, Nejedlo, Mark via Pdns-users wrote:
> > Would additional distributor threads really cause additional worker
> > CPU usage?
> 
> That could happen if they have to fight for the incoming socket. Do you
> have reuseport=yes in your configuration?

Either I'm deeply misunderstanding something (quite possible), or we may be 
talking about different things.  If I understand correctly, thundering herd 
problems should only show up on the distributor threads, but my distributors 
are not very busy.  It is the workers doing the actual DNS processing that show 
the high CPU, and I would think the distributors address the workers 
individually, not via a shared port (but maybe I'm wrong?)

Dropping the distributors to one, which I'm planning to do anyway, will 
eliminate the problem on the front end socket.  If the workers do share the 
connection to the distributors, adding reuseport isn't hard.

> > Does the maintenance function block the worker while it's running?
> 
> Yes.

That's unfortunate.  While I don't think timeouts are the source of my 
problems, I'll still have to think about how to address this just for general 
health of the service.

> >> I see that XPF is enabled between dnsdist and the recursor, which
> >> likely kills the recursor's packet cache. That might explain the bad
> >> performance results.
> >
> > Even with a short edns-subnet-whitelist?
> 
> I'm afraid so, yes, and since some of your responses depend on the
> client IP (EDNS Client Subnet is enabled for some domains) you can't
> really enable the packet cache in dnsdist, unless you know for sure that
> only a few domains are using EDNS Client Subnet, and that there is no
> CNAME to them from other domains. Then you could perhaps enable the
> packet cache in dnsdist and disable it for these domains only.

Sounds like I'll need to do some more detailed learning about the packet caches 
and how they interact.

> Do you really need XPF, by the way? You are passing the initial client IP in
> EDNS Client Subnet already, so that might be enough?

This is probably a misunderstanding on my part.  I was under the impression 
that useClientSubnet=true told dnsdist that it needed to pass the client IP, 
and addXPF/proxy protocol told it how to do so.  If I'm wrong, dropping XPF is 
easy enough.  Although, it sounds like I also want to drop useClientSubnet in 
favor of the proxy protocol.

> > Both 4.4/5 and proxy protocol were on my radar, but my priority was to
> > address the CPU usage.  If there are performance gains to be had in
> > upgrading, I can certainly do that.  Is 4.5GA likely to happen soon?
> 
> The proxy protocol adds a header outside of the DNS payload, so it would
> not kill your packet cache. If you get rid of EDNS Client Subnet and XPF
> between dnsdist and the recursor, you should get much better
> performance.
> Even if you need to keep EDNS Client Subnet between dnsdist and the
> recursor, you could then try enabling dnsdist's packet cache with the
> EDNS zero scope feature [1] which lets dnsdist know when it can cache an
> answer for all clients.

This probably goes back to my confusion on the previous point.  I need client 
IP aware responses, not specifically useClientSubnet between dnsdist and 
pdns_recursor.  Proxy protocol should be fine.
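
To make the packet cache point concrete, here is a toy Python model of why carrying the client IP inside the DNS payload (as XPF or EDNS Client Subnet do) defeats a payload-keyed cache, while a header outside the payload (as with the proxy protocol) does not. This is only a sketch under assumptions: the hashing scheme, the fake payload layout, and all function names are made up for illustration and are not how dnsdist or pdns_recursor actually compute cache keys.

```python
import hashlib
from typing import Iterable, Tuple


def cache_key(dns_payload: bytes) -> str:
    # Toy key (assumption): hash everything after the 2-byte query ID.
    return hashlib.sha256(dns_payload[2:]).hexdigest()


def with_in_payload_client_ip(query: bytes, client_ip: str) -> bytes:
    # Stand-in for XPF/ECS: the client address becomes part of the DNS message.
    return query + b"|client=" + client_ip.encode()


def with_outer_header(query: bytes, client_ip: str) -> Tuple[bytes, bytes]:
    # Stand-in for the proxy protocol: the client address travels in a
    # separate header, and the DNS payload itself is left untouched.
    return (b"PROXY " + client_ip.encode(), query)


def count_distinct_keys(payloads: Iterable[bytes]) -> int:
    # Number of different cache entries these payloads would need.
    return len({cache_key(p) for p in payloads})


# Fake query: 2-byte ID followed by a placeholder for the question section.
QUERY = b"\x12\x34" + b"example.com. IN A"
CLIENTS = ["192.0.2.1", "192.0.2.2", "198.51.100.7"]

in_payload_keys = count_distinct_keys(
    with_in_payload_client_ip(QUERY, ip) for ip in CLIENTS
)
outer_header_keys = count_distinct_keys(
    with_outer_header(QUERY, ip)[1] for ip in CLIENTS
)

print("distinct keys, client IP inside payload:", in_payload_keys)
print("distinct keys, client IP in outer header:", outer_header_keys)
```

In this model every client produces its own cache entry when the address is embedded in the message, but all clients share one entry when the address rides in a header that is stripped before the lookup.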

Thanks,
Mark 

