I configured unbound running on my LAN router (alix) to forward all requests over TLS to a bunch of places with:

forward-zone:
    name: "."
    forward-tls-upstream: yes
    # Quad9
    forward-addr: 2620:fe::fe@853#dns.quad9.net
    forward-addr: 9.9.9.9@853#dns.quad9.net
    forward-addr: 2620:fe::9@853#dns.quad9.net
    forward-addr: 149.112.112.112@853#dns.quad9.net
    # Cloudflare DNS
    forward-addr: 2606:4700:4700::1111@853#cloudflare-dns.com
    forward-addr: 1.1.1.1@853#cloudflare-dns.com
    forward-addr: 2606:4700:4700::1001@853#cloudflare-dns.com
    forward-addr: 1.0.0.1@853#cloudflare-dns.com
    # Google DNS
    forward-addr: 8.8.8.8@853#dns.google
    forward-addr: 8.8.4.4@853#dns.google
    forward-addr: 2001:4860:4860::8888@853#dns.google
    forward-addr: 2001:4860:4860::8844@853#dns.google

This setup worked for months until recently becoming temperamental. It would run for a few hours:

Mar 8 23:17:00 alix unbound: [40947:0] info: start of service (unbound 1.13.0).

... until starting to fail with:

Mar 9 00:53:18 alix unbound: [40947:0] error: SERVFAIL <wirelessdevicestats.googleapis.com. AAAA IN>: all the configured stub or forward servers failed, at zone .

... which goes on for hours until I wake my desktop machine (lenny). I know this doesn't sound related, but I observed this multiple times and see no better correlation. The two machines are connected to the two jacks on the cable modem and see each other on the same ethernet network. alix serves as the DHCP server.

Mar 8 23:39:50 lenny slaacd[71358]: sendmsg: Device not configured
Mar 8 23:39:50 lenny /bsd: ugen1 detached
Mar 8 23:39:51 lenny /bsd: ugen2 detached
Mar 9 06:50:17 lenny /bsd: uhub0 detached
Mar 9 06:50:17 lenny /bsd: uhub1 detached
Mar 9 06:50:17 lenny /bsd: uhub0 at usb0 configuration 1 interface 0 "Intel xHCI root hub" rev 3.00/1.00 addr 1
Mar 9 06:50:17 lenny /bsd: iwx0: acquiring device failed
Mar 9 06:50:17 lenny /bsd: uhub1 at usb1 configuration 1 interface 0 "Intel xHCI root hub" rev 3.00/1.00 addr 1
Mar 9 06:50:17 lenny /bsd: uhub2 at uhub0 port 2 configuration 1 interface 0 "Terminus Technology USB 2.0 Hub" rev 2.00/1.11 addr 2
Mar 9 06:50:18 lenny /bsd: ure0 at uhub2 port 1 configuration 1 interface 0 "CMI USB 10/100/1000 LAN" rev 2.10/31.00 addr 3
Mar 9 06:50:18 lenny /bsd: ure0: RTL8153B (0x6010), address 70:88:6b:8b:df:c2
Mar 9 06:50:18 lenny /bsd: rgephy0 at ure0 phy 0: RTL8251 PHY, rev. 0
Mar 9 06:50:19 lenny apmd: system resumed from sleep

At which point unbound recovers:

Mar 9 06:50:10 alix ntpd[94516]: DNS lookup tempfail
Mar 9 06:50:17 alix unbound: [40947:0] error: SERVFAIL <www.gstatic.com. A IN>: all the configured stub or forward servers failed, at zone .
Mar 9 06:50:17 alix unbound: [40947:0] error: SERVFAIL <connectivitycheck.gstatic.com. A IN>: all the configured stub or forward servers failed, at zone .
Mar 9 06:50:18 alix unbound: [40947:0] error: SERVFAIL <home-devices.googleapis.com. A IN>: all the configured stub or forward servers failed, at zone .
Mar 9 06:50:18 alix unbound: [40947:0] error: SERVFAIL <home-devices.googleapis.com. AAAA IN>: all the configured stub or forward servers failed, at zone .

This is the first sign of unbound recovery; these lookups were previously failing with the "DNS lookup tempfail" shown above:

Mar 9 06:50:42 alix ntpd[94516]: peer 5.79.108.34 now valid
Mar 9 06:50:44 alix ntpd[94516]: peer 188.40.62.55 now valid
Mar 9 06:50:46 alix ntpd[94516]: peer 185.41.243.43 now valid
Mar 9 06:50:47 alix ntpd[94516]: peer 85.25.68.31 now valid
Mar 9 06:51:00 alix ntpd[94516]: peer 2606:4700:f1::1 now valid

... note the delay; the first logged interaction from lenny appears only here, because I ran dhclient manually:

Mar 9 06:51:39 alix dhcpd[56370]: DHCPREQUEST for 10.1.10.128 from 70:88:6b:8b:df:c2 via vr1
Mar 9 06:51:39 alix dhcpd[56370]: DHCPACK on 10.1.10.128 to 70:88:6b:8b:df:c2 via vr1
Mar 9 06:51:39 alix dhcpd[56370]: DHCPREQUEST for 10.1.10.128 from 70:88:6b:8b:df:c2 via vr1
Mar 9 06:51:39 alix dhcpd[56370]: DHCPACK on 10.1.10.128 to 70:88:6b:8b:df:c2 via vr1
Mar 9 06:52:09 alix ntpd[94516]: clock is now synced
Mar 9 06:52:10 alix ntpd[94516]: constraint reply from 172.217.5.100: offset -0.480105
Mar 9 06:52:10 alix ntpd[94516]: constraint reply from 9.9.9.9: offset -0.809646
Mar 9 06:52:11 alix ntpd[94516]: constraint reply from 2607:f8b0:4005:80a::2004: offset -0.080415
Mar 9 06:52:11 alix ntpd[94516]: constraint reply from 2620:fe::fe: offset -0.146772

This first started with unbound 1.11 running on 6.8-stable. I noticed there's persistent TLS connection support in unbound 1.13, so I upgraded to a snapshot yesterday. The problem remained, suggesting something else is to blame.

I'm looking for clues about debugging unbound. I am attaching my minimally sanitized unbound.conf.

Thanks
Greg

server:
    interface: 10.1.10.53
    do-ip6: no

    access-control: 0.0.0.0/0 refuse
    access-control: 127.0.0.0/8 allow
    access-control: 10.1.10.0/24 allow

    hide-identity: yes
    hide-version: yes

    tls-cert-bundle: "/etc/ssl/cert.pem"

    # Weird place Verizon phones go to.
    local-zone: "wo.vzwwo.com" refuse

    # Full of 'local-zone: "..." refuse'
    include: /var/unbound/db/unbound-adhosts.conf

    # A few more 'local-zone: "..." refuse'
    include: /var/unbound/db/unbound-games.conf

    use-syslog: yes
    log-queries: no
    log-servfail: yes

remote-control:
    control-enable: yes
    control-interface: /var/run/unbound.sock

forward-zone:
    name: "."
    forward-tls-upstream: yes
    # Quad9
    forward-addr: 2620:fe::fe@853#dns.quad9.net
    forward-addr: 9.9.9.9@853#dns.quad9.net
    forward-addr: 2620:fe::9@853#dns.quad9.net
    forward-addr: 149.112.112.112@853#dns.quad9.net
    # Cloudflare DNS
    forward-addr: 2606:4700:4700::1111@853#cloudflare-dns.com
    forward-addr: 1.1.1.1@853#cloudflare-dns.com
    forward-addr: 2606:4700:4700::1001@853#cloudflare-dns.com
    forward-addr: 1.0.0.1@853#cloudflare-dns.com
    # Google DNS
    forward-addr: 8.8.8.8@853#dns.google
    forward-addr: 8.8.4.4@853#dns.google
    forward-addr: 2001:4860:4860::8888@853#dns.google
    forward-addr: 2001:4860:4860::8844@853#dns.google
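
P.S. To put timestamps on the suspected correlation between the SERVFAIL window on alix and lenny's suspend/resume, the two syslog files could be scanned with something like the sketch below. It assumes the classic BSD syslog line layout seen in the excerpts above. The helper names (parse, find_events) and the year argument are hypothetical: syslog lines carry no year, so the default of 2021 is only a placeholder. Nothing here is part of unbound or syslogd.

```python
import re
from datetime import datetime

# Classic BSD syslog layout: "Mon D HH:MM:SS host message".
# Example: "Mar 9 06:50:19 lenny apmd: system resumed from sleep"
LINE_RE = re.compile(
    r"^(?P<mon>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+(?P<msg>.*)$"
)


def parse(line, year=2021):
    """Split one syslog line into (timestamp, host, message).

    Returns None for lines that do not match the expected layout.
    The year is an assumption supplied by the caller.
    """
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    stamp = datetime.strptime(
        f"{year} {m['mon']} {m['day']} {m['time']}", "%Y %b %d %H:%M:%S"
    )
    return stamp, m["host"], m["msg"]


def find_events(lines, host, needle, year=2021):
    """Return sorted timestamps of lines from `host` whose message contains `needle`."""
    stamps = []
    for line in lines:
        parsed = parse(line, year)
        if parsed is None:
            continue
        stamp, line_host, msg = parsed
        if line_host == host and needle in msg:
            stamps.append(stamp)
    return sorted(stamps)


if __name__ == "__main__":
    # Three lines copied from the excerpts above; in practice, read the
    # merged contents of both machines' log files instead.
    sample = [
        "Mar 9 00:53:18 alix unbound: [40947:0] error: SERVFAIL "
        "<wirelessdevicestats.googleapis.com. AAAA IN>: all the configured "
        "stub or forward servers failed, at zone .",
        "Mar 9 06:50:18 alix unbound: [40947:0] error: SERVFAIL "
        "<home-devices.googleapis.com. AAAA IN>: all the configured "
        "stub or forward servers failed, at zone .",
        "Mar 9 06:50:19 lenny apmd: system resumed from sleep",
    ]
    fails = find_events(sample, "alix", "SERVFAIL")
    resumes = find_events(sample, "lenny", "resumed from sleep")
    print("first SERVFAIL:", fails[0])
    print("last SERVFAIL: ", fails[-1])
    print("lenny resumed: ", resumes[0])
```

If the last SERVFAIL consistently falls within a few seconds of lenny's resume across several nights, that would support the correlation; if not, it was likely a coincidence.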