I filed two issues for Blackbox on GitHub: one for exposing at least the
'tc' flag state as a metric, and one for letting you have Blackbox set
an increased EDNS reply size (which the underlying Go DNS library that
Blackbox uses already supports). I didn't file an issue for UDP to TCP
fallback because I suspect that this is out of scope for Blackbox, and
anyway it raises design questions such as how the metrics should work
(since on a fallback Blackbox is now making two DNS requests).

For any interested parties, these are:
https://github.com/prometheus/blackbox_exporter/issues/1258
https://github.com/prometheus/blackbox_exporter/issues/1259
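
To make the two requests concrete, here is a rough Python sketch of what
is involved at the DNS wire level: checking the 'tc' bit in a reply
header, and adding an EDNS0 OPT record that advertises a larger UDP
reply size. This is not Blackbox's code (Blackbox is written in Go), the
function names are made up for illustration, and the 1232-byte size in
the usage note is only an example value.

```python
import struct

TC_BIT = 0x0200  # "truncated" bit in the 16-bit DNS header flags word


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed DNS labels."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name, qtype=1, query_id=0x1234, edns_size=None):
    """Build a DNS query; if edns_size is set, append an EDNS0 OPT record."""
    arcount = 1 if edns_size else 0
    # Header: ID, flags (only RD set), QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, arcount)
    question = encode_name(name) + struct.pack("!HH", qtype, 1)  # class IN
    msg = header + question
    if edns_size:
        # OPT pseudo-record: root name, type 41, the CLASS field carries
        # the advertised UDP payload size, TTL 0, no options (RDLEN 0).
        msg += b"\x00" + struct.pack("!HHIH", 41, edns_size, 0, 0)
    return msg


def is_truncated(reply: bytes) -> bool:
    """Report whether a raw DNS reply has the 'tc' flag set."""
    if len(reply) < 12:
        raise ValueError("short DNS message")
    (flags,) = struct.unpack("!H", reply[2:4])
    return bool(flags & TC_BIT)
```

Something like build_query("example.com", edns_size=1232) gives a query
that tells the server it may send UDP replies larger than 512 bytes, and
is_truncated() applied to the raw reply bytes reports the same thing as
seeing 'tc' in the probe log's flags line.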

        - cks

> Thanks for the detailed post. Sounds like a feature request/bug report. I
> would file an issue on GitHub, this should be easily solved.
>
> https://github.com/prometheus/blackbox_exporter/issues
>
> On Wed, Jun 26, 2024 at 12:19 AM Chris Siebenmann <
> [email protected]> wrote:
>
> > To make a long story short, we've been having mysterious probe failures
> > with one of our Blackbox DNS probes against (only) some DNS servers.
> > These turned out to be because Blackbox UDP DNS probes have a 512-byte
> > limit on the size of the reply: Blackbox doesn't currently set EDNS
> > options to increase the allowed reply size, and it doesn't fall back to
> > a TCP query if the UDP query fails because of truncation. We think this
> > was partially due to these DNS servers using DNS cookies, which
> > increases the reply size.
> >
> > (Our DNS probe checks not just for a successful reply but that the query
> > resolved to at least one A record, so some of the time the reply could
> > be long enough that the truncated version didn't include any of the A
> > records.)
> >
> > Right now the only way to know for sure that your DNS query failed
> > because of truncation is to examine Blackbox probe logs, usually through
> > its web interface (but you can manually query with '..&debug=true'), and
> > notice that one of the log messages reports something like 'flags: qr tc
> > rd ra;' (the 'tc' is the important bit). If you are sure you know how
> > many resource records should be in the various sections of the DNS
> > replies, you can check if the probe got the right number of RRs using
> > the probe_dns_*_rrs metrics.
> >
> > For DNS servers that accept TCP connections, you can work around this by
> > switching your Blackbox DNS module to using TCP instead of the (default)
> > UDP.
> >
> > (I suspect that most people will never run into this, but for our sins
> > we check some external DNS names that have long CNAME chains and other
> > fun things.)
> >
> >         - cks

-- 
You received this message because you are subscribed to the Google Groups 
"Prometheus Users" group.
