** Description changed:
- The bug occurs when the use-stale-cache feature is enabled alongside
- DNSSEC validation.
+ [ Impact ]
- When a query response is served from cache, dnsmasq immediately returns
- it to the client. However, this bypasses the normal retry mechanism that
- dnsmasq's DNSSEC implementation depends on. Specifically, dnsmasq
- expects clients to retry truncated DNSSEC queries over TCP.
+ * Enabling 'use-stale-cache' alongside DNSSEC validation can cause repeated
+ validation attempts for every cached response if the original DNSKEY
+ response is truncated.
+
+ * dnsmasq normally expects clients to retry truncated queries over TCP, but
+ 'use-stale-cache' immediately serves a stale response to the client, never
+ triggering a TCP retry. Consequently, DNSSEC validation continuously fails.
+
+ * The upstream commit [1] fixes this issue by internally moving to TCP for
+ DNSSEC queries when validating UDP answers.
+
+ [1]:
+ https://github.com/imp/dnsmasq/commit/f5cdb007d8845dde8e5053229f47b46b1b756473
+
+ [ Test Plan ]
+
+ * Execute the following script:
+
+ ```
+ dnsport=5353
+ logfile=$(mktemp)
+
+ trap '
+ kill $(jobs -p) 2>/dev/null
+ wait $(jobs -p) 2>/dev/null
+ ' EXIT INT TERM
+
+ dnsmasq \
+ --no-daemon \
+ --port $dnsport \
+ --server=8.8.8.8 \
+ --no-resolv \
+ --log-queries \
+ --log-facility=- \
+ --use-stale-cache \
+ --dnssec \
+ --max-cache-ttl=1 \
+ --edns-packet-max=512 \
+
--trust-anchor=.,20326,8,2,E06D44B80B8F1D39A95C0B0D7C65D08458E880409BBC683457104237C7F8EC8D
\
+ 2>&1 | tee $logfile &
+
+ sleep 2
+
+ echo
+ echo '1. Without TCP retry (+ignore): DNSSEC validation FAILS'
+ dig -p $dnsport @127.0.0.1 cloudflare.com +ignore 2>&1 | grep -E
'status:|Truncated'
+ grep 'validation result' $logfile | head -1
+
+ echo
+ echo '2. With TCP retry: validation succeeds'
+ dig -p $dnsport @127.0.0.1 cloudflare.com 2>&1 | grep -E 'status:|Truncated'
+ grep 'SECURE' $logfile | head -1
+
+ echo
+ echo '3. From cache: returns instantly (0ms), background refresh has no TCP
retry'
+ sleep 3
+ dig -p $dnsport @127.0.0.1 cloudflare.com 2>&1 | grep -E 'Query time|EDE'
+ grep -E 'cached-stale|forwarded' $logfile | tail -2
+
+ echo "full dnsmasq log in $logfile"
+ ```
+
+ * The most important section is the step "1. Without TCP retry (+ignore)
...".
+ The validation result should be SECURE (and not TRUNCATED).
+
+ [ Where problems could occur ]
+
+ * The patch is not trivial and had to be backported to both Jammy and Noble
+ since it didn't apply cleanly.
+
+ * Since the patch introduces automatically initiated TCP connections,
+ regressions would likely manifest in TCP connection handling/network
timeout
+ handling.
+
+ * The new internal TCP retries could theoretically hang or time out if an
+ upstream DNS server or intermediate firewall blocks TCP.
+
+ [ Other Info ]
+
+ * Fix already included in upstream versions >= 2.91.
+
+ [ Original Bug Description ]
+ The bug occurs when the use-stale-cache feature is enabled alongside DNSSEC
+ validation.
+
+ When a query response is served from cache, dnsmasq immediately returns it to
+ the client. However, this bypasses the normal retry mechanism that dnsmasq's
+ DNSSEC implementation depends on. Specifically, dnsmasq expects clients to
+ retry truncated DNSSEC queries over TCP.
When background cache refresh requires DNSSEC validation and the DNSKEY
- response exceeds 1232 bytes (which is typical for the root DNSKEY), the
- query is truncated. Since the client never retries, validation fails.
- This triggers repeated validation attempts for every cached response. It
- has resulted in fleet-wide query storm that can persist for up to 48
- hours (the TTL of the root DNSKEY).
+ response exceeds 1232 bytes (which is typical for the root DNSKEY), the query
+ is truncated. Since the client never retries, validation fails. This triggers
+ repeated validation attempts for every cached response. It has resulted in
+ fleet-wide query storm that can persist for up to 48 hours (the TTL of the
root
+ DNSKEY).
- This issue is fixed in upstream commit f5cdb00, which performs the TCP
- retry internally without requiring the client to trigger it. This fix is
- included in dnsmasq 2.91 but is not present in version 2.90 currently
- available in Jammy and Focal repositories.
+ This issue is fixed in upstream commit f5cdb00, which performs the TCP retry
+ internally without requiring the client to trigger it. This fix is included in
+ dnsmasq 2.91 but is not present in version 2.90 currently available in Jammy
+ and Focal repositories.
** Changed in: dnsmasq (Ubuntu Noble)
Status: New => In Progress
** Changed in: dnsmasq (Ubuntu Jammy)
Status: New => In Progress
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2138412
Title:
DNSSEC validation with stale cache enabled does not properly retry
truncated response
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/dnsmasq/+bug/2138412/+subscriptions
--
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs