I've cherry-picked the upstream patches and built the package in my bugfixes PPA:

https://launchpad.net/~tj/+archive/ubuntu/bugfixes

Verified that it solves the issue even with a 1000ms delay imposed by the
router, using:

## Example traffic control to slow down UDP port 53 traffic from a
## specific upstream DNS server, forwarded by the router for egress from
## the LOCAL bridge device.

# tc qdisc add dev LOCAL root handle 1:0 prio
# tc qdisc add dev LOCAL parent 1:2 handle 10: netem delay 1000ms
# tc filter add dev LOCAL protocol ipv6 parent 1: prio 1 u32 match ip6 src fddc:7e00:e001:ee00::1/64 match ip6 sport 53 0xffff flowid 10:1
# tc filter add dev LOCAL protocol ipv6 parent 1: prio 1 u32 match ip6 dst fddc:7e00:e001:ee00::1/64 match ip6 dport 53 0xffff flowid 10:1
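For completeness, a sketch of how to undo the test setup afterwards (assumes
the same LOCAL bridge device and root privileges; deleting the root qdisc
removes the attached netem child and the u32 filters along with it):

```shell
# Remove the prio root qdisc; the netem child and u32 filters go with it.
tc qdisc del dev LOCAL root
# Confirm the device has reverted to its default qdisc.
tc qdisc show dev LOCAL
```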

# tc -s qdisc ls dev LOCAL
qdisc prio 1: root refcnt 2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 4643351 bytes 7676 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 138b 1p requeues 0
qdisc netem 10: parent 1:2 limit 1000 delay 1s
 Sent 2682417 bytes 3245 pkt (dropped 0, overlimits 0 requeues 0) 
 backlog 138b 1p requeues 0


## prio[rity] creates 3 bands (classes :1 :2 :3) by default.
## Interactive/immediate packets (UDP 53 DNS) should have the Type Of
## Service bit (TOS 0x10, "Minimise delay") set in the IP packet header by
## the resolvers. The default priomap puts those packets in the 2nd band
## (:2 for Interactive/Minimise delay). The netem delay qdisc is attached
## to parent 1:2 with handle 10: (major:minor - minor defaults to 0). u32
## (unsigned 32-bit) filters that match the UDP port 53 traffic direct it
## to the handle of the netem qdisc (flowid 10:1 - :1 being the first
## leaf), where the 1000ms delay is imposed.
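A quick sanity check that the filters really steer the DNS traffic into the
delayed band (a sketch, assuming `dig` is available and using the resolver
address from the filters above; note both directions are filtered, so the
measured round trip may include the delay twice depending on topology):

```shell
# A lookup through the delayed path should report a query time inflated by
# roughly the configured netem delay (vs. a few ms without it):
dig @fddc:7e00:e001:ee00::1 packages.ubuntu.com AAAA | grep 'Query time'
# The netem qdisc's packet/byte counters should also be increasing:
tc -s qdisc show dev LOCAL
```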


# tcpdump -vvvni enp2s0 "(ip6 and port 53) or (icmp6[icmp6type] = 1 and icmp6[icmp6code] = 4)"
... 
21:01:49.232778 IP6 (flowlabel 0xc8a82, hlim 64, next-header UDP (17) payload length: 56) fddc:7e00:e001:ee00:fa75:a4ff:fef3:42b4.59484 > fddc:7e00:e001:ee00::1.53: [bad udp cksum 0x7528 -> 0x9b42!] 25832+ [1au] AAAA? packages.ubuntu.com. ar: . OPT UDPsize=512 (48)
21:01:49.232862 IP6 (flowlabel 0x9137e, hlim 64, next-header UDP (17) payload length: 56) fddc:7e00:e001:ee00:fa75:a4ff:fef3:42b4.43177 > fddc:7e00:e001:ee00::1.53: [bad udp cksum 0x7528 -> 0x5114!] 61129+ [1au] AAAA? packages.ubuntu.com. ar: . OPT UDPsize=512 (48)
21:01:49.319885 IP6 (flowlabel 0x5decb, hlim 63, next-header UDP (17) payload length: 84) fddc:7e00:e001:ee00::1.53 > fddc:7e00:e001:ee00:fa75:a4ff:fef3:42b4.43177: [udp sum ok] 61129 q: AAAA? packages.ubuntu.com. 1/0/1 packages.ubuntu.com. [10m] AAAA 2a01:7e00:e001:ee64::5bbd:5e25 ar: . OPT UDPsize=1232 (76)
21:01:49.319920 IP6 (flowlabel 0x45773, hlim 63, next-header UDP (17) payload length: 84) fddc:7e00:e001:ee00::1.53 > fddc:7e00:e001:ee00:fa75:a4ff:fef3:42b4.59484: [udp sum ok] 25832 q: AAAA? packages.ubuntu.com. 1/0/1 packages.ubuntu.com. [10m] AAAA 2a01:7e00:e001:ee64::5bbd:5e25 ar: . OPT UDPsize=1232 (76)

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1940908

Title:
  resolved: closes listening socket too rapidly and sends Destination
  port unreachable

Status in systemd package in Ubuntu:
  Incomplete

Bug description:
  Affects Ubuntu 18.04 through 21.04 (the fixes are in systemd v248).

  With systemd v245 (and v247) and systemd-resolved we're seeing
  frequent problems due to resolved rapidly closing the socket on which
  it sends out a query before the server has answered. The server
  answers and then resolved sends an ICMP Destination Unreachable (Port
  Unreachable) response!

  This breaks name lookups frequently. In our case the DNS server is
  reached via a Wireguard tunnel over a satellite link and latencies can
  vary.

  A typical example captured via tcpdump:

  07:22:03.446919 IP6 fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4.45338 > fddc:7e00:e001:ee00::1.53: 2963+ [1au] AAAA? contile-images.services.mozilla.com. (64)
  07:22:03.501089 IP6 fddc:7e00:e001:ee00::1.53 > fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4.45338: 2963 1/0/1 AAAA 2a01:7e00:e001:ee64::2278:7366 (92)
  07:22:03.501152 IP6 fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4 > fddc:7e00:e001:ee00::1: ICMP6, destination unreachable, unreachable port, fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4 udp port 45338, length 148

  The time difference here is only 0.054170 seconds, and there is no way to
  alter the timeout in resolved.

  There are recent upstream commits to fix this which ought to be
  cherry-picked. See:

  https://github.com/systemd/systemd/issues/17421

  https://github.com/systemd/systemd/pull/17535

  
https://github.com/systemd/systemd/commit/e03d156f78cb5a0cac85d1e1310d89fdfa4f1b88

  If I am reading the code correctly the timeout is very short:

  src/resolve/resolved-dns-transaction.c:22:#define DNS_TIMEOUT_USEC
  (SD_RESOLVED_QUERY_TIMEOUT_USEC / DNS_TRANSACTION_ATTEMPTS_MAX)

  src/resolve/resolved-def.h:79:#define SD_RESOLVED_QUERY_TIMEOUT_USEC
  (120 * USEC_PER_SEC)

  src/resolve/resolved-dns-transaction.h:212:#define
  DNS_TRANSACTION_ATTEMPTS_MAX 24

  So DNS_TIMEOUT_USEC works out to (120 * USEC_PER_SEC) / 24 = 5 seconds per
  query, with, as inferred, up to 24 attempts (I don't see multiple duplicate
  requests on the wire, so I'm not sure DNS_TRANSACTION_ATTEMPTS_MAX affects
  this).
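  The macro arithmetic above can be sanity-checked directly in the shell
  (values copied from the source lines quoted above; note the result is in
  microseconds, i.e. 5 seconds per transaction, not 5 microseconds):

```shell
# SD_RESOLVED_QUERY_TIMEOUT_USEC = 120 * USEC_PER_SEC
query_timeout_usec=$((120 * 1000000))
# DNS_TIMEOUT_USEC = SD_RESOLVED_QUERY_TIMEOUT_USEC / DNS_TRANSACTION_ATTEMPTS_MAX
dns_timeout_usec=$((query_timeout_usec / 24))
# 5000000 usec per transaction, i.e. 5 seconds
echo "${dns_timeout_usec} usec = $((dns_timeout_usec / 1000000)) s per transaction"
```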

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1940908/+subscriptions


-- 
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp

Reply via email to