Public bug reported:

Ubuntu 22.04.3 LTS
systemd 249.11-0ubuntu3.12

The systemd issue tracker says this version is too old to report upstream and that I should report it to the downstream bug tracker instead.

IPv6 default routes are being lost and not renewed. We use IPv6 router advertisements (RAs) to provide default routes for our servers and desktops. The RAs come from HP/Aruba routers and carry a short lifetime of about 46 seconds. Occasionally the default routes get dropped and, despite RAs still being received, they are never recreated. The most recently affected machine had a user running an excessively large job (load average 157).
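Since the useful evidence is whether RAs keep arriving while the kernel route is gone, something like the following crude monitoring sketch could be left running on an affected host. The log paths and the 10-second poll interval are arbitrary choices, and the capture filter assumes the RAs carry no IPv6 extension headers, so the ICMPv6 type byte sits at offset 40:

```sh
# Capture incoming router advertisements (ICMPv6 type 134) with timestamps.
tcpdump -l -n -tttt -i bond0 'icmp6 and ip6[40] == 134' >/var/log/ra-trace.log 2>&1 &

# In parallel, poll the kernel routing table well inside the ~46 s route
# lifetime, so the moment the default route vanishes is recorded.
while sleep 10; do
    printf '%s ' "$(date -Is)"
    ip -6 route show default | tr '\n' ' '
    echo
done >> /var/log/v6-default-route.log
```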
This is the state of the network when the machine is working:

```sh
# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP group default qlen 1000
    link/ether 2c:ea:7f:56:9a:66 brd ff:ff:ff:ff:ff:ff
    altname enp4s0f0
3: eno2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP group default qlen 1000
    link/ether 2c:ea:7f:56:9a:66 brd ff:ff:ff:ff:ff:ff permaddr 2c:ea:7f:56:9a:67
    altname enp4s0f1
4: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 2c:ea:7f:56:9a:66 brd ff:ff:ff:ff:ff:ff
    inet xxx.xxx.202.112/24 brd 129.215.202.255 scope global bond0
       valid_lft forever preferred_lft forever
    inet6 xxxx:xxx:xxx:202:2eea:7fff:fe56:9a66/64 scope global dynamic mngtmpaddr noprefixroute
       valid_lft 2591994sec preferred_lft 604794sec
    inet6 fe80::2eea:7fff:fe56:9a66/64 scope link
       valid_lft forever preferred_lft forever

# ip -6 r
::1 dev lo proto kernel metric 256 pref medium
xxxx:xxx:xxx:202::/64 dev bond0 proto ra metric 1024 expires 2591998sec pref medium
fe80::/64 dev bond0 proto kernel metric 256 pref medium
default proto ra metric 1024 expires 28sec pref medium
        nexthop via fe80::609:73ff:fe48:c000 dev bond0 weight 1
        nexthop via fe80::609:73ff:fe48:6500 dev bond0 weight 1
```

When the problem arises, the last three lines (the default route and its two nexthops) disappear. `tcpdump icmp6` shows RAs still being received, but networkd does not recreate the routes in the kernel. The machine keeps its IPv6 addresses, but without a default route it cannot make outgoing IPv6 connections or answer incoming ones.

Sorry, the reproduction method is unclear; here is a best guess:

1. Configure networkd using netplan:

```yaml
---
network:
  bonds:
    bond0:
      addresses:
        - xxx.xxx.202.112/24
      dhcp4: false
      interfaces:
        - eth0
        - eth1
      macaddress: 2C:EA:7F:56:9A:66
      parameters:
        mii-monitor-interval: 1
        mode: active-backup
  ethernets:
    eth0:
      dhcp4: false
      match:
        macaddress: 2C:EA:7F:56:9A:66
    eth1:
      dhcp4: false
      match:
        macaddress: 2C:EA:7F:56:9A:67
  renderer: networkd
  version: 2
```

2. Load the machine heavily, or just wait. Possibly this is related to packets being dropped under load, but I would expect the system to recover once the load is removed.

3. Note the lack of IPv6 connectivity, inability to log in with ssh, etc.
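If it helps triage, systemd-networkd's debug logging could be enabled ahead of the next occurrence so its RA and route handling is captured. A sketch using the standard service drop-in mechanism (the drop-in file name and the log file name are arbitrary choices):

```sh
# Raise systemd-networkd's log level via a service drop-in.
mkdir -p /etc/systemd/system/systemd-networkd.service.d
cat > /etc/systemd/system/systemd-networkd.service.d/10-debug.conf <<'EOF'
[Service]
Environment=SYSTEMD_LOG_LEVEL=debug
EOF
systemctl daemon-reload
systemctl restart systemd-networkd

# After the routes disappear again, collect networkd's view of the link
# and its recent log output:
networkctl status bond0
journalctl -u systemd-networkd --since "1 hour ago" > networkd-debug.log
```

The drop-in would need to be in place before the problem recurs, so that the relevant decisions are actually logged.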
** Affects: systemd (Ubuntu)
   Importance: Undecided
       Status: New

https://bugs.launchpad.net/bugs/2053288

Title:
  systemd-networkd IPv6 default routes dropped under load, don't recover

Status in systemd package in Ubuntu:
  New
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/2053288/+subscriptions