Package: cloud.debian.org
Severity: major
User: cloud.debian....@packages.debian.org
Usertags: aws

Problem:

Production systems in AWS lose all network connectivity 1h after boot
once a dist-upgrade from Debian 9 to Debian 10 has been performed.
One can't ssh in to investigate, and no remote console exists in AWS.
Fortunately, you *can* restart the EC2 instance, which generates a new
DHCP lease and gives you another 1h of access before it is cut again.

How to reproduce:

Install a Debian 9 machine using the official Debian 9 AMI.
During the hardening of the machine, disable IPv6 completely:

    # cat /etc/sysctl.d/disable_ipv6.conf
    net.ipv6.conf.all.disable_ipv6 = 1
    net.ipv6.conf.default.disable_ipv6 = 1
    net.ipv6.conf.eth0.disable_ipv6 = 1
    net.ipv6.conf.lo.disable_ipv6 = 1

This hardened Debian 9 server works perfectly for a year.

Now perform a dist-upgrade to Debian 10. Everything looks good, with no
errors during the upgrade. After the final reboot, the server comes
online as it should.

BUT... after 1 hour we suddenly lose all access to the server.
Restarting the EC2 instance brings the access back, only for it to be
lost again 1h later.

(Unfortunately, neither dhclient nor the cloud-init scripts logged any
error to syslog, so it was pretty hard to figure out what was wrong.)

It turns out that the IPv6 hardening is what causes problems for
dhclient/ifup.

I believe the problem lies in /sbin/dhclient-script:

    if [ -n "$old_ip_address" ] && [ "$old_ip_address" != "$new_ip_address" ]; then
        # leased IP has changed => flush it
        ip -4 addr flush dev ${interface} label ${interface}
    fi

My guess is that when dhclient fails to set an IPv6 address, the above
code flushes the IPv4 address currently configured on the machine,
making it lose all network connectivity.

My current workaround is to *not* apply the above IPv6 hardening; the
server then works fine.

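To test the guess above, a small debug hook could record what
dhclient-script sees at each lease event. The following is only a
sketch, not something I have run on the affected machines: the file
name debug-lease and both function names are made up for illustration,
and it assumes the stock Debian behaviour where dhclient-script sources
files from /etc/dhcp/dhclient-enter-hooks.d/ with $reason, $interface,
$old_ip_address and $new_ip_address already set.

```shell
# Hypothetical debug hook, e.g. saved as
# /etc/dhcp/dhclient-enter-hooks.d/debug-lease (name is an assumption).

# Return 0 (true) when the condition quoted from dhclient-script would
# flush the IPv4 address: old address non-empty and different from new.
would_flush_ipv4() {
    [ -n "$1" ] && [ "$1" != "$2" ]
}

# Print one line describing the current lease event and the verdict.
describe_lease_event() {
    if would_flush_ipv4 "${old_ip_address:-}" "${new_ip_address:-}"; then
        verdict="flush"
    else
        verdict="keep"
    fi
    printf 'dhclient reason=%s if=%s old=%s new=%s ipv4=%s\n' \
        "${reason:-?}" "${interface:-?}" \
        "${old_ip_address:-}" "${new_ip_address:-}" "$verdict"
}

# Only log when sourced by dhclient-script, which sets $reason.
if [ -n "${reason:-}" ]; then
    describe_lease_event | logger -t dhclient-debug
fi
```

Note that under this condition an empty $new_ip_address combined with a
non-empty $old_ip_address also counts as "changed", so a log line with
old=<address> and new= empty just before the outage would support the
guess.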
My /etc/network/interfaces configuration:

    # interfaces(5) file used by ifup(8) and ifdown(8)
    # Include files from /etc/network/interfaces.d:
    source-directory /etc/network/interfaces.d

    auto lo
    iface lo inet loopback

    auto eth0
    iface eth0 inet dhcp
    allow-hotplug eth0
    iface eth0 inet6 manual
        up /usr/local/sbin/inet6-ifup-helper
        down /usr/local/sbin/inet6-ifup-helper

    iface eth1 inet dhcp
    allow-hotplug eth1
    iface eth1 inet6 manual
        up /usr/local/sbin/inet6-ifup-helper
        down /usr/local/sbin/inet6-ifup-helper

    iface eth2 inet dhcp
    allow-hotplug eth2
    iface eth2 inet6 manual
        up /usr/local/sbin/inet6-ifup-helper
        down /usr/local/sbin/inet6-ifup-helper

    iface eth3 inet dhcp
    allow-hotplug eth3
    iface eth3 inet6 manual
        up /usr/local/sbin/inet6-ifup-helper
        down /usr/local/sbin/inet6-ifup-helper

    iface eth4 inet dhcp
    allow-hotplug eth4
    iface eth4 inet6 manual
        up /usr/local/sbin/inet6-ifup-helper
        down /usr/local/sbin/inet6-ifup-helper

    iface eth5 inet dhcp
    allow-hotplug eth5
    iface eth5 inet6 manual
        up /usr/local/sbin/inet6-ifup-helper
        down /usr/local/sbin/inet6-ifup-helper

    iface eth6 inet dhcp
    allow-hotplug eth6
    iface eth6 inet6 manual
        up /usr/local/sbin/inet6-ifup-helper
        down /usr/local/sbin/inet6-ifup-helper

    iface eth7 inet dhcp
    allow-hotplug eth7
    iface eth7 inet6 manual
        up /usr/local/sbin/inet6-ifup-helper
        down /usr/local/sbin/inet6-ifup-helper

    iface eth8 inet dhcp
    allow-hotplug eth8
    iface eth8 inet6 manual
        up /usr/local/sbin/inet6-ifup-helper
        down /usr/local/sbin/inet6-ifup-helper

Log:

    Jul 8 10:13:36 foobar ifup[363]: RTNETLINK answers: File exists
    Jul 8 10:13:36 foobar ifup[363]: invoke-rc.d: could not determine current runlevel
    Jul 8 10:13:36 foobar dhclient[571]: bound to 10.75.75.75 -- renewal in 1491 seconds.
    Jul 8 10:13:36 foobar ifup[363]: bound to 10.75.75.75 -- renewal in 1491 seconds.
    Jul 8 10:13:36 foobar ifup[363]: Could not get a link-local address
    Jul 8 10:13:36 foobar ifup[363]: ifup: failed to bring up eth0

    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
        link/ether 06:ce:43:75:75:75 brd ff:ff:ff:ff:ff:ff

Additional findings:

If I compare the contents of the dir /etc/network/ of this 9-->10
dist-upgraded machine, it differs from a machine that is installed
directly with the Debian 10 AMI:

    dist-upgraded:/etc/network> ls
    if-down.d/  if-post-down.d/  if-pre-up.d/  if-up.d/  interfaces  interfaces.d/

    pure deb10:/etc/network> ls
    cloud-ifupdown-helper*     if-down.d/       if-pre-up.d/  interfaces
    cloud-interfaces-template  if-post-down.d/  if-up.d/      interfaces.d/

This makes me think that the cloud-init package for Debian 10 does
something wrong.

Somewhat related bug: #846583

/Martin