------- Comment From gpicc...@br.ibm.com 2016-08-24 12:34 EDT-------

Thanks very much vorlon and pitti. Pretty nice findings!
But... the issue still persists. I'll summarize the tests I made:

1) First, I changed the /usr/bin/readlink link and, as vorlon predicted, this didn't solve the issue.

2) Then, independently of (1), I applied pitti's patch to xenial's "73-usb-net-by-mac.rules" and... unfortunately it also didn't solve the issue.

What impresses me most is how easily even the simplest debugging interferes with the issue! After testing pitti's patch, and with the patch still applied, I changed the udevd invocation in start-udev like this:

(before)
SYSTEMD_LOG_LEVEL=notice /lib/systemd/systemd-udevd --daemon --resolve-names=never

(after my change)
SYSTEMD_LOG_LEVEL=debug /lib/systemd/systemd-udevd --daemon --resolve-names=never --debug

Well, the issue reproduced and I didn't see a single extra log message. After this, I kept both pitti's patch and this systemd debug parameter, but booted with the command-line "BOOT_DEBUG=1". Guess what? I was flooded with messages, but the installer showed up. This is really weird to me... I'll attach a screen session of this last trial.

I'd appreciate any suggestions you have for debugging the issue further. By the way, using "net.ifnames=1" works around the issue too. Basically, any command-line option seems to solve it, even the simplest debug parameter.

Thanks very much for the help and advice,

Guilherme

--
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1615021

Title: Unable to network boot Ubuntu 16.04 installer normally on Briggs

Status in busybox package in Ubuntu: Fix Committed
Status in debian-installer package in Ubuntu: Triaged
Status in systemd package in Ubuntu: Fix Committed
Status in busybox source package in Xenial: Won't Fix
Status in debian-installer source package in Xenial: Triaged
Status in systemd source package in Xenial: In Progress
Status in busybox source package in Yakkety: Fix Committed
Status in debian-installer source package in Yakkety: Triaged
Status in systemd source package in Yakkety: Fix Committed

Bug description:

== Comment: #7 - Guilherme Guaglianoni Piccoli <gpicc...@br.ibm.com> - 2016-08-19 10:08:07 ==

The normal procedure to perform a Netboot installation of Ubuntu 16.04 is to download the latest vmlinux and initrd.gz files available, and kexec them with no parameters (at least on ppc64el). We're experiencing a strange issue in which the installer freezes before the menus are shown. The system hangs at the point specified below, right after the i40e driver initialization:

[ 11.052832] i40e 0002:01:00.0 enP2p1s0f0: renamed from eth0
[ 11.073976] i40e 0002:01:00.1 enP2p1s0f1: renamed from eth1
[ 11.117799] i40e 0002:01:00.2 enP2p1s0f2: renamed from eth2
[ 11.225745] i40e 0002:01:00.3 enP2p1s0f3: renamed from eth3
***HANG***

The most difficult part of this issue is that it seems to be a timing issue/race condition, and many debugging attempts end up preventing the issue from reproducing (heisenbug). We did succeed, though, in getting logs by booting the kernel with the command-line "BOOT_DEBUG=2" and by changing the initrd to enable systemd debug; only the files "init" and "start-udev" were changed in the initrd, and both are attached here.
We've attached here a saved screen session that shows the entire boot process until it gets flooded with lots of messages like:

"starting '/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules'
'/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules'(err) 'failed to execute '/bin/readlink' '/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules': No such file or directory'
seq 3244 queued, 'add' 'pci_bus'
starting '/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules'
passed 408 byte device to netlink monitor 0x1003cfe8020seq 3236 running'/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules'(err) 'failed to execute '/bin/readlink' '/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules': No such file or directory'
'/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules'(err) 'failed to execute '/bin/readlink' '/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules': No such file or directory'
Process '/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules' failed with exit code 2.
PROGRAM '/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules' /lib/udev/rules.d/73-usb-net-by-mac.rules:6
passed device to netlink monitor 0x1003d01f730
"

It then remains hung at this stage.

We re-tested it by changing the file 73-usb-net-by-mac.rules in the initrd, replacing "/etc/udev/rules.d/80-net-setup-link.rules" with "/lib/udev/rules.d/80-net-setup-link.rules", since the former does not exist whereas the latter does. The same issue was observed!

Notice that if we boot the installer with the command-line "net.ifnames=0" or "net.ifnames=1", the problem no longer reproduces.

We would like to ask for Canonical's help in investigating this issue.
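As a rough illustration of the path check described above (the /etc copy of the rules file is absent while the /lib copy exists), the following sketch reports which location holds 80-net-setup-link.rules in a given root tree, such as an unpacked initrd. The function name and the root-prefix argument are hypothetical and not part of any tool mentioned in this report.

```shell
#!/bin/sh
# Hypothetical helper: print the first location of 80-net-setup-link.rules
# found under a root tree. "$1" is an optional root prefix (for example the
# directory where an initrd was unpacked); empty means the running system.
find_net_setup_link_rules() {
    root=${1:-}
    for dir in /etc/udev/rules.d /lib/udev/rules.d; do
        path="$root$dir/80-net-setup-link.rules"
        # -e follows symlinks; -L additionally catches a dangling symlink.
        if [ -e "$path" ] || [ -L "$path" ]; then
            echo "$path"
            return 0
        fi
    done
    # Neither location contains the file.
    return 1
}
```

Calling find_net_setup_link_rules with the unpacked initrd directory as its argument would print the /lib path if only that copy is present, and return a non-zero status if neither exists.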
Thanks,

Guilherme

SRU INFORMATION for systemd
===========================

Test case:

* Check what happens for uevents on devices which are not USB network interfaces:

    udevadm test /sys/devices/virtual/mem/null
    udevadm test /sys/class/net/lo

  With the current version these will run

    PROGRAM '/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules' /lib/udev/rules.d/73-usb-net-by-mac.rules:6

  which is pointless. With the proposed version these should be gone.

* Ensure that the rule still works as intended by connecting a USB network device that has a permanent MAC address (e.g. Android tethering uses a temporary MAC): You should get a MAC-based name like "enx12345678" for it. Now disconnect it again, disable ifnames with

    sudo ln -s /dev/null /etc/udev/rules.d/80-net-setup-link.rules

  and reconnect the device. You should now get a kernel name like "usb0" for it.

* Regression potential: Errors in the rule could break persistent naming - or its disabling - of USB network interfaces. Running the above test carefully is important to ensure this keeps working. This has little to no actual effect on anything else on the system (aside from a performance impact and spamming logs), so overall the regression potential is low.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/busybox/+bug/1615021/+subscriptions

--
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp
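The two checks in the SRU test case above could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, the first function simply looks for the PROGRAM line quoted in the test case, and the second assumes that "disabled ifnames" means exactly the /dev/null symlink created by the ln -s command shown there.

```shell
#!/bin/sh
# The PROGRAM line quoted in the SRU test case.
RULE_PROGRAM="PROGRAM '/bin/readlink /etc/udev/rules.d/80-net-setup-link.rules'"

# Hypothetical helper: reads "udevadm test" output on stdin and succeeds
# if the readlink PROGRAM line appears in it.
ran_readlink_program() {
    grep -qF -- "$RULE_PROGRAM"
}

# Hypothetical helper: succeeds if 80-net-setup-link.rules in the given
# rules directory (default /etc/udev/rules.d) is a symlink to /dev/null.
ifnames_masked() {
    dir=${1:-/etc/udev/rules.d}
    [ "$(readlink "$dir/80-net-setup-link.rules" 2>/dev/null)" = "/dev/null" ]
}
```

For instance, piping the combined output of udevadm test /sys/class/net/lo (with 2>&1, on the assumption that the debug messages go to stderr) into ran_readlink_program should succeed with the current version and fail with the proposed one.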