Here at Mozilla, we have 200 servers running on an HP Moonshot system, all with the same hardware configuration and Ubuntu 16.04.2. The OS is not up to date; we use it as it was released. We use a program to test Firefox source code, and after each test we reboot the servers using /sbin/reboot. After a while (24-48 h, during which ~6 reboots/h are made), all 200 servers randomly get stuck during reboot (see the iLO capture), and to bring them back we have to power cycle each of them.

On one of the beta servers, we made the changes below, enabled debug logging, and set cron to reboot the server after 5-10 min; however, the reboot freeze is still present:
- upgraded the OS to Ubuntu 16.04.5 with the latest packages;
- used GRUB_CMDLINE_LINUX_DEFAULT="reboot=bios";
- used GRUB_CMDLINE_LINUX_DEFAULT="acpi=off";
- used GRUB_CMDLINE_LINUX_DEFAULT="reboot=force";
- upgraded the kernel to v4.15 (the main one from Ubuntu's repo);
- upgraded the kernel to v4.20 from https://kernel.ubuntu.com/~kernel-ppa/mainline/
- we are now testing the reboot with 4.20.3 from the above repo and working to update systemd.

Attached you can find the debug logs for:
- kernel 4.4.0-66-generic #87-Ubuntu: shutdown-debuglogkernel-4.4.txt
- kernel 4.15: shutdown-log-kernel4-15.txt
- kernel 4.20: shutdown-log-kernel420.txt
- ILO capture of the freeze: ILO-reboot-freeze.PNG

Please check all these logs and the capture and let us know of a solution.

Thanks.

** Attachment added: "UbuntuBug.zip"
   https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1783499/+attachment/5230309/+files/UbuntuBug.zip

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to dbus in Ubuntu.
https://bugs.launchpad.net/bugs/1783499

Title:
  systemd: Failed to send signal

Status in dbus package in Ubuntu:
  Confirmed
Status in systemd package in Ubuntu:
  Confirmed

Bug description:
  systemd: Failed to send signal.

  [ 3.137257] systemd[1]: Failed to send job remove signal for 109: Connection reset by peer
  [ 3.138119] systemd[1]: run-rpc_pipefs.mount: Failed to send unit change signal for run-rpc_pipefs.mount: Transport endpoint is not connected
  [ 3.138185] systemd[1]: dev-mapper-ubuntu\x2d\x2dvg\x2droot.device: Failed to send unit change signal for dev-mapper-ubuntu\x2d\x2dvg\x2droot.device: Transport endpoint is not connected
  [ 3.138512] systemd[1]: run-rpc_pipefs.mount: Failed to send unit change signal for run-rpc_pipefs.mount: Transport endpoint is not connected
  [ 3.142719] systemd[1]: Failed to send job remove signal for 134: Transport endpoint is not connected
  [ 3.142958] systemd[1]: auth-rpcgss-module.service: Failed to send unit change signal for auth-rpcgss-module.service: Transport endpoint is not connected
  [ 3.165359] systemd[1]: Failed to send job remove signal for 133: Transport endpoint is not connected
  [ 3.165505] systemd[1]: proc-fs-nfsd.mount: Failed to send unit change signal for proc-fs-nfsd.mount: Transport endpoint is not connected
  [ 3.165541] systemd[1]: dev-mapper-ubuntu\x2d\x2dvg\x2droot.device: Failed to send unit change signal for dev-mapper-ubuntu\x2d\x2dvg\x2droot.device: Transport endpoint is not connected
  [ 3.166854] systemd[1]: Failed to send job remove signal for 66: Transport endpoint is not connected
  [ 3.167072] systemd[1]: proc-fs-nfsd.mount: Failed to send unit change signal for proc-fs-nfsd.mount: Transport endpoint is not connected
  [ 3.167130] systemd[1]: systemd-modules-load.service: Failed to send unit change signal for systemd-modules-load.service: Transport endpoint is not connected
  [ 2.929018] systemd[1]: Failed to send job remove signal for 53: Transport endpoint is not connected
  [ 2.929220] systemd[1]: systemd-random-seed.service: Failed to send unit change signal for systemd-random-seed.service: Transport endpoint is not connected
  [ 3.024320] systemd[1]: sys-devices-platform-serial8250-tty-ttyS12.device: Failed to send unit change signal for sys-devices-platform-serial8250-tty-ttyS12.device: Transport endpoint is not connected
  [ 3.024421] systemd[1]: dev-ttyS12.device: Failed to send unit change signal for dev-ttyS12.device: Transport endpoint is not connected
  [ 3.547019] systemd[1]: proc-sys-fs-binfmt_misc.automount: Failed to send unit change signal for proc-sys-fs-binfmt_misc.automount: Connection reset by peer
  [ 3.547144] systemd[1]: Failed to send job change signal for 207: Transport endpoint is not connected

  How to reproduce:
  1. enable debug level journal LogLevel=debug in /etc/systemd/system.conf
  2. reboot the system
  3. journalctl | grep "Failed to send"

  sliu@vmlxhi-094:~$ lsb_release -rd
  Description: Ubuntu 16.04.4 LTS
  Release: 16.04

  sliu@vmlxhi-094:~$ systemctl --version
  systemd 229
  +PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ -LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN

  sliu@vmlxhi-094:~$ dbus-daemon --version
  D-Bus Message Bus Daemon 1.10.6
  Copyright (C) 2002, 2003 Red Hat, Inc., CodeFactory AB, and others
  This is free software; see the source for copying conditions.
  There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
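
As a rough aid for step 3 of the reproduction steps, the Python sketch below groups "Failed to send" journal lines by signal kind and error string, which may make it easier to compare boots or kernels. It assumes the message layout seen in the log excerpt in this report; the name summarize_failures and the regular expression are illustrative only and are not part of systemd or journalctl.

```python
import re
from collections import Counter

# Assumed layout, taken from the log excerpt in this report:
#   "[ 3.1] systemd[1]: <unit>: Failed to send <kind> signal for <target>: <error>"
# The "<unit>: " prefix is optional (job signals carry no unit prefix).
FAILED_RE = re.compile(
    r"systemd\[1\]: (?:\S+: )?Failed to send (?P<kind>[a-z ]+) signal for "
    r"(?P<target>\S+): (?P<error>.+)$"
)


def summarize_failures(lines):
    """Count (signal kind, error string) pairs across journal lines."""
    counts = Counter()
    for line in lines:
        match = FAILED_RE.search(line.strip())
        if match:
            counts[(match.group("kind"), match.group("error"))] += 1
    return counts


# Two lines copied from the excerpt, used here only as sample input.
sample = [
    "[ 3.137257] systemd[1]: Failed to send job remove signal for 109: "
    "Connection reset by peer",
    "[ 3.138119] systemd[1]: run-rpc_pipefs.mount: Failed to send unit change "
    "signal for run-rpc_pipefs.mount: Transport endpoint is not connected",
]

for (kind, error), n in summarize_failures(sample).most_common():
    print(n, kind, error)
```

To use it on a real system, the same function could be fed the lines of a saved `journalctl` output instead of the sample list.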