On Mon, Jan 8, 2018 at 10:04 AM, Nish Aravamudan <[email protected]> wrote:
> On Mon, Jan 8, 2018 at 9:51 AM, Nish Aravamudan
> <[email protected]> wrote:
>> On Mon, Jan 8, 2018 at 8:48 AM, Victor Tapia <[email protected]>
>> wrote:
>>> As mentioned by Mario @ #10, stopping corosync while pacemaker runs
>>> throws the same error as the upgrade. Syslog from Xenial +
>>> corosync=2.3.5-3ubuntu1:
>>>
>>> Jan 8 16:24:37 xenial-corosync systemd[1]: Stopping Pacemaker High
>>> Availability Cluster Manager...
>>> Jan 8 16:24:37 xenial-corosync pacemakerd[28747]: notice: Invoking
>>> handler for signal 15: Terminated
>>> Jan 8 16:24:37 xenial-corosync crmd[28753]: notice: Invoking handler for
>>> signal 15: Terminated
>>> Jan 8 16:24:37 xenial-corosync crmd[28753]: notice: State transition
>>> S_IDLE -> S_POLICY_ENGINE [ input=I_SHUTDOWN cause=C_SHUTDOWN
>>> origin=crm_shutdown ]
>>> Jan 8 16:24:37 xenial-corosync pengine[28752]: notice: Delaying fencing
>>> operations until there are resources to manage
>>> Jan 8 16:24:37 xenial-corosync pengine[28752]: notice: Scheduling Node
>>> xenial-corosync for shutdown
>>> Jan 8 16:24:37 xenial-corosync pengine[28752]: notice: Calculated
>>> Transition 1: /var/lib/pacemaker/pengine/pe-input-52.bz2
>>> Jan 8 16:24:37 xenial-corosync crmd[28753]: notice: Transition 1
>>> (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0,
>>> Source=/var/lib/pacemaker/pengine/pe-input-52.bz2): Complete
>>> Jan 8 16:24:37 xenial-corosync crmd[28753]: notice: Disconnecting from
>>> Corosync
>>> Jan 8 16:24:37 xenial-corosync cib[28748]: warning:
>>> new_event_notification (28748-28753-12): Broken pipe (32)
>>> Jan 8 16:24:37 xenial-corosync pengine[28752]: notice: Invoking handler
>>> for signal 15: Terminated
>>> Jan 8 16:24:37 xenial-corosync attrd[28751]: notice: Invoking handler
>>> for signal 15: Terminated
>>> Jan 8 16:24:37 xenial-corosync lrmd[28750]: notice: Invoking handler for
>>> signal 15: Terminated
>>> Jan 8 16:24:37 xenial-corosync stonith-ng[28749]: notice: Invoking
>>> handler for signal 15: Terminated
>>> Jan 8 16:24:37 xenial-corosync cib[28748]: notice: Invoking handler for
>>> signal 15: Terminated
>>> Jan 8 16:24:37 xenial-corosync cib[28748]: notice: Disconnecting from
>>> Corosync
>>> Jan 8 16:24:37 xenial-corosync cib[28748]: notice: Disconnecting from
>>> Corosync
>>> Jan 8 16:24:37 xenial-corosync systemd[1]: Stopped Pacemaker High
>>> Availability Cluster Manager.
>>>
>>> Pacemakerd shuts down sending SIGTERM to its components, but after the
>>> install, corosync does not start pacemaker. BTW, "systemctl restart
>>> corosync" restarts both services perfectly
>>>
>>> I think that the option A from James Page (#11) is the way to go
>>
>> I took a quick look at a LXD container after seeing Felipe and
>> Victor's posts. It seems like this is a bug in the xenial (at least)
>> systemd unit files:
>>
>> # grep pacemaker /lib/systemd/system/corosync.service
>> # pacemaker.service, and if you want to exert the watchdog when a
>>
>> # grep corosync /lib/systemd/system/pacemaker.service
>> After=corosync.service
>> Requires=corosync.service
>> # ExecStopPost=/bin/sh -c 'pidof crmd || killall -TERM corosync'
>>
>> So, what I see is that corosync.service has no dependency on
>> pacemaker.service (in the file).
>>
>> pacemaker.service will start after corosync.service. And when
>> pacemaker.service is shutdown it will be before corosync.service.
>> Additionally, if pacemaker.service is started, then corosync.service
>> is started as well.
>>
>> Note, nothing specifies what Felipe said -- there is no guarantee that
>> pacemaker is started, restarted, etc. when corosync is.
>>
>> I think the next step is to look at Bionic's systemd services
>> (probably newer) or upstream's and see if there is a difference, or
>> new dependencies added there.
>
> Or perhaps ask upstream what they think is providing this assurance in
> their systemd files, because I'm not seeing it.
>
> If we have a hard dependency between pacemaker and corosync, then I
> think we might need a PartOf directive, in order to ensure they are
> always following the state transitions together.
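
For illustration, the PartOf idea could be written as a drop-in for
pacemaker.service. This is only a sketch: the drop-in file name is made
up, and nothing in this thread establishes that it fixes the upgrade
case. Per systemd.unit(5), PartOf= propagates stop and restart from the
listed unit to this one, but it does not propagate start.

```ini
# /etc/systemd/system/pacemaker.service.d/partof.conf
# (hypothetical drop-in name, chosen for this example)
[Unit]
# When corosync.service is stopped or restarted, apply the same
# action to pacemaker.service. Start requests are NOT propagated.
PartOf=corosync.service
```

After adding such a file, "systemctl daemon-reload" is needed before
systemd picks it up.
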
Or if that is bad (because it does feel like a layering violation and
maybe it makes sense to have either pacemaker or corosync installed with
the other), then pacemaker.service should say:

WantedBy=corosync.service

That will ensure that when corosync.service starts, pacemaker.service
starts. The Requires line ensures that when corosync.service stops,
pacemaker stops (with the order specified by the After). I think :)

--
https://bugs.launchpad.net/bugs/1740892

Title:
  corosync upgrade on 2018-01-02 caused pacemaker to fail
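
A sketch of the WantedBy= suggestion from the reply above, written as a
fragment of pacemaker.service. Assumptions: the thread does not show the
unit's existing [Install] section, so anything already there is omitted;
the [Unit] lines are the ones from the grep output quoted earlier.
WantedBy= only takes effect when the unit is enabled.

```ini
# pacemaker.service (fragment, sketch only)
[Unit]
# Already present in the xenial unit, per the quoted grep output.
After=corosync.service
Requires=corosync.service

[Install]
# Applied by "systemctl enable pacemaker.service": creates a symlink
# under corosync.service.wants/, which gives corosync.service a
# Wants= dependency on this unit, so starting corosync pulls it in.
WantedBy=corosync.service
```

If the unit is already enabled, "systemctl reenable pacemaker.service"
re-creates the symlinks from the updated [Install] section.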
