@hopem thanks for your nice reply and the complete overview of the situation.
I do understand the issue with exception handling and propagation between privsep and the reader. Since one cannot catch every exception or erroneous condition a system might reach, a major improvement would be to consider ways to reconcile in this and other situations:

1) If the setup of any of the various components (veth interfaces, routes, iptables, ...) fails, switch away from being the keepalived master, giving any other node the chance to actually take over.
2) If a node is the master but things failed, retry to set things up once more.

To avoid excessive retries, an exponential back-off certainly needs to be applied, but the state of a node being the HA router master while not being ready to service traffic must not remain.

-- 
https://bugs.launchpad.net/bugs/1927868
Title: vRouter not working after update to 16.3.1
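
To make the reconciliation idea above a bit more concrete, here is a minimal sketch in Python. It is not taken from the Neutron code base: `setup` and `demote` are hypothetical callables standing in for "configure veth interfaces, routes, iptables" and "give up the keepalived master role", and combining the two points as "retry with back-off first, demote only if that still fails" is my assumption about how they could fit together.

```python
import time
from typing import Callable, Iterator


def backoff_delays(base: float, factor: float, cap: float,
                   attempts: int) -> Iterator[float]:
    """Yield one delay per retry: base, base*factor, ..., never above cap."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= factor


def reconcile_master(
    setup: Callable[[], None],    # hypothetical: sets up veth, routes, iptables
    demote: Callable[[], None],   # hypothetical: leaves the keepalived master role
    retries: int = 4,
    base: float = 1.0,
    factor: float = 2.0,
    cap: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True if setup succeeded, False if the node was demoted."""
    try:
        setup()                   # first attempt, no delay
        return True
    except Exception:
        pass                      # fall through to the retry loop

    for delay in backoff_delays(base, factor, cap, retries):
        sleep(delay)              # wait longer before each further attempt
        try:
            setup()
            return True
        except Exception:
            continue

    # Still not able to service traffic: do not stay master.
    demote()
    return False
```

The `sleep` parameter is injectable only so the behaviour can be exercised without waiting. The point of the sketch is the final step: once the retries are exhausted, the node hands over the master role instead of staying master while unable to service traffic.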