The verification of the Stable Release Update for systemd has completed
successfully and the package is now being released to -updates.
Subsequently, the Ubuntu Stable Release Updates Team is being
unsubscribed and will not receive messages about this bug report.  In
the event that you encounter a regression using the package from
-updates please report a new bug using ubuntu-bug and tag the bug report
regression-update so we can easily find any regressions.

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/2003250

Title:
  networkctl reload with bond devices causes slaves to go DOWN and UP,
  causing couple of seconds of network loss

Status in systemd package in Ubuntu:
  Fix Released
Status in systemd source package in Jammy:
  Fix Released
Status in systemd source package in Kinetic:
  Won't Fix

Bug description:
  [SRU TEMPLATE]

  [DESCRIPTION]

  We currently use Ubuntu 22.04.1 LTS, including updates, for our
  production cloud (switched from legacy CentOS 7).
  Although we like the distribution, we recently hit seriously buggy
  systemd behavior, described in bug report [1], using packages [2].

  Unfortunately, the clouds we are running consist of OpenStack on top
  of Kubernetes, and we need a complex network configuration
  including Linux bond devices.

  Our observation is that every time we apply our configuration via
  CI/CD infrastructure using Ansible and netplan (regardless of whether
  there is an actual network configuration change), we see approximately
  8-16 seconds of network interruption and see the bond interfaces going
  DOWN and then UP.

  We expect bond interfaces to stay UP when there is no network
  configuration change.

  We went through a couple of options for solving the issue, and the
  first one is to add the existing patch [3] to the current
  systemd-249.11-0ubuntu3.6.

  Could you comment on whether this kind of non-security patch is likely
  to land in 22.04.1 LTS soon?
  We are able to help bring the patch into the systemd package the
  community way if you suggest the steps.

  [TESTING]

  On a Jammy system, create a bond interface with two subordinate
  devices. Assuming the interfaces ens3 and ens9 exist on the system,
  this can be done using the following:

  $ cat > /etc/netplan/bond.yaml << EOF
  network:
    version: 2
    renderer: networkd
    ethernets:
      ens3:
        dhcp4: no
      ens9:
        dhcp4: no
    bonds:
      bond0:
        dhcp4: yes
        interfaces:
          - ens3
          - ens9
        parameters:
          mode: active-backup
          primary: ens3
  EOF

  $ netplan generate && netplan apply
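
  Before running the tests, it can help to confirm that both interfaces
  were actually enslaved to bond0. The following is a minimal sketch, not
  part of the original test plan. It assumes the kernel bonding driver's
  status file (/proc/net/bonding/bond0), and the function name
  bond_member_status is hypothetical.

```shell
#!/bin/sh
# Print "<interface> <MII status>" for every member of a bond.
# $1 is a file in the format of /proc/net/bonding/<bond> (assumed input).
bond_member_status() {
    awk -F': ' '
        # Remember the member whose section we are entering.
        /^Slave Interface:/ { iface = $2 }
        # The first "MII Status" after a member header belongs to that
        # member; the bond-level "MII Status" line is skipped because no
        # member has been seen yet.
        /^MII Status:/ && iface != "" { print iface, $2; iface = "" }
    ' "$1"
}

# Example (on the test system, not executed here):
#   bond_member_status /proc/net/bonding/bond0
```

  With the netplan configuration above, one line each for ens3 and ens9
  would be expected.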

  From here, there are two tests that can be used to verify the fix.

  1. Update the modification time of the generated network files, and
  call networkctl reload. From networkctl(1), when "reload" is called:

  [...] If a new, modified or removed .network file is found, then all
  interfaces which match the file are reconfigured.

  Hence, the following will trigger the desired code path:

  $ touch /run/systemd/network/*
  $ networkctl reload

  Without the fix, the logs show the interfaces of the bond going down
  and up again. With the fix, this should not happen.

  $ journalctl -b -u systemd-networkd.service --grep="Link DOWN"

  Finally, check that everything is back in the configured state:

  $ networkctl status
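
  Because journalctl -b shows everything since boot, the output may also
  include "Link DOWN" entries from the initial netplan apply. The
  following is a hedged sketch of one way to restrict the check to the
  reload itself. The helper names count_link_down and no_link_flap are
  hypothetical, and the 20-second wait is an assumption chosen to exceed
  the 8-16 second interruption reported above.

```shell
#!/bin/sh
# Count "Link DOWN" lines in networkd log text read from stdin.
# grep -c exits non-zero when nothing matches, so "|| true" keeps the
# function usable under "set -e"; the count "0" is still printed.
count_link_down() {
    grep -c 'Link DOWN' || true
}

# Succeed only if the log text on stdin contains no "Link DOWN" line.
no_link_flap() {
    [ "$(count_link_down)" -eq 0 ]
}

# Example (on the test system, not executed here):
#   start=$(date '+%Y-%m-%d %H:%M:%S')
#   touch /run/systemd/network/*
#   networkctl reload
#   sleep 20
#   journalctl -u systemd-networkd.service --since "$start" --no-pager \
#       | no_link_flap && echo "no link flap after reload"
```

  The same helpers could be applied after the networkctl reconfigure
  calls in the second test.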

  2. This bug can also be triggered by calling networkctl reconfigure
  directly.

  $ networkctl reconfigure ens3
  $ networkctl reconfigure ens9

  Check the logs to confirm that the links were not brought down:

  $ journalctl -b -u systemd-networkd.service --grep="Link DOWN"

  Finally, check that everything is back in the configured state:

  $ networkctl status

  [REGRESSION POTENTIAL]

  This patch is confined to the SET_LINK_MASTER logic for configuring
  links in systemd-networkd. While bond interfaces are the motivation
  for the fix, this early return applies to all interface types for
  which SET_LINK_MASTER is supported, e.g. bridge interfaces as well.

  This logic has seen exercise in newer releases of systemd and Ubuntu
  without further modification, so I would not expect to see regressions
  for other interface types. Furthermore, the bond type is the only type
  where the link is set to down in order to configure the master
  interface index, so this call was already effectively a no-op for
  those other interface types.

  If any problems did occur, they would be related to (re-)configuring
  link types which have a master interface set.

  [OTHER]

  This fix requires two upstream patches:

  https://github.com/systemd/systemd/commit/9f913d37a0
  https://github.com/systemd/systemd/commit/c3e12de0a6

  The second is a follow-up to the first, to complete the fix.

  These patches do not apply cleanly to v249, so some trivial conflicts
  were resolved to make the patches apply. In addition, some logic is
  added to the patches so that the link state is correctly set when this
  new branch is hit.

  Specifically, we decrement the set_link_messages counter, and call
  link_check_ready() before returning -EALREADY. This is necessary
  because this area of systemd-networkd has seen a lot of refactoring
  between v249 and the version of systemd where these patches
  originate. So, while in newer versions of systemd the message counter
  is handled correctly and link_check_ready() is eventually called
  despite cancelling the SET_LINK_MASTER request, this never happens
  when these patches are applied to v249. Hence, we add the necessary
  steps to the patch.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/2003250/+subscriptions


-- 
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp
