From cloud-init's point of view, the solution now implemented makes sense: run it before apt-daily-upgrade. However, I wanted to add that there are other use cases as well, such as SSM documents being executed on instances. These can be executed in batch at any time and may also need to install packages, so they can collide with these unattended upgrades.
The execution of documents is not linked directly to cloud-init and may happen after the instance has booted, so this falls into the other category: having some kind of queuing system, or at least a centralized way to obtain a lock before using apt. At the moment there are dozens of different ways to get a mutex before executing apt, but somehow we couldn't find a bulletproof way that works *every time*. So maybe this does not really fit into this ticket, but I wanted to point out that this is only a partial fix to a bigger problem.

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to apt in Ubuntu.
https://bugs.launchpad.net/bugs/1693361

Title:
  cloud-init sometimes fails on dpkg lock due to concurrent apt-daily.service execution

Status in APT:
  Fix Released
Status in cloud-init:
  Fix Released
Status in apt package in Ubuntu:
  Invalid
Status in cloud-init package in Ubuntu:
  Fix Released
Status in cloud-init source package in Xenial:
  Fix Released
Status in cloud-init source package in Yakkety:
  Won't Fix
Status in cloud-init source package in Zesty:
  Fix Released
Status in cloud-init source package in Artful:
  Fix Released

Bug description:
  === Begin SRU Template ===
  [Impact]
  A cloud-config that contains packages to install (see below) or 'package_upgrade' will run 'apt-get update'. That can sometimes fail as a result of contention with the apt-daily.service that updates that information.

  A cloud-config showing the problem is just:
    $ cat my.yaml
    #cloud-config
    packages: ['hello']

  [Test Case]
  lxc-proposed-snapshot is
    https://git.launchpad.net/~smoser/cloud-init/+git/sru-info/tree/bin/lxc-proposed-snapshot
  It publishes an image to lxd with proposed enabled and cloud-init upgraded.

  a.) launch an instance with the proposed version of cloud-init and some user-data. This is platform independent. The test case demonstrates lxd.
    $ printf "%s\n%s\n%s\n" "#cloud-config" "packages: ['hello']" \
        "package_upgrade: true" > config.yaml
    $ release=xenial
    $ ref=proposed-$release
    $ ./lxc-proposed-snapshot --proposed --publish $release $ref;

  b.) start the instance
    $ name=$release-1693361
    $ lxc launch my-xenial "--config=user.user-data=$(cat config.yaml)"
    $ sleep 1
    $ lxc exec $name -- tail -f /var/log/cloud-init.log /var/log/cloud-init-output.log
    # watch this boot.

  c.) Look for evidence of systemd failure
    journalctl -o short-precise | grep -i break
    journalctl -o short-precise | grep -i order

  [Regression Potential]
  Regression chance here is low. It's possible that ordering loops could occur. When that does happen, journalctl will mention it. Unfortunately, in such cases systemd somewhat randomly picks a service to kill, so behavior is somewhat undefined.

  [Other Info]
  Upstream commit at
    https://git.launchpad.net/cloud-init/commit/?id=11121fe4

  === End SRU Template ===

  apt-daily is now a systemd service rather than being invoked by cron.daily. If one builds a custom AMI, it is possible that the apt-daily.timer will fire during boot. This can happen at the same time cloud-init is running, and if cloud-init loses the race, the invocation of apt (e.g. use of "packages:" in the config) will fail.

  There is a lot of discussion online about this change to apt-daily (e.g. unattended upgrades happening during business hours, delaying boot, etc.) and discussion of potential systemd changes regarding timers firing during boot (cf. https://github.com/systemd/systemd/issues/5659).

  While it would be better to solve this in apt itself, I suggest that cloud-init be defensive when calling apt and implement some retry mechanism.
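  To make the retry suggestion concrete, here is a rough sketch in Python (the language cloud-init is written in). It is not cloud-init's actual code: the helper name run_with_retry, its parameters and the linear backoff are all assumptions made for illustration. It also retries on any non-zero exit status rather than trying to detect lock contention specifically.

```python
import subprocess
import time


def run_with_retry(cmd, attempts=5, delay=2.0,
                   run=subprocess.run, sleep=time.sleep):
    """Run cmd, retrying on a non-zero exit status.

    Hypothetical helper, not part of cloud-init. `run` and `sleep` are
    injectable so the retry logic can be exercised without calling apt.
    Waits delay * attempt_number seconds between tries (linear backoff).
    """
    result = None
    for attempt in range(1, attempts + 1):
        result = run(cmd, stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE, text=True)
        if result.returncode == 0:
            return result
        if attempt < attempts:
            # Another process (e.g. apt-daily) may be holding the lock.
            sleep(delay * attempt)
    raise RuntimeError(
        "%r failed after %d attempts (last exit status %d): %s"
        % (cmd, attempts, result.returncode, result.stderr)
    )
```

  A caller would use it as run_with_retry(["apt-get", "update"]). This only narrows the race window; it is no substitute for proper ordering against apt-daily or a shared lock.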
  Various instances of people running into this issue:
    https://github.com/chef/bento/issues/609
    https://clusterhq.atlassian.net/browse/FLOC-4486
    https://github.com/boxcutter/ubuntu/issues/73
    https://unix.stackexchange.com/questions/315502/how-to-disable-apt-daily-service-on-ubuntu-cloud-vm-image