Control: severity -1 wishlist
Control: retitle -1 /var/lib/dbus/machine-id breaks reproducible builds of OS images

Retitling because this does not affect reproducibility *of packages* (as recommended in Policy §4.15), only reproducibility of whole systems (chroot/container/image).

On Sun, 12 Sep 2021 at 03:04:27 +1000, Trent W. Buck wrote:
> I am not sure how to fix this.
> The references to machine-id in the dbus sources confused me.
> It seems like sometimes it's a link, sometimes it's a symlink, sometimes it's
> a copy.

The problem here is that traditionally, merely installing the dbus package - without requiring a reboot or a specific init system - has been sufficient to get a fully-working D-Bus installation. One of the properties provided by a fully-working D-Bus installation is that there is a machine ID (in particular, dbus-launch(1) in the dbus-x11 package will not work otherwise, but in general, the authors of dbus consider a missing machine ID to be an incorrect and unsupported installation).

The system bus starts as uid 0 and is able to set up a machine ID for itself, but the session bus and arbitrary user-defined buses (such as the one used for AT-SPI) are unprivileged and cannot generate a machine ID, so they have to rely on something "larger" (like systemd, or /etc/init.d/dbus, or dbus.postinst) to do that setup.

If I was designing a message-bus system today, I wouldn't include a machine ID in it; but I don't get to choose the API guarantees that I've inherited from the original designers of D-Bus, and backwards compatibility is important to me.
I can see why the machine ID was included, because it's there as a machine-oriented replacement for the hostname, which has two properties that make it undesirable: it's non-unique (lots of machines think their name is "debian" or "ubuntu" or "localhost"), breaking the desirable property that same hostname implies same machine; and it's human-meaningful, which means sysadmins sometimes want to change it for cosmetic or administrative reasons, breaking the desirable property that different hostname means different machine.

Back when D-Bus was designed, NFS-shared home directories and remote X11 were considered to be essential-to-support, such that D-Bus would not have been adopted if it did not cope with those; but that means it needs a reliable way to identify machines among the multiple that can be sharing a home directory or an X11 display (and no, the hostname is not enough, for the reasons I mentioned above). Those use-cases are a lot less important now, and could perhaps even be considered to be deprecated, but the feature that was necessary to support them remains.

/etc/machine-id is a generalization of the D-Bus machine ID, originating in systemd. There would be nothing to stop non-systemd machines from implementing it, but there is a tendency for people who dislike systemd to reject anything that came from systemd and work against its wider adoption unless there is absolutely no alternative, so it is not considered mandatory for Debian systems in general.

As a result, /etc/machine-id is not guaranteed to exist unless/until the system has been booted successfully with systemd. If the system is to be used as a "plain" chroot, or a container that will be run without a full init system (as is conventional with Docker), or a machine that will boot with sysvinit or some other non-systemd init system, then there will usually be no /etc/machine-id.
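To see which of these situations a particular tree is in, something like the following could be used. This is only a rough sketch: the function name dbus_machine_id_state and the keywords it prints are made up for illustration and are not part of dbus or systemd.

```shell
#!/bin/sh
# Hypothetical helper (not part of dbus or systemd): report how the D-Bus
# machine ID is set up in a tree. $1 is the root of the tree to inspect;
# if omitted, the running system is inspected.
dbus_machine_id_state() {
    root="${1:-}"
    root="${root%/}"    # drop one trailing slash so "/" becomes ""
    dbus_id="$root/var/lib/dbus/machine-id"
    etc_id="$root/etc/machine-id"

    if [ -L "$dbus_id" ]; then
        # Checked first, because a dangling symlink fails the -e test below.
        echo symlink
    elif [ ! -e "$dbus_id" ]; then
        echo missing
    elif [ -f "$etc_id" ] && cmp -s "$dbus_id" "$etc_id"; then
        # A regular file with the same content as /etc/machine-id.
        echo copy
    else
        # A regular file that does not match (or has no) /etc/machine-id.
        echo independent
    fi
}
```

For example, `dbus_machine_id_state /srv/chroot/sid` (a hypothetical path) would print "missing" for a tree where neither dbus.postinst nor an init system has created the file yet.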
*If* the system is always going to be booted with systemd (and in particular for systemd-based live-images), then it's safe for /var/lib/dbus/machine-id to be deleted or replaced with a symlink to /etc/machine-id; but the dbus package's postinst cannot know whether this is the case. Even if systemd-sysv happens to be installed already, that's no guarantee that the system will not be used as a chroot with no real init system, in which case systemd will be present but dormant, and nothing will create /etc/machine-id.

If /var/lib/dbus/machine-id is deleted, on systems that boot with systemd, the tmpfiles snippet /usr/lib/tmpfiles.d/dbus.conf will replace it with a symlink to /etc/machine-id; or on systems that boot with sysvinit, a call to dbus-uuidgen in /etc/init.d/dbus will regenerate it. However, this will not generally happen on non-systemd machines.

When /var/lib/dbus/machine-id is generated by dbus-uuidgen in the dbus postinst or in /etc/init.d/dbus, if /etc/machine-id exists, dbus-uuidgen will copy it. In this case it is a copy, not a symlink, because dbus cannot guarantee that the file /etc/machine-id (which is not conceptually "owned" by dbus) will not get deleted out from under us.

dbus could in principle create /etc/machine-id instead of /var/lib/dbus/machine-id, and make /var/lib/dbus/machine-id a symlink to it, but, again, dbus does not conceptually "own" /etc/machine-id, so this would create a risk that /etc/machine-id will be deleted by some other component, breaking the guarantees that dbus aims to provide.

If you want mmdebstrap to generate reproducible images, then I think the best analogue to "echo uninitialized > /etc/machine-id" would be to delete /var/lib/dbus/machine-id, allowing it to be re-created during next boot. However, if there is no such thing as the "next boot" because the tree being bootstrapped is a chroot or a non-init-system container, that will result in an incomplete and partially non-functional D-Bus installation.
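As a minimal sketch of that suggestion, assuming the image is always going to be booted with systemd: the function name scrub_machine_ids is hypothetical, and the tree's /etc is assumed to exist already.

```shell
#!/bin/sh
# Hypothetical image-build step (not part of dbus or mmdebstrap).
# $1 is the root directory of the freshly bootstrapped tree.
# Assumption: the resulting image will always be booted with systemd;
# in a plain chroot or init-less container nothing will re-create the ID.
scrub_machine_ids() {
    root="${1:?usage: scrub_machine_ids ROOT}"

    # Drop the ID written by dbus.postinst; on a systemd boot the tmpfiles
    # snippet replaces it with a symlink to /etc/machine-id.
    rm -f "$root/var/lib/dbus/machine-id"

    # The existing step quoted above: systemd replaces this placeholder
    # with a real machine ID during early boot.
    echo uninitialized > "$root/etc/machine-id"
}
```

This could presumably be run from an mmdebstrap --customize-hook, or from any other step that has access to the unpacked tree before it is turned into an image.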
Knowing whether this is an acceptable tradeoff requires more context than either dbus.postinst or mmdebstrap has available to them.

I think the best solution to this might be to make /etc/machine-id part of the "specification" for what makes a Debian system, as an init-system-independent "API", similar to how we handle /run and /usr/lib/os-release, but I suspect that people who dislike systemd would oppose that as a point of principle, and I have higher priorities for how I want to spend the necessary emotional energy to make contentious things happen in Debian.

> AFAICT there's no mention of dbus's machine-id supporting "uninitialized".

That's because it doesn't. The special keyword "uninitialized" is a recently-added feature of systemd's handling of /etc/machine-id, which is newer than the D-Bus machine ID.

*If* the system is booted with systemd, then /etc/machine-id will be replaced with a real machine ID during early boot, before the system has booted up far enough for D-Bus components to be running, so "uninitialized" will never be observable by D-Bus components. However, if the system is booted with a different init system, or if it is a chroot/container that is never "booted" at all, then that will not happen.

    smcv