Hi Colin, thank you for your quick reply! :)

Quoting Colin Watson (2022-01-30 15:03:30)
> On Sun, Jan 30, 2022 at 02:27:05PM +0100, Johannes Schauer Marin Rodrigues
> wrote:
> > currently, the index.db files created by man-db -c are unreproducible
> > when creating a Debian chroot. This means that tools that attempt to
> > create reproducible system images delete all index.db files:
> >
> > https://gitlab.tails.boum.org/tails/tails/-/blob/stable/config/chroot_local-hooks/99-zzzzzz_reproducible-builds-post-processing#L28
> > https://salsa.debian.org/live-team/live-build/-/blob/master/share/hooks/normal/0190-remove-temporary-files.hook.chroot#L6
> >
> > This could be avoided if the index.db files after installation would be
> > bit-by-bit reproducible. The attached patch fixes the problem by
> > truncating the timestamp set in index.db to the value of
> > SOURCE_DATE_EPOCH if the variable is set.
> >
> > This means that this patch does not change anything during normal
> > operation but only comes into play if a utility that sets
> > SOURCE_DATE_EPOCH is installing packages.
>
> I'm a bit confused, because this seems to work at the wrong layer.
> Debian packages are supposed to preserve timestamps from the source
> package wherever possible, and failing that it would be possible to
> ensure that the timestamp of generated manual pages in binary packages
> is set to SOURCE_DATE_EPOCH. Flattening timestamps to an epoch at mandb
> time seems like the wrong place for this at first inspection, and I'd
> like some clearer rationale for why you ended up with this approach.
>
> I would suggest instead ensuring that mtimes of manual pages are
> reproducible, after which mandb should produce reproducible databases
> (and if it doesn't I'd consider that a bug).
>
> Deliberately setting database timestamps that don't match the filesystem
> will confuse mandb into doing unnecessary work in later runs, so I don't like
> this approach.
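
To make the behaviour of the patch concrete, here is a minimal Python sketch of the clamping it performs. This is not the patch itself (man-db is written in C); `clamp_mtime` is a name made up for illustration, and I'm assuming SOURCE_DATE_EPOCH holds a plain integer number of seconds:

```python
import os


def clamp_mtime(mtime, environ=os.environ):
    """Cap mtime at SOURCE_DATE_EPOCH if that variable is set.

    Hypothetical helper for illustration only; it mirrors the idea of
    the patch, not its actual code.
    """
    raw = environ.get("SOURCE_DATE_EPOCH")
    if raw is None:
        # Normal operation: the timestamp is stored unchanged.
        return mtime
    # Only timestamps newer than SOURCE_DATE_EPOCH are lowered;
    # older ones are kept as they are.
    return min(mtime, int(raw))
```

Without the variable set, `clamp_mtime(t)` returns `t` unchanged, which is why the patch has no effect during normal operation.
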

My reasoning was that tools that care about a reproducible index.db will "flatten" the mtimes to SOURCE_DATE_EPOCH in the tarball or image they produce, so setting the timestamps in index.db that are larger than SOURCE_DATE_EPOCH to SOURCE_DATE_EPOCH seemed like the approach that would result in a consistent overall state.

But if that's the wrong approach, let's think about the alternative: making sure that the mtimes of manual pages are reproducible. If I use gdbm_dump on the index.db of two different chroots, the following manual pages have differing timestamps:

    bash-builtins, which, dash, mawk, pager, awk, sh, more, nawk, builtins

Most of those seem to be symlinks into /etc/alternatives, and those symlinks get created by maintainer scripts using update-alternatives. Are you suggesting that update-alternatives should gain support for setting the mtime of the files it creates to SOURCE_DATE_EPOCH?

I'm puzzled by bash-builtins, though, because that one is not a symlink, so I don't understand why the timestamp differs there.

Thanks!

cheers, josch