Hi Colin,

thank you for your quick reply! :)

Quoting Colin Watson (2022-01-30 15:03:30)
> On Sun, Jan 30, 2022 at 02:27:05PM +0100, Johannes Schauer Marin Rodrigues wrote:
> > currently, the index.db files created by man-db -c are unreproducible
> > when creating a Debian chroot. This means that tools that attempt to
> > create reproducible system images delete all index.db files:
> > 
> > https://gitlab.tails.boum.org/tails/tails/-/blob/stable/config/chroot_local-hooks/99-zzzzzz_reproducible-builds-post-processing#L28
> > https://salsa.debian.org/live-team/live-build/-/blob/master/share/hooks/normal/0190-remove-temporary-files.hook.chroot#L6
> > 
> > This could be avoided if the index.db files after installation would be
> > bit-by-bit reproducible. The attached patch fixes the problem by
> > truncating the timestamp set in index.db to the value of
> > SOURCE_DATE_EPOCH if the variable is set.
> > 
> > This means that this patch does not change anything during normal
> > operation but only comes into play if a utility that sets
> > SOURCE_DATE_EPOCH is installing packages.
> 
> I'm a bit confused, because this seems to work at the wrong layer.
> Debian packages are supposed to preserve timestamps from the source
> package wherever possible, and failing that it would be possible to
> ensure that the timestamp of generated manual pages in binary packages
> is set to SOURCE_DATE_EPOCH.  Flattening timestamps to an epoch at mandb
> time seems like the wrong place for this at first inspection, and I'd
> like some clearer rationale for why you ended up with this approach.
> 
> I would suggest instead ensuring that mtimes of manual pages are
> reproducible, after which mandb should produce reproducible databases
> (and if it doesn't I'd consider that a bug).
> 
> Deliberately setting database timestamps that don't match the filesystem
> will confuse mandb into doing unnecessary work in later runs, so I don't like
> this approach.

My reasoning was that tools that care about a reproducible index.db will
"flatten" the mtimes to SOURCE_DATE_EPOCH in the tarball or image they produce,
so setting the timestamp in index.db to SOURCE_DATE_EPOCH for those timestamps
larger than SOURCE_DATE_EPOCH seemed like the approach that would result in a
consistent overall state.
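
To make the idea concrete, here is a minimal Python sketch of the clamping rule
I have in mind. It is not the attached patch (man-db is written in C), and the
function name is made up for illustration:

```python
import os


def clamp_to_source_date_epoch(mtime, environ=os.environ):
    """Return mtime, lowered to SOURCE_DATE_EPOCH if that is set and smaller.

    Hypothetical helper for illustration only; not code from man-db.
    """
    raw = environ.get("SOURCE_DATE_EPOCH")
    if raw is None:
        # Normal operation: the variable is unset, so nothing changes.
        return mtime
    # Only timestamps larger than SOURCE_DATE_EPOCH get replaced.
    return min(mtime, int(raw))
```

Timestamps that are already older than SOURCE_DATE_EPOCH pass through
unchanged, which is the same rule the image-building tools apply to file mtimes.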

But if that's the wrong approach, let's think about the alternative: making sure
that the mtimes of manual pages are reproducible. If I use gdbm_dump on the
index.db of two different chroots, then it looks like the following manual pages
have differing timestamps:

bash-builtins, which, dash, mawk, pager, awk, sh, more, nawk, builtins
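
The comparison behind that list boils down to something like the following
Python sketch. It assumes the two dumps have already been parsed into
dictionaries mapping a page name to its record; the parsing of the gdbm_dump
output is not shown, and the function name is hypothetical:

```python
def differing_keys(db_a, db_b):
    """Return the sorted names whose records differ between two databases.

    db_a and db_b are assumed to be dicts of page name -> record, obtained
    by some parsing step that is not part of this sketch.
    """
    all_keys = db_a.keys() | db_b.keys()
    # A key missing on one side counts as a difference as well.
    return sorted(k for k in all_keys if db_a.get(k) != db_b.get(k))
```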

Most of those seem to be symlinks into /etc/alternatives and those symlinks get
created by maintainer scripts using update-alternatives. Are you suggesting
that update-alternatives should gain support for setting the mtime of the files
it creates to SOURCE_DATE_EPOCH?
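
Just to illustrate what I mean by setting the mtime of a symlink itself rather
than of its target, here is a small Python sketch. It is not how
update-alternatives is implemented, the function name is made up, and it
assumes a platform such as Linux where os.utime supports follow_symlinks=False:

```python
import os


def set_symlink_mtime(path, epoch):
    """Set atime and mtime of the symlink at path itself, not of its target.

    Hypothetical helper for illustration; assumes the platform supports
    changing symlink timestamps (as Linux does).
    """
    os.utime(path, (epoch, epoch), follow_symlinks=False)
```

Afterwards, os.lstat(path).st_mtime reports the chosen value even if the
symlink target does not exist.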

I'm puzzled by bash-builtins though because that one is not a symlink. So I
don't understand why the timestamp differs there.

Thanks!

cheers, josch
