Re: [PROPOSAL] time zones and reproducible builds (was: How to make groff use local timezone?)

Colin Watson Wed, 23 Dec 2020 03:59:02 -0800

On Tue, Dec 22, 2020 at 01:29:44PM +0000, Deri wrote:
> Please can someone explain why reproducible builds are important.


Hi,

It's probably best to simply refer you to
https://reproducible-builds.org/ for general motivation, and then you
can come back with any questions you have.

> What is the output of groff we should be testing.

Without loss of generality: the files that end up being distributed in
packaged form to users.

> Since these are essentially source files which are intended to be run
> at some point, diffing them just tells us that there has been a change
> in grops or gropdf, not whether that output, when run, has changed.

It's not a problem if a change in grops or gropdf (or whatever) induces
a change in the output: this is to be expected for almost any software.
The point is rather that you should be able to install the same versions
of the various bits of software involved in the build toolchain,
construct a suitably-documented build environment, and get bit-for-bit
identical output.  Whether the software involved is groff or TeX or gcc
or an artisanally-crafted pile of Python or whatever is immaterial: if
you can reproduce a build and produce bit-for-bit identical output, then
that helps to assure you that the build infrastructure that produced the
binary packages you're using is sound.

For example, if somebody has replaced gropdf on some bit of build
infrastructure with gropdf-but-insert-evil-attack, then that can be
noticed quite easily if gropdf would ordinarily produce bit-for-bit
identical results across multiple runs.  But if gropdf inserts extra
information from its environment into the output, then the problem
becomes more difficult: now you have to work out how to filter out the
"expected" differences, and that problem is compounded if what you're
looking at isn't a pair of PDF files but rather a pair of .debs or RPMs
or MSI files or whatever that contain some PDFs somewhere inside them.
Bear in mind that this is the sort of problem that people want to tackle
in bulk at the scale of a whole software distribution, not at the level
of comparing individual rendered PDF files by hand.
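
To make the bulk comparison concrete, here is a minimal Python sketch.
It is only an illustration, not what any distribution's tooling actually
does: it hashes every file under two build-output directories and lists
the paths that are missing from one side or whose contents differ.  The
function names and the idea of comparing two unpacked trees are
assumptions made for the example.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to the SHA-256 of its bytes."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def differing_paths(build_a, build_b):
    """Return sorted relative paths that are missing from one tree
    or whose contents are not bit-for-bit identical."""
    a = tree_digests(build_a)
    b = tree_digests(build_b)
    return sorted(path for path in a.keys() | b.keys() if a.get(path) != b.get(path))
```

If every tool in the chain is deterministic, differing_paths() returns
an empty list for two honest builds, so any non-empty result is worth
investigating.  Once a tool embeds something like the local time, the
list is never empty and you are back to filtering "expected" noise.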

Now, there is absolutely room for debate and compromise on exactly what
sorts of environmental constraints one needs to apply when reproducing a
build, hence things like
https://reproducible-builds.org/docs/source-date-epoch/ and working out
to what extent timezones should be taken into consideration.  As I
mentioned, I'm certainly open to the possibility that when I patched
Debian's groff to force TZ=UTC (in some sense), I did so at the wrong
layer.  But I hope this explains why at least the principle is
important.
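
As a rough sketch of the kind of constraint being discussed, and not a
description of how groff or the Debian patch actually behaves, a tool
that stamps a date into its output could honour SOURCE_DATE_EPOCH
(seconds since the Unix epoch, per the page linked above) and render it
in UTC so that the builder's timezone cannot leak in.  The function
names and the output format here are hypothetical.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp(environ=os.environ):
    """Return the time to embed in generated output, as an aware UTC datetime.

    Uses SOURCE_DATE_EPOCH when it is set, otherwise the current time.
    """
    epoch = environ.get("SOURCE_DATE_EPOCH")
    seconds = int(epoch) if epoch else int(time.time())
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def format_creation_date(environ=os.environ):
    """Format the build timestamp; the format string is an arbitrary choice."""
    return build_timestamp(environ).strftime("%Y-%m-%d %H:%M:%S UTC")
```

With SOURCE_DATE_EPOCH=0 this yields "1970-01-01 00:00:00 UTC" whatever
TZ is set to on the build machine.  Whether forcing UTC belongs in the
tool, the packaging, or the build environment is exactly the layering
question above.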

-- 
Colin Watson (he/him)                              [cjwat...@debian.org]
