Hi,

On 6/8/24 00:42, Simon McVittie wrote:

Having a UTF-8 locale available would be a good thing, but allowing
packages to rely on the active locale to be UTF-8 based reduces our testing
scope.

I'm not sure I follow. Are you suggesting that we should build each
package *n* times (in a UTF-8 locale, in a legacy locale, in locales
known to have unique quirks like Turkish and Japanese, ...), just for
its side-effect of *potentially* passing through those locales to the
upstream test suite?

To an extent, that is what reproducible-builds.org testing does. According to [1], they test LC_ALL unset vs LC_ALL set to one of et_EE.UTF-8, de_CH.UTF-8, nl_BE.UTF-8, or it_CH.UTF-8. They also test other locale variables.
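
As a rough illustration of what such a variation test amounts to, here is a minimal sketch. It is not the actual reproducible-builds.org tooling: the function names are made up for this mail, only LC_ALL is varied, and it compares the stdout of an arbitrary command rather than a built package.

```python
import hashlib
import os
import subprocess
import sys

# Locales listed on the reproducible-builds.org variations page [1].
VARIED_LOCALES = ["et_EE.UTF-8", "de_CH.UTF-8", "nl_BE.UTF-8", "it_CH.UTF-8"]


def variation_envs(base_env=None):
    """Yield (label, env) pairs: LC_ALL unset, then LC_ALL set to each locale."""
    if base_env is None:
        base_env = dict(os.environ)
    unset = {k: v for k, v in base_env.items() if k != "LC_ALL"}
    yield "unset", unset
    for loc in VARIED_LOCALES:
        yield loc, {**unset, "LC_ALL": loc}


def output_digest(cmd, env):
    """Run cmd under env and return the SHA-256 hex digest of its stdout."""
    result = subprocess.run(cmd, env=env, capture_output=True, check=True)
    return hashlib.sha256(result.stdout).hexdigest()


def differing_locales(cmd, base_env=None):
    """Return the labels whose output differs from the LC_ALL-unset run."""
    digests = {
        label: output_digest(cmd, env)
        for label, env in variation_envs(base_env)
    }
    reference = digests["unset"]
    return [label for label, digest in digests.items() if digest != reference]
```

Calling differing_locales() with a hypothetical build command returns the locales under which the output changed; an empty list means the output was identical in all five runs. A locale that is not installed on the machine typically falls back to "C" behaviour, so a real test needs the locales generated first.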

What I'm concerned about is not so much tests inside packages behaving differently depending on locale, because that is an upstream problem.

Reproducibility outside of sterile environments is, however, a problem for us as a distribution, because it affects how well people can contribute to packages they do not maintain directly. For me this ties into the salsa discussion as well: if my package is not required to work outside of a very controlled environment, that is also an impediment to co-maintenance.

What we say is a bug, and what we say is not a bug, is a policy decision
about our scope: we support some things and we do not support others.

Exactly, and a lot of the debates we've had in the past years have been about who gets to decide what is in scope, and what is "legacy" code that should be excised to reduce the workload of the people driving change forward.

What Gioele proposed at the beginning of this thread can be rephrased as
declaring that "FTBFS when locale is not C.UTF-8" and "non-reproducible
when locale is varied" are non-bugs, and therefore they are not only
wontfix, but they should be closed altogether as being out-of-scope.

Indeed -- however this class of bugs has already been solved because reproducible-builds.org have filed bugs wherever this happened, and maintainers have added workarounds where it was impossible to fix.

Turning this workaround into boilerplate code was already a mistake, so my answer to the complaint that the boilerplate has to be copied around, and should therefore be moved into the framework, is "do not copy boilerplate code."

For locales and other facets of the execution environment that are
similarly easy to clear/reset/sanitize/normalize, we don't necessarily
need to be saying "if you do a build in this situation, you are doing
it wrong", because we could equally well be saying "if you do a build in
this situation, the build toolchain will automatically fix it for you" -
much more friendly towards anyone who is building packages interactively,
which seems to be the use-case that you're primarily interested in.

No, automatic fixes are absolutely not friendly -- these *add* to my mental load because I need to be aware of them if I want to understand what is happening. This is already annoying enough for LC_ALL=C.UTF-8 inside a package, but at least that usually happens in prominent enough places that I can find it, and it is part of the package.

Magic code inside the framework that performs an action automatically for me is extra debugging effort, because I need to either know the exact rules that the framework applies, or I need to debug into the framework.

   Simon

[1] https://tests.reproducible-builds.org/debian/index_variations.html
