Package: dpkg-dev
Version: 1.18.12

According to POSIX, the meaning of glob patterns is unspecified in
locales other than `the POSIX locale'. [1]  It's not easy to see from
the spec, but the relevant env var is LC_COLLATE.

Many package build rules, dh rules, etc., rely on shell globbing.
This shell globbing needs to be predictable.

The output of a package build ought not to depend on the locale at
all, really.  (This is one of the things that the reproducible builds
people are trying to ensure.)  But we don't want to set LC_MESSAGES,
at least, because we want people to be able to debug builds in their
native language, as far as possible.

It is difficult to imagine a situation where honouring a user's
LC_COLLATE during a package build would be beneficial.

In practice, nonstandard LC_COLLATE values can break perfectly
sensible-looking build code.  For example, chiark-utils 5.0.0+exp1
FTBFS in current stretch when LC_COLLATE=fr_CH.UTF-8 because of this:

  $ touch 11 pp qq
  $ LC_COLLATE=fr_CH.UTF-8 bash -c 'echo [!A-Z]*[!~]'
  11
  $

The build code expects pp and qq to be listed as well.

(Interestingly, many of these FTBFS problems will be hidden if /bin/sh
is dash, because dash does not honour locales for globbing.  This is
clearly legal according to the spec, and probably a good decision.)

In principle this bug might be fixable by asking (almost) every
package to set LC_COLLATE in debian/rules.  But ISTM that it would be
much better to fix this in dpkg-buildpackage.

I suggest that dpkg-buildpackage should do as follows:

 * Unconditionally set one of the following
       LC_COLLATE=C.UTF-8
       LC_COLLATE=C
   Colin Watson tells me that C.UTF-8 has been in libc since
   approximately squeeze.  C is theoretically UB (!) for high-bit-set
   octets but in practice works just fine (and it would be intolerable
   if it didn't).

 * Check the effective LC_COLLATE using locale(1), and produce a
   warning if the result is not m/^C(?=\.|$)/.  (This is useful
   because some misguided user might set LC_ALL.)

In the meantime the reproducible builds folks may want to consider
explicitly setting LC_COLLATE to something sane in their 2nd build.

Thanks for your attention.

Regards,
Ian.

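To make the two suggested steps concrete, here is a minimal sketch in
POSIX shell.  It is an illustration only, not dpkg-buildpackage code
(which is Perl): the function names is_c_collate and effective_collate
are made up for this example, and the choice of C.UTF-8 over C is just
one of the two options listed above.  The sed expression assumes that
locale(1) prints a line of the form LC_COLLATE=value or
LC_COLLATE="value".

```shell
#!/bin/sh
# Sketch only: hypothetical helper names, not taken from dpkg.

# Succeed if the given locale name is "C" or "C.<something>",
# i.e. the shell equivalent of the check m/^C(?=\.|$)/.
is_c_collate() {
    case "$1" in
        C|C.*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Print the effective LC_COLLATE as reported by locale(1).
# The value may or may not be wrapped in double quotes.
effective_collate() {
    locale 2>/dev/null |
        sed -n 's/^LC_COLLATE="\{0,1\}\([^"]*\)"\{0,1\}$/\1/p'
}

# Step 1: unconditionally force a predictable collation order.
LC_COLLATE=C.UTF-8
export LC_COLLATE

# Step 2: warn if something else (typically LC_ALL) still wins.
eff=$(effective_collate)
if ! is_c_collate "$eff"; then
    echo "warning: effective LC_COLLATE is '$eff', expected C or C.*" >&2
fi
```

If LC_ALL=fr_CH.UTF-8 is present in the environment, step 2 should
print the warning, because LC_ALL takes precedence over LC_COLLATE.
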
[1] Shell path glob patterns are mostly like normal glob patterns:
  http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_13_03
Glob patterns' bracketed [] character sets are mostly like regexp ones:
  http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_13_02
Regexp bracketed character sets with ranges depend on locale.  Point 7 of:
  http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_05

-- 
Ian Jackson <ijack...@chiark.greenend.org.uk>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.