On Thu, Dec 26, 2019 at 3:32 AM Bruno Haible <br...@clisp.org> wrote:
> > I'm thinking we should install the attached patch into Gnulib. The basic
> > idea is that running these test scripts in random locales is likely more
> > trouble than it's worth.
>
> No no no. Applied to grep's init.sh, it would reduce the test coverage of
> grep. But applied to gnulib's init.sh, it would reduce the test coverage of
> gettext, libunistring, coreutils, findutils, gzip, and many other packages.
> I *vehemently* object against that.
>
> > the patch to grep that you proposed wouldn't suffice, because the test
> > scripts have several other uses of printf with octal escapes outside of
> > ASCII range, and they'd all have to be changed.
>
> Yeah, sure. But we have learned to write "LC_ALL=C tr ..." instead of
> "tr ...". We can also learn to write "LC_ALL=C printf ..." instead of
> "printf ...".
>
> Alternatively, you may define printf as a shell function in init.cfg.
>
> > Also, I worry that for platforms where printf is a builtin,
> > "LC_ALL=C printf '\202'" won't work as POSIX requires because
> > historically setting environment variables has been buggy for shell
> > builtins.
>
> On the AIX 7.2 /bin/sh, I verified that 'LC_ALL=C printf ...' works.
> On the other shells, printf of octals is known to be locale independent,
> therefore it doesn't matter whether the LC_ALL=C assignment has an effect
> on the printf command or not.
>
> > which would seem to require that the shell itself, not merely the printf
> > command, be in a locale that is compatible with the byte sequence in
> > question.
>
> If that is the case - which I doubt, because why would backquote expansion
> or other things a shell does work differently in a UTF-8 locale -, the fix
> would be to let init.sh respawn with a different shell. The respawning
> condition in init.sh lines 159..175 could be extended to include
> printf '\351'.
>
> I've verified that the attached patch fixes the two reported tests from the
> 'grep' test suite.
>
> Should I push that?

Thanks, Bruno. I too prefer to make init.sh eliminate that shell.
Here's a slightly tighter test, albeit relying on tr working with octals:

  case `LC_ALL=en_US.UTF-8 printf '\351'| tr '\351' x` in
    x) ;; *) exit 1;; esac

If you stick with od, please ensure that its output is precisely that
one "e9" byte, e.g., via

  case $(LC_ALL=en_US.UTF-8 printf '\351'|od -An -tx1|tr -d ' ') in e9) ...

Also, with archive/html URLs, please use the abbreviated form:

- https://lists.gnu.org/archive/html/grep-devel/2019-12/msg00020.html
+ https://lists.gnu.org/r/grep-devel/2019-12/msg00020.html
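
For illustration, the od-based variant could be wrapped up roughly as
below. This is only a sketch, not the patch under discussion: the
function name printf_351_hex is hypothetical and does not appear in
init.sh, and it assumes od and tr are on PATH. Where the en_US.UTF-8
locale is not installed, the LC_ALL assignment may have no effect, so
the check would pass trivially there.

```shell
# Hypothetical helper (not from init.sh): print, as hex digits without
# whitespace, the bytes this shell's printf emits for octal escape \351.
printf_351_hex ()
{
  LC_ALL=en_US.UTF-8 printf '\351' | od -An -tx1 | tr -d ' \n'
}

# A shell whose printf writes the single raw byte yields exactly "e9";
# any other result (for example a multibyte re-encoding) is rejected.
case $(printf_351_hex) in
  e9) echo 'printf octal output is a single raw byte' ;;
  *)  echo 'printf octal output differs; this shell would be rejected' ;;
esac
```

In an init.sh-style acceptability test, the second branch would
presumably be "exit 1" rather than an echo, as in the one-liners above.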