On 27/06/2013 14:11, Johannes Meixner wrote:
>
> Hello,
>
> On Jun 27 10:48 Paolo Bonzini wrote (excerpt):
>> On 27/06/2013 09:33, Aharon Robbins wrote:
>>>
>>> Fortunately, gawk and grep are already there, and I think the sed in
>>> the git repo is as well. Once Bash turns this on as default, the
>>> world will definitely be a better place, independent of GLIBC.
>>
>> I already explained this multiple times how this is completely
>> delusional.
>>
>> 1) grep, sed, coreutils and so on will only use representation-based
>> range interpretation (I prefer this more neutral term that also explains
>> what's going on) if you use gnulib's regex implementation. And by
>> default, they use glibc (I just checked grep).
>>
>> 2) Even if you switched the default, you would be at the mercy of
>> distros. Distros prefer to avoid glibc replacements in single packages,
>> because then all bugs have to be fixed in many different places. In
>> fact, I checked grep and Fedora builds it with --without-included-regex.
>
>
> Right now I checked how grep is built in openSUSE via
> "configure --disable-silent-rules --without-included-regex"

Right thing to do, if you ask me...

> I do not care too much which kind of locale specific ordering
> or collating or regex behaviour is actually implemented
> as long as it works consistently in grep, gawk, sed, bash,...
>
> I would very much appreciate it if grep, gawk, sed, bash,...
> could agree on one same behaviour and provide clear
> documentation for those who compile it what the
> "commonly accepted upstream behaviour" is so that
> the binaries get built with that same behaviour
> by all distributors who like to be in compliance
> with upstream decisions.

Right now only gawk is different from the others, and not in a very
clean manner:

    #ifndef GAWK
      /* Defer to the system regex library about the meaning
         of range expressions.  */
      regex_t re;
      char pattern[6] = { '[', 0, '-', 0, ']', 0 };
      char subject[2] = { 0, 0 };
      c1 = c;
      if (case_fold)
        {
          c1 = tolower (c1);
          c2 = tolower (c2);
        }

      pattern[1] = c1;
      pattern[3] = c2;
      regcomp (&re, pattern, REG_NOSUB);
      for (c = 0; c < NOTCHAR; ++c)
        {
          if ((case_fold && isupper (c))
              || (MB_CUR_MAX > 1 && btowc (c) == WEOF))
            continue;
          subject[0] = c;
          if (regexec (&re, subject, 0, NULL, 0) != REG_NOMATCH)
            setbit_case_fold_c (c, ccl);
        }
      regfree (&re);
    #else
      c1 = c;
      if (case_fold)
        {
          c1 = tolower (c1);
          c2 = tolower (c2);
        }
      for (c = c1; c <= c2; c++)
        setbit_case_fold_c (c, ccl);
    #endif

I would suggest that distros rip out the #else part of this #ifndef.

Paolo
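
For anyone who wants to see what their own system regex library does with a range, the probe in the #ifndef GAWK branch can be pulled out into a small standalone function. This is only a rough sketch and not code from grep or gawk: the name range_matches is made up for illustration, and it assumes single-byte endpoints that are not special inside a bracket expression (so not ']', '^' or '-').

```c
#include <stddef.h>
#include <regex.h>

/* Hypothetical helper, not part of grep or gawk.
   Returns 1 if the system regex library says byte C falls in the
   range [LO-HI] under the current locale, 0 if it does not, and -1
   if the bracket expression fails to compile.  */
static int
range_matches (unsigned char lo, unsigned char hi, unsigned char c)
{
  regex_t re;
  char pattern[6] = { '[', 0, '-', 0, ']', 0 };
  char subject[2] = { 0, 0 };
  int rc;

  /* Build the pattern "[lo-hi]" and a one-byte subject string.  */
  pattern[1] = (char) lo;
  pattern[3] = (char) hi;
  subject[0] = (char) c;

  if (regcomp (&re, pattern, REG_NOSUB) != 0)
    return -1;

  /* regexec returns 0 on a match and REG_NOMATCH otherwise.  */
  rc = regexec (&re, subject, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

A program that never calls setlocale runs in the "C" locale, where range_matches ('a', 'z', 'M') should come back 0. Calling setlocale (LC_ALL, "") first and repeating the call under a non-C locale shows whether the library in use interprets the range by collation order or by character code, which is the inconsistency this thread is about.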