Re: Rational Ranges [was Re: gnulib regex lib]

Paul Eggert Sun, 12 Aug 2018 11:27:00 -0700

arn...@skeeve.com wrote:

The only FIXMEs I see are both in the _LIBC part of the code, and
there's only two: one in regexec.c and one in regcomp.c.


In that case I guess there isn't a problem.

I am a little concerned that unibyte locales use bytes whereas multibyte locales use characters for range expressions. As I understand it, this means Turkish range expressions are interpreted differently depending on whether the locale uses UTF-8 or ISO/IEC 8859-9. Is that really what Turkish-speakers want?
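
To make the concern concrete, here is a rough Python sketch of the two interpretations. It only models the distinction described above (comparing encoded byte values versus comparing Unicode code points); it is not the regex code itself, and the helper names and the sample range a-ÿ are made up for illustration.

```python
# Illustrative model only: these helpers are hypothetical and do not
# reproduce what regcomp.c or regexec.c actually do.

def in_range_by_byte(ch, lo, hi, encoding="iso8859_9"):
    """Unibyte view: compare the single encoded byte of each character."""
    b, l, h = (c.encode(encoding)[0] for c in (ch, lo, hi))
    return l <= b <= h

def in_range_by_codepoint(ch, lo, hi):
    """Multibyte view: compare Unicode code points."""
    return ord(lo) <= ord(ch) <= ord(hi)

# Turkish letters outside ASCII, tested against the sample range a-ÿ.
letters = "çğışöü"
differs = [
    ch for ch in letters
    if in_range_by_byte(ch, "a", "ÿ") != in_range_by_codepoint(ch, "a", "ÿ")
]
# differs == ["ğ", "ı", "ş"]
```

In ISO/IEC 8859-9 the letters ğ, ı and ş are the bytes 0xF0, 0xFD and 0xFE, so they fall below ÿ (0xFF). In Unicode they are U+011F, U+0131 and U+015F, which lie above ÿ (U+00FF). Under this model the same range therefore matches them in the unibyte locale but not in the UTF-8 one, while ç, ö and ü match either way.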

That being said, it doesn't matter all that much nowadays now that UTF-8 has taken over, so it's probably not worth much of our time to worry about this discrepancy. For what it's worth, https://w3techs.com/technologies/details/en-iso885909/all/all says that only 0.06% of websites still use ISO/IEC 8859-9, down from 0.09% a year ago (and down from 0.7% in 2010, so this is a factor-of-10 decline in 8 years).
