I recall this discussion on UTF-8 and locales: http://undeadly.org/cgi?action=article&sid=20150722182236
I imagine there was more elsewhere that I didn't see.

Brian

On Sep 22, 2015 7:07 AM, "Benny Lofgren" <bl-li...@lofgren.biz> wrote:

> On 2015-09-21 16:45, Mark Kettenis wrote:
> >> From: Christian Weisgerber <na...@mips.inka.de>
> >> Date: Mon, 21 Sep 2015 14:29:03 +0000 (UTC)
> >>
> >> On 2015-09-21, Stefan Sperling <s...@stsp.name> wrote:
> >>
> > You could argue that the thousands separator should be supported though:
> >
> > $ sleep 1.000.000
> >
> > if your locale is something vaguely european, and
> >
> > # sleep 1,000,000
> >
> > for the north-americans.
> >
> > But let's not go there...
>
> In my locale (Sweden), the thousands separator is space... try to
> meaningfully parse sleep 1 000 000 without all of a sudden introducing
> mandatory quoting. :-) So no, let's not go there...
>
>
> Like many others I've seen plenty of trainwrecks caused by mindless
> internationalization in other operating systems, and how it wreaks havoc
> in scripts and interpretation of piped data. It is NOT pretty.
>
> The stance that the base system should only speak English and use
> time-tried and known formats is IMO the correct one.
>
>
> With that said, we do live in an international world and I was alarmed
> about something Stefan wrote earlier in this thread:
>
> > When we still had latin1 etc. it was possible in theory that values between
> > 128 and 255 represented a digit. But now, isdigit() does the same regardless
> > of locale setting (C or UTF-8) since it cannot be given a multibyte sequence,
> > i.e. it will not deal with character values above 127.
>
> Maybe I'm the only one that this is news to (and I admit I obviously
> haven't followed tech@ and misc@ closely enough lately), but am I
> interpreting this correctly in understanding that 8-bit locales except
> for the "C" locale (which is effectively a 7-bit locale) are gone?
>
>
> That would be extremely unfortunate, since UTF-8 (even if it was fully
> implemented in OpenBSD which of course it isn't) is *not* the sole
> answer to the world's localization problems. At least not in the
> foreseeable future.
>
> For example, I still have plenty of systems running that absolutely
> rely on a working ISO 8859-1 environment, and not just in the sense
> that it is 8-bit transparent but in understanding collation orders,
> proper case folding and functioning of the is...() macros/functions etc.
>
>
> I tried to google for more on this issue, but came up pretty thin. Has
> this been discussed publicly somewhere? If this is due to shortage of
> manpower or interest or something like that, maybe we can do something
> about that.
>
> I'm sure it isn't still a matter of lack of knowledge of the problems
> facing us non-native English speakers - since I know OpenBSD developers
> come from all corners of the world - so please someone tell me I've
> misunderstood, or that it's just a bad dream and I'll wake up soon. :-)
>
>
> * * *
>
> Are there more people like me who really could use more complete i18n
> support in OpenBSD (multibyte as well as legacy 8-bit) and are willing
> to make efforts into implementing it?
>
> Is there a coordinated effort somewhere? If not, let's discuss it!
> Offlist is fine of course if it isn't of general interest (but I think
> it is). And by "implementing", of course I mean doing it right, the
> OpenBSD way.
>
> If there is interest but no coordination, I can start it off and try to
> make an inventory of the current state of implementation and what
> efforts may already be ongoing (or dormant or abandoned, due to lack of
> interest/time/manpower/resources/pizza/beer/hikes).
>
> What say ye?
>
> * * *
>
>
> Regards,
> /Benny