I recall this discussion on UTF-8 and locales:

http://undeadly.org/cgi?action=article&sid=20150722182236

I imagine there was more elsewhere that I didn't see.

Brian

On Sep 22, 2015 7:07 AM, "Benny Lofgren" <bl-li...@lofgren.biz> wrote:

> On 2015-09-21 16:45, Mark Kettenis wrote:
> >> From: Christian Weisgerber <na...@mips.inka.de>
> >> Date: Mon, 21 Sep 2015 14:29:03 +0000 (UTC)
> >>
> >> On 2015-09-21, Stefan Sperling <s...@stsp.name> wrote:
> >>
> > You could argue that the thousands separator should be supported though:
> >
> >   $ sleep 1.000.000
> >
> > if your locale is something vaguely european, and
> >
> >   # sleep 1,000,000
> >
> > for the north-americans.
> >
> > But let's not go there...
>
> In my locale (Sweden), the thousands separator is a space... try to
> meaningfully parse sleep 1 000 000 without all of a sudden introducing
> mandatory quoting. :-) So no, let's not go there...
>
>
> Like many others I've seen plenty of trainwrecks caused by mindless
> internationalization in other operating systems, and how it wreaks havoc
> in scripts and interpretation of piped data. It is NOT pretty.
>
> The stance that the base system should only speak English and use
> time-tried and known formats is IMO the correct one.
>
>
> With that said, we do live in an international world and I was alarmed
> about something Stefan wrote earlier in this thread:
>
> > When we still had latin1 etc. it was possible in theory that values
> > between 128 and 255 represented a digit. But now, isdigit() does the
> > same regardless of locale setting (C or UTF-8) since it cannot be
> > given a multibyte sequence, i.e. it will not deal with character
> > values above 127.
>
> Maybe I'm the only one that this is news to (and I admit I obviously
> haven't followed tech@ and misc@ closely enough lately), but am I
> interpreting this correctly in understanding that 8-bit locales except
> for the "C" locale (which is effectively a 7-bit locale) are gone?
>
>
> That would be extremely unfortunate, since UTF-8 (even if it was fully
> implemented in OpenBSD which of course it isn't) is *not* the sole
> answer to the world's localization problems. At least not in the
> foreseeable future.
>
> For example, I still have plenty of systems running that absolutely
> rely on a working ISO 8859-1 environment, and not just in the sense
> that it is 8-bit transparent but in understanding collation orders,
> proper case folding and functioning of the is...() macros/functions etc.
>
>
> I tried to google for more on this issue, but came up pretty thin. Has
> this been discussed publicly somewhere? If this is due to shortage of
> manpower or interest or something like that, maybe we can do something
> about that.
>
> I'm sure it isn't still a matter of lack of knowledge of the problems
> facing us non-native English speakers - since I know OpenBSD developers
> come from all corners of the world - so please someone tell me I've
> misunderstood, or that it's just a bad dream and I'll wake up soon. :-)
>
>
> * * *
>
> Are there more people like me who really could use more complete i18n
> support in OpenBSD (multibyte as well as legacy 8-bit) and are willing
> to put effort into implementing it?
>
> Is there a coordinated effort somewhere? If not, let's discuss it!
> Offlist is fine of course if it isn't of general interest (but I think
> it is). And by "implementing", of course I mean doing it right, the
> OpenBSD way.
>
> If there is interest but no coordination, I can start it off and try to
> make an inventory of the current state of implementation and what
> efforts may already be ongoing (or dormant or abandoned, due to lack of
> interest/time/manpower/resources/pizza/beer/hikes).
>
> What say ye?
>
> * * *
>
>
> Regards,
> /Benny
>
>
