>>>>> On Mon, 6 Jun 2016, Ulrich Mueller wrote:

>>>>> On Mon, 6 Jun 2016, Chí-Thanh Christopher Nguyễn wrote:
>> I'm not totally convinced yet.
>> Following the BCP-47 spec the format is

>> Language-Tag  = langtag             ; normal language tags
>> langtag       = language
>>                 ["-" script]
>>                 ["-" region]
>>                 *("-" variant)
>>                 *("-" extension)
>>                 ["-" privateuse]

>> [...]

> As I understand it:

> 1. Gettext documentation says that locale names can be LL_CC or
> LL_CC@VARIANT. The natural mapping to the (implementation defined)
> format mentioned by POSIX seems to be that LL, CC, and VARIANT
> correspond to language, territory, and modifier, respectively.

> 2. Language codes are taken from ISO 639, namely the two-letter code
> if one exists, otherwise the three-letter code.

> 3. Territory codes are taken from ISO 3166-1, usually the two-letter
> country codes.

> 4. According to Gettext documentation, "'@VARIANT' can denote any
> kind of characteristics that is not already implied by the language
> LL and the country CC." (So IIUC the BCP-47 variant "valencia" would
> become "@valencia".)
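
For illustration, the decomposition in points 1 to 4 could be sketched
roughly as below. This is only a sketch under assumptions of mine: the
names parse_locale_name and LocaleName are hypothetical, the territory
is restricted to two uppercase letters, and a codeset part such as
".UTF-8" is not handled at all.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape: LL[_CC][@VARIANT], with LL a two- or three-letter
# ISO 639 code and CC a two-letter ISO 3166-1 code. Codesets such as
# ".UTF-8" are deliberately not accepted here.
_LOCALE_RE = re.compile(
    r"^(?P<language>[a-z]{2,3})"
    r"(?:_(?P<territory>[A-Z]{2}))?"
    r"(?:@(?P<modifier>[A-Za-z0-9]+))?$"
)


class LocaleName(NamedTuple):
    """Hypothetical container for the three parts of a locale name."""
    language: str
    territory: Optional[str]
    modifier: Optional[str]


def parse_locale_name(name: str) -> LocaleName:
    """Split a Gettext-style name into language, territory, modifier."""
    m = _LOCALE_RE.match(name)
    if m is None:
        raise ValueError(f"not an LL[_CC][@VARIANT] name: {name!r}")
    return LocaleName(m["language"], m["territory"], m["modifier"])
```

With this, "ca@valencia" splits into language "ca", no territory, and
modifier "valencia", while "pt_BR" has territory "BR" and no modifier.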

Of course, we could also say that Gettext/POSIX syntax (especially its
variant/modifier part) is ill-defined, and use BCP-47 syntax for the
L10N USE_EXPAND instead (except that the separator would be an
underscore instead of a hyphen).

AFAICS, there would be no change at all for any of the LL or LL_CC
entries. The only ones that would change would be the (about 10) ones
containing an @ sign. For example, ca@valencia would become
ca_valencia, and sr@ijekavianlatin would become sr_Latn_ijekavsk.
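
To make the idea concrete, here is a rough sketch of such a remapping.
It is not existing eclass code; the names MODIFIER_TO_SUBTAGS and
to_l10n_name are made up for this mail, and the table contains only the
two examples above. A real table would need one entry for every
@VARIANT that is actually in use.

```python
# Hypothetical table: Gettext modifier -> (script, variant) subtags.
# Only the two cases mentioned in this thread are listed.
MODIFIER_TO_SUBTAGS = {
    "valencia": (None, "valencia"),
    "ijekavianlatin": ("Latn", "ijekavsk"),
}


def to_l10n_name(locale_name: str) -> str:
    """Map LL[_CC][@VARIANT] to a BCP-47-like name joined by "_"."""
    base, _, modifier = locale_name.partition("@")
    if not modifier:
        # Plain LL and LL_CC entries stay exactly as they are.
        return base
    if modifier not in MODIFIER_TO_SUBTAGS:
        raise ValueError(f"no remapping known for @{modifier}")
    script, variant = MODIFIER_TO_SUBTAGS[modifier]
    language, _, territory = base.partition("_")
    # BCP-47 order: language, script, region, variant.
    parts = [language, script, territory, variant]
    return "_".join(p for p in parts if p)
```

Unknown modifiers raise an error on purpose, so that a non-standard
upstream @VARIANT is noticed instead of being passed through silently.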

I am not sure how much additional remapping code would be required.
However, my impression is that upstream usage of @VARIANT is not at
all standardised, so some remapping would be required in any case if
we want unique entries for L10N.

Ulrich
