* Martin-Éric Racine:

> On Tue, Sep 8, 2009 at 12:51 PM, Hilko Bengen<ben...@debian.org> wrote:
>> As far as I know, using non-ASCII characters in the GECOS field of
>> /etc/passwd is not specified at all. So far, I haven't found anything
>> in Debian's main policy file, passwd(5), the adduser(8) and useradd(8)
>> manpages, nor the documentation of base-passwd. (If you have found more
>> than I, let me know.)
>
> While it is not specified, it has become a de-facto standard in Debian
> and its derivatives to use UTF-8 for everything, including the real
> name that appears in the GECOS field of /etc/passwd.

I had also thought about UTF-8 becoming the standard encoding in many
places in Debian, be it de jure or de facto. But I am not going to
assume that this extends to /etc/passwd.

And how should non-ASCII characters in other kinds of user databases be
treated, such as NIS or LDAP?

>> From an application's standpoint, I'd tend to assume the GECOS field
>> either to be a comma-separated string of ASCII characters or a
>> comma-separated string of byte values.
> We cannot assume that anymore now that Debian uses UTF-8 for everything.

If you can point me to a text passage in the policy (or any relevant
discussion on the mailing lists), I will be happy to reconsider my
opinion.
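
To illustrate the "comma-separated string of byte values" reading mentioned
above, here is a minimal sketch in Python. It is only an illustration, not
what mailx does; the function name and the sample passwd line (including the
login name "mer") are made up for the example.

```python
def read_gecos_fields(passwd_line: bytes) -> list[bytes]:
    """Split the GECOS field of one passwd(5) line into its subfields.

    The line is handled as raw bytes throughout: no charset is assumed,
    so non-ASCII bytes pass through unchanged.
    """
    # passwd(5) fields are colon-separated; GECOS is the fifth field.
    fields = passwd_line.rstrip(b"\n").split(b":")
    gecos = fields[4]
    # GECOS subfields (full name, room, phones, ...) are comma-separated.
    return gecos.split(b",")


# Hypothetical entry whose full name happens to be stored as UTF-8 bytes.
sample = b"mer:x:1000:1000:Martin-\xc3\x89ric Racine,,,:/home/mer:/bin/bash\n"
full_name = read_gecos_fields(sample)[0]
```

Note that nothing in this code knows whether the bytes in full_name are
UTF-8, ISO-8859-1 or something else. That is exactly the open question.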

>> Basing mailx' interpretation of the GECOS field on the sendcharset
>> variable, as you suggested, is probably not a good idea.
> Why not?

(it's sendcharsets, sorry for the typo)

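sendcharsets is about the target charset, i.e. the encoding of the
outgoing message, not about the encoding of the input read from
/etc/passwd.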

>>> Message-Id: <1252199465.516755.6377.nullmai...@iki.fi>
>>> From: Martin-?ric Racine <q-f...@iki.fi>
>> As a workaround, please try setting your real name to a pre-encoded
>> string in the .mailrc.
> Do you really expect all users on a given system to start doing that,
> just because their name includes non-ascii characters?

Not at all. I just thought that this workaround might be helpful for you
until the larger issues get sorted out. Feel free to ignore my
suggestion. :-)
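
In case it helps, here is one way to produce such a pre-encoded string. This
is a sketch under the assumption that "pre-encoded" means an RFC 2047
encoded-word; it uses Python's standard email.header module, and the helper
names are made up for the example.

```python
from email.header import Header, decode_header, make_header


def encode_real_name(name: str, charset: str = "utf-8") -> str:
    """Return the name as an ASCII-only RFC 2047 encoded-word string."""
    return Header(name, charset).encode()


def decode_real_name(encoded: str) -> str:
    """Reverse of encode_real_name, useful for checking the result."""
    return str(make_header(decode_header(encoded)))


encoded = encode_real_name("Martin-Éric Racine")
round_trip = decode_real_name(encoded)
```

The resulting string contains only ASCII characters, so it no longer depends
on how the GECOS field is interpreted.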

> Please remember that both Debian and Ubuntu nowadays allow non-ascii
> GECOS content under the presumption that it will be in UTF-8.

They have always allowed non-ascii content in the GECOS field, but I see
no such presumption.

From the sources I have seen, existing tools for manipulating
/etc/passwd will happily accept *any* byte sequence from the terminal.
If an administrator has still set his console to iso-8859-1, that's what
is used, without conversion.
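
The consequence for any program reading the field can be sketched as follows
(Python, purely illustrative; the function name is made up, and the fallback
to ISO-8859-1 is a guess that a reader would have to make, not something any
specification provides):

```python
def decode_gecos_name(raw: bytes) -> tuple[str, str]:
    """Guess the text of a GECOS name and report which charset was assumed."""
    try:
        # Strict decoding: fails on byte sequences that are not valid UTF-8.
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte to a character, so this never fails,
        # but it may be the wrong guess for other 8-bit charsets.
        return raw.decode("iso-8859-1"), "iso-8859-1"


# The same name as a UTF-8 terminal and an ISO-8859-1 terminal would store it.
stored_utf8 = "Martin-Éric Racine".encode("utf-8")
stored_latin1 = "Martin-Éric Racine".encode("iso-8859-1")
```

The two stored byte strings differ, and only a heuristic like the one above
can tell them apart after the fact.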

-Hilko


