* Martin-Éric Racine:

> On Tue, Sep 8, 2009 at 12:51 PM, Hilko Bengen<ben...@debian.org> wrote:
>> As far as I know, using non-ASCII characters in the GECOS field of
>> /etc/passwd is not specified at all. So far, I haven't found anything
>> in Debian's main policy file, passwd(5), the adduser(8) and useradd(8)
>> manpages, nor the documentation of base-passwd. (If you have found more
>> than I, let me know.)
>
> While it is not specified, it has become a de-facto standard in Debian
> and its derivatives to use UTF-8 for everything, including the real
> name that appears in the GECOS field of /etc/passwd.

I had also thought about UTF-8 becoming the standard encoding in many
places in Debian, be it de jure or de facto. But I am not going to
assume that this extends to /etc/passwd. And how should non-ASCII
characters in other kinds of user databases, such as NIS or LDAP, be
treated?

>> From an application's standpoint, I'd tend to assume the GECOS field
>> either to be a comma-separated string of ASCII characters or a
>> comma-separated string of byte values.
>
> We cannot assume that anymore now that Debian uses UTF-8 for everything.

If you can point me to a text passage in the policy (or any relevant
discussion on the mailing lists), I will be happy to reconsider my
opinion.

>> Basing mailx' interpretation of the GECOS field on the sendcharset
>> variable, as you suggested is probably not a good idea.
>
> Why not?

(It's sendcharsets, sorry for the typo.) sendcharsets is about the
target charset.

>>> Message-Id: <1252199465.516755.6377.nullmai...@iki.fi>
>>> From: Martin-?ric Racine <q-f...@iki.fi>
>>
>> As a workaround, please try setting your real name to a pre-encoded
>> string in the .mailrc.
>
> Do you really expect all users on a given system to start doing that,
> just because their name includes non-ascii characters?

Not at all. I just thought that this workaround might be helpful for you
until the larger issues get sorted out. Feel free to ignore my
suggestion. :-)

> Please remember that both Debian and Ubuntu nowadays allow non-ascii
> GECOS content under the presumption that it will be in UTF-8.

They have always allowed non-ASCII content in the GECOS field, but I see
no such presumption. From the sources I have seen, existing tools for
manipulating /etc/passwd will happily accept *any* byte sequence from
the terminal. If an administrator has still set his console to
iso-8859-1, that's what is used, without conversion.

-Hilko
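
For illustration only, here is a minimal Python sketch of the two points discussed above: treating the GECOS field as a comma-separated string of bytes, and producing a "pre-encoded" real name of the kind the workaround refers to. The function names are hypothetical. Using RFC 2047 encoded-words as the pre-encoded form is an assumption, as is treating "valid UTF-8" as a hint about the intended encoding. This is not how mailx itself behaves.

```python
from email.header import Header, decode_header
from typing import Optional


def gecos_full_name(gecos: bytes) -> bytes:
    """Return the first comma-separated GECOS subfield as raw bytes.

    No encoding is assumed here: the field is handled as a byte string.
    """
    return gecos.split(b",", 1)[0]


def decode_name(raw: bytes) -> Optional[str]:
    """Decode the name if it is valid UTF-8; otherwise return None.

    Assumption: valid UTF-8 is taken as a hint, not as proof, that the
    administrator entered the name on a UTF-8 terminal.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


def encoded_real_name(name: str) -> str:
    """Encode a real name as an RFC 2047 encoded-word (pure ASCII)."""
    return Header(name, "utf-8").encode()


def decode_real_name(encoded: str) -> str:
    """Reverse encoded_real_name() for a single encoded-word."""
    data, charset = decode_header(encoded)[0]
    return data.decode(charset or "ascii")


if __name__ == "__main__":
    utf8_entry = "Martin-Éric Racine,,,".encode("utf-8")
    latin1_entry = "Martin-Éric Racine,,,".encode("iso-8859-1")

    for entry in (utf8_entry, latin1_entry):
        name = decode_name(gecos_full_name(entry))
        if name is None:
            print("not valid UTF-8, encoding unknown:", gecos_full_name(entry))
        else:
            print("pre-encoded real name:", encoded_real_name(name))
```

The iso-8859-1 entry is rejected rather than guessed at, which reflects the concern above: nothing in /etc/passwd records which encoding the bytes were entered in.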