On Wed, Sep 18, 2002 at 03:25:41PM +0200, Niels Möller wrote:
> > Nobody will use UTF-32 as their local encoding for the foreseeable
> > future, right?
>
> I really don't know.  Right now, UTF-8 seems almost as impractical as
> UTF-32 to me, and I don't know how that will change when more programs
> pick up support for larger character sets.
I cannot imagine how you would use something that is not backwards
compatible with 7-bit ASCII on a Unix-like terminal.  A lot of
interfaces would have to change before that is even remotely possible,
not to mention all the applications like the shell, the filesystem, and
so on.  A lot of things use \0 to mark the end of a string, and having
three of those bytes in every ASCII-range character is somewhat
inconvenient.

> > There is no advantage whatsoever to use the same encoding in the
> > input and output half.  Both are completely separated.
>
> They're the same program and the same binary, so at least it's less
> code bloat to add unicode support to the second half than to the
> first.
>
> Using unicode somewhere in the input path seems necessary, and if you
> follow Roland's idea of putting more of term into the console server,
> then the console server seems to be the right place.  If you don't do
> that, then I agree that the console need not know about it, and can
> just pass the UTF-8 stream on to term.

I have no idea what you are talking about.  It seems to be related to
the thread, but you will have to be much more precise.  What is this
Unicode support you are talking about that my second half seems to be
missing?

Here is the input half one more time, just to make sure that there
isn't a simple misunderstanding:  The console clients write UTF-8
encoded Unicode to the console server via the console/NR/input node.
The console server converts this input stream to the local encoding
(the one that it also receives characters in).  This can be ISO
Latin-1, UTF-8, or whatever you like, even UTF-32 if you want.  The
encoding is taken from the --encoding option to the console server,
which also determines the output conversion.  The console server then
provides the locally encoded strings to the term server.

> > I think all this legacy chinese/japanese/korean stuff, bu .  Of
> > course we could just hard code UTF-8 support in term (that's what a
> > patch for the Linux kernel does), but that is kinda cheap.
;)

> > Special casing utf8 is a reasonable thing to do in almost all cases

Maybe even in this case; I guess we will find out.  But actually I
consider it simpler to just use the mb* functions than to roll my own.

Thanks,
Marcus

--
`Rhubarb is no Egyptian god.'  GNU  http://www.gnu.org  [EMAIL PROTECTED]
Marcus Brinkmann  The Hurd  http://www.gnu.org/software/hurd/  [EMAIL PROTECTED]
http://www.marcus-brinkmann.de/

_______________________________________________
Bug-hurd mailing list
[EMAIL PROTECTED]
http://mail.gnu.org/mailman/listinfo/bug-hurd