Re: charsets in debian/control

2004-12-11 Thread Paul Hampson
On Sat, Dec 11, 2004 at 04:08:12PM +0100, Shot (Piotr Szotkowski) wrote: > Hello. > Paul Hampson: > > The email address isn't important, since > > that has to be a subset of ASCII anyway. > Are the Unicode-encoded domain names > supported in (modern) browsers only? > > I can surf to http://ł.pl

Re: charsets in debian/control

2004-12-11 Thread Michal Politowski
On Sat, 11 Dec 2004 16:08:12 +0100, Shot (Piotr Szotkowski) wrote: > Hello. > > Paul Hampson: > > > The email address isn't important, since > > that has to be a subset of ASCII anyway. > > Are the Unicode-encoded domain names > supported in (modern) browsers only? > > I can surf to http://ł.pl

Re: charsets in debian/control

2004-12-11 Thread Marco d'Itri
On Dec 11, "Shot (Piotr Szotkowski)" <[EMAIL PROTECTED]> wrote: > I can surf to http://ł.pl/ (with, e.g., Firefox) - can I send mail to > [EMAIL PROTECTED], or should I always use the [EMAIL PROTECTED] equivalent, as > the Unicode in domain names is restricted to WWW only? It depends on your MUA.

Re: charsets in debian/control

2004-12-11 Thread Shot (Piotr Szotkowski)
Hello. Paul Hampson: > The email address isn't important, since > that has to be a subset of ASCII anyway. Are the Unicode-encoded domain names supported in (modern) browsers only? I can surf to http://ł.pl/ (with, e.g., Firefox) - can I send mail to [EMAIL PROTECTED], or should I always use th

Re: charsets in debian/control

2004-12-08 Thread Thaddeus H. Black
It is one thing spiritedly to argue a point against friends and allies. It is another to be obstinate. I do not wish the latter, and I admit that I am both outnumbered and outreasoned today. Please permit me without malice to conform my position, which now might be stated as follows. Unicode

Re: charsets in debian/control

2004-12-08 Thread Steve Langasek
On Tue, Dec 07, 2004 at 05:56:54PM +, Thaddeus H. Black wrote: > > But yes, non-ASCII Latin-1 chars should not be given > > special status over the national chars found in other > > languages spoken by project members. Debian should be > > using either ASCII, or Unicode; standardizing on > > L

RE: charsets in debian/control

2004-12-07 Thread Julian Mehnle
Thaddeus H. Black wrote: > However, the typical roster of skills one masters in contributing > broadly to Debian development is already awesome: C, C++, CPP, Make, > Perl, Python, Autoconf, CVS, Shell, Glibc, System calls, /proc, IPC, > sockets, Sed, Awk, Vi, Emacs, locales, Libdb, GnuPG, Readline,

Re: charsets in debian/control

2004-12-07 Thread Petter Reinholdtsen
[Thaddeus H. Black] > UTF-8 is neat, but I do not really like Unicode (you may [Marco d'Itri] > Actually you do not even understand it, because this sentence is > meaningless. Perhaps he is aware of the difference between Unicode and ISO-10646? UTF-8 is an encoding of ISO-10646.

Re: charsets in debian/control

2004-12-07 Thread Marco d'Itri
On Dec 07, "Thaddeus H. Black" <[EMAIL PROTECTED]> wrote: > UTF-8 is neat, but I do not really like Unicode (you may Actually you do not even understand it, because this sentence is meaningless. -- ciao, Marco

Re: charsets in debian/control

2004-12-07 Thread Thaddeus H. Black
Steve Langasek writes, > ... most of the letters you listed here are specific > to the IPA, which would have no use at all in a > control file as they're not part of the writing system > of any natural language. Ok. > Encodings and charsets are distinct concepts. Just > because the file is spec

Re: charsets in debian/control

2004-12-07 Thread Daniel Burrows
On Tuesday 07 December 2004 10:40 am, Richard Atterer wrote: > No, you do not have to do this. You can keep working with "char", the > changes when switching to UTF-8 will mostly have to deal with the fact that > one Unicode character is represented by more than one char. This means that > you need

Re: charsets in debian/control

2004-12-07 Thread Matthew Garrett
Daniel Burrows <[EMAIL PROTECTED]> wrote: > iso-8859-1 is an 8-bit charset, while Unicode is a 32-bit [0] charset. > Storing and manipulating iso-8859-1 strings requires no changes to internal > datatypes (only conversions for input and output); storing and manipulating > Unicode

Re: charsets in debian/control

2004-12-07 Thread Richard Atterer
On Tue, Dec 07, 2004 at 10:17:17AM -0500, Daniel Burrows wrote: > On Tuesday 07 December 2004 12:44 am, Peter Samuelson wrote: > > And if the app already deals with charset conversions but assumes > > iso-8859-1 input, then it's trivial to fix it to assume utf-8 input. > > This is not true. > >

Re: charsets in debian/control

2004-12-07 Thread Daniel Burrows
On Tuesday 07 December 2004 10:17 am, Daniel Burrows wrote: > complex replacement string class Admittedly, "complex" might (hypothetically) be a bit of an exaggeration. :P Daniel -- /--- Daniel Burrows <[EMAIL PROTECTED]> --\ | You are in a maze of

Re: charsets in debian/control

2004-12-07 Thread Daniel Burrows
On Tuesday 07 December 2004 12:44 am, Peter Samuelson wrote: > > Defining the character set as utf-8 means that any non-unicode > > capable application is going to have issues, yes. > > Postulate an app that is ignorant of character sets - we'll call it > "aptitude". Fixing it to make it accept ut

Re: charsets in debian/control

2004-12-07 Thread Eugeniy Meshcheryakov
On 07.12.2004 at 13:33 +0100, Maciej Dems wrote: > I look at the screen, and there is Roger Leigh writing to me: > > - No UTF-8 console keymaps > > - Some broken libraries e.g. GTK+ 1.2 [obsolete] > > - I can't paste UTF-8 into emacs (perhaps a problem in my .emacs) > > - mc making mess with its frames > Add

Re: charsets in debian/control

2004-12-07 Thread Maciej Dems
I look at the screen, and there is Roger Leigh writing to me: > - No UTF-8 console keymaps > - Some broken libraries e.g. GTK+ 1.2 [obsolete] > - I can't paste UTF-8 into emacs (perhaps a problem in my .emacs) - mc making mess with its frames Maciek -- M.Sc. Maciej Dems [EMAIL PROTECTED] ---

Re: charsets in debian/control

2004-12-07 Thread Andreas Barth
* Roger Leigh ([EMAIL PROTECTED]) [041207 00:40]: > I think going to UTF-8 as the default locale charmap for all locales > is a feasible goal for etch, as is recoding everything to UTF-8 (where > it makes sense). "feasible goal" and "etch" are the magic words I think: I agree on that, but I don't

Re: charsets in debian/control

2004-12-07 Thread Adrian 'Dagurashibanipal' von Bidder
On Tuesday 07 December 2004 00.19, Roger Leigh wrote: > I think going to UTF-8 as the default locale charmap for all locales > is a feasible goal for etch, as is recoding everything to UTF-8 (where > it makes sense). Yep. My biggest problem right now is 'lpr ' to a postscript printer (I use cup

Re: charsets in debian/control

2004-12-07 Thread Peter Samuelson
[Roger Leigh] > I've been using Debian with UTF-8 only locales for over 12 months > now. I now consider it fine for general use, with respect to > terminal and application support. Unlike a couple of years ago, most > things work perfectly. Some apps like 'screen' do not just configure themselv

Re: charsets in debian/control

2004-12-06 Thread Peter Samuelson
[Matthew Garrett] > Defining the character set as utf-8 means that any non-unicode > capable application is going to have issues, yes. Postulate an app that is ignorant of character sets - we'll call it "aptitude". Fixing it to make it accept utf-8 and spit out the correct encoding for its LC_CT

Re: charsets in debian/control

2004-12-06 Thread Mike Hommey
On Mon, Dec 06, 2004 at 07:10:21PM -0800, Steve Langasek <[EMAIL PROTECTED]> wrote: > On Tue, Dec 07, 2004 at 12:04:56PM +0900, Mike Hommey wrote: > > On Mon, Dec 06, 2004 at 06:53:42PM -0800, Steve Langasek <[EMAIL > > PROTECTED]> wrote: > > > But yes, non-ASCII Latin-1 chars should not be given

Re: charsets in debian/control

2004-12-06 Thread Steve Langasek
On Tue, Dec 07, 2004 at 12:04:56PM +0900, Mike Hommey wrote: > On Mon, Dec 06, 2004 at 06:53:42PM -0800, Steve Langasek <[EMAIL PROTECTED]> > wrote: > > But yes, non-ASCII Latin-1 chars should not be given special status over > > the national chars found in other languages spoken by project member

Re: charsets in debian/control

2004-12-06 Thread Mike Hommey
On Mon, Dec 06, 2004 at 06:53:42PM -0800, Steve Langasek <[EMAIL PROTECTED]> wrote: > But yes, non-ASCII Latin-1 chars should not be given special status over > the national chars found in other languages spoken by project members. > Debian should be using either ASCII, or Unicode; standardizing o

Re: charsets in debian/control

2004-12-06 Thread Steve Langasek
On Mon, Dec 06, 2004 at 06:58:10PM +, Thaddeus H. Black wrote: > I would not disagree with Peter or Daniel. They are > right in my view. However, consider the following > Unicode characters: > 025A LATIN SMALL LETTER SCHWA WITH HOOK > 025E LATIN SMALL LETTER CLOSED REVERSED OPEN E > 02

Re: charsets in debian/control

2004-12-06 Thread Roger Leigh
Andreas Barth <[EMAIL PROTECTED]> writes: > Though I agree on your last statement (and please, remember, I'm from > germany where non-ASCII-characters are also in common use), I still > consider that UTF-8-not-ASCII has not finally reached ok, but it'

Re: charsets in debian/control

2004-12-06 Thread Matthew Garrett
Thaddeus H. Black <[EMAIL PROTECTED]> wrote: > We are not speaking of a stricken Polish L, a > double-accented Magyar O, or a euro sign. We are > speaking of... well, to tell the truth I have no idea > what these letters are. Have you? More to the point, > should you and I learn to recognize su

Re: charsets in debian/control

2004-12-06 Thread Bruce Perens
Thaddeus H. Black wrote: 025A LATIN SMALL LETTER SCHWA WITH HOOK 025E LATIN SMALL LETTER CLOSED REVERSED OPEN E 0261 LATIN SMALL LETTER SCRIPT G 0264 LATIN SMALL LETTER RAMS HORN 0267 LATIN SMALL LETTER HENG WITH HOOK 027A LATIN SMALL LETTER TURNED R WITH LONG LEG 027F LATIN SMALL LETTER REV

Re: charsets in debian/control

2004-12-06 Thread Thaddeus H. Black
I would not disagree with Peter or Daniel. They are right in my view. However, consider the following Unicode characters: 025A LATIN SMALL LETTER SCHWA WITH HOOK 025E LATIN SMALL LETTER CLOSED REVERSED OPEN E 0261 LATIN SMALL LETTER SCRIPT G 0264 LATIN SMALL LETTER RAMS HORN 0267 LATIN

Re: charsets in debian/control

2004-12-06 Thread Goswin von Brederlow
Daniel Burrows <[EMAIL PROTECTED]> writes: > On Sunday 05 December 2004 03:32 pm, Jose Carlos Garcia Sogo wrote: >> > Would Peter permit me a mild dissent?  I prefer Latin-1.  Reason: I can >> > recognize and distinguish Latin-1 characters, even when I do not always >> > understand the words they

Re: charsets in debian/control

2004-12-06 Thread Adrian 'Dagurashibanipal' von Bidder
On Sunday 05 December 2004 20.11, Goswin von Brederlow wrote: > Any parser that accepts 8bit non-ascii chars > will accept UTF-8 then. What remains is just making the UTF-8 chars > visually correct then. And make sure that, where character strings are modified, the multibyte sequences are counted
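The point about counting multibyte sequences can be illustrated with a small Python sketch. It is not taken from the thread: the helper name `truncate_utf8` and the sample word are assumptions chosen for illustration. It shows that character count and byte count differ for non-ASCII text, and that shortening a UTF-8 byte string has to back off to a character boundary.

```python
def truncate_utf8(data, max_bytes):
    """Cut UTF-8 bytes to at most max_bytes without splitting a character.

    Hypothetical helper for illustration only.
    """
    if len(data) <= max_bytes:
        return data
    cut = max_bytes
    # Continuation bytes look like 0b10xxxxxx; step back until the cut
    # lands on the first byte of a character.
    while cut > 0 and (data[cut] & 0xC0) == 0x80:
        cut -= 1
    return data[:cut]


text = "décembre"
encoded = text.encode("utf-8")

# One character, 'é', occupies two bytes, so the two lengths differ.
char_count = len(text)
byte_count = len(encoded)

# Cutting at byte 2 would split 'é'; the helper backs off to byte 1.
short = truncate_utf8(encoded, 2)
```

A tool that slices at a fixed byte offset without such a check can emit an invalid sequence, which is the failure mode the message warns about.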

Re: charsets in debian/control

2004-12-05 Thread Paul Hampson
On Mon, Dec 06, 2004 at 01:40:27AM +, Andrew Suffield wrote: > On Sun, Dec 05, 2004 at 09:32:00PM +0100, Jose Carlos Garcia Sogo wrote: > > But the only field in UTF8 should be Maintainer, and that field should > > have (IMHO) also a roman transliterate for the name, if you don't use a > > lat

Re: charsets in debian/control

2004-12-05 Thread Paul Hampson
On Mon, Dec 06, 2004 at 09:26:57AM +0900, Mike Hommey wrote: > On Mon, Dec 06, 2004 at 09:54:36AM +1100, Paul Hampson <[EMAIL PROTECTED]> > wrote: > > Isn't there a proposal around for > > Description#en: > > Description#ja: > And you'd advocate to write the English text in latin1 and the japan

Re: charsets in debian/control

2004-12-05 Thread Andrew Suffield
On Sun, Dec 05, 2004 at 09:32:00PM +0100, Jose Carlos Garcia Sogo wrote: > But the only field in UTF8 should be Maintainer, and that field should > have (IMHO) also a roman transliterate for the name, if you don't use a > latin charset (Greek, Arabic, Japanese, Chinese...) The transliterated fiel

RE: charsets in debian/control

2004-12-05 Thread Julian Mehnle
Thaddeus H. Black wrote: > I do not deny that Latin-1 represents all the languages I can read, and > that this fact may color my view. Nevertheless to me a source written > in Chinese is effectively non-free. It might as well be a compiled > binary blob. So Emacs is effectively non-free, becaus

Re: charsets in debian/control

2004-12-05 Thread Peter Samuelson
[Thaddeus H. Black] > Would Peter permit me a mild dissent? I prefer Latin-1. Dissents are fine. (: The reason to go with UTF-8 is for consistency. Tools that wish to render text onto the screen ought to be able to depend on knowing the encoding that text is in. See below for why I (and many

Re: charsets in debian/control

2004-12-05 Thread Josselin Mouette
On Monday 06 December 2004 at 09:26 +0900, Mike Hommey wrote: > On Mon, Dec 06, 2004 at 09:54:36AM +1100, Paul Hampson <[EMAIL PROTECTED]> > wrote: > > Isn't there a proposal around for > > Description#en: > > Description#ja: > > And you'd advocate to write the English text in latin1 and the j

Re: charsets in debian/control

2004-12-05 Thread Mike Hommey
On Mon, Dec 06, 2004 at 09:54:36AM +1100, Paul Hampson <[EMAIL PROTECTED]> wrote: > Isn't there a proposal around for > Description#en: > Description#ja: And you'd advocate to write the English text in latin1 and the japanese text in euc-jp ? Let's make it clear: 1 text file, 1 encoding. Mike

Re: charsets in debian/control

2004-12-05 Thread Paul Hampson
On Sun, Dec 05, 2004 at 04:42:24PM -0500, Daniel Burrows wrote: > On Sunday 05 December 2004 03:32 pm, Jose Carlos Garcia Sogo wrote: > > > Would Peter permit me a mild dissent?  I prefer Latin-1.  Reason: I can > > > recognize and distinguish Latin-1 characters, even when I do not always > > > und

Re: charsets in debian/control

2004-12-05 Thread Daniel Burrows
On Sunday 05 December 2004 03:32 pm, Jose Carlos Garcia Sogo wrote: > > Would Peter permit me a mild dissent?  I prefer Latin-1.  Reason: I can > > recognize and distinguish Latin-1 characters, even when I do not always > > understand the words they spell.  Recognizing and distinguishing the > > ch

Re: charsets in debian/control

2004-12-05 Thread Jose Carlos Garcia Sogo
On Sun, 05-12-2004 at 20:16 +, Thaddeus H. Black wrote: > Peter Samuelson writes, > > > We seem to be moving to a de facto standard of UTF-8 for non-ASCII > > characters in debian/control files. This is not specified in Policy > > [1], but for hopefully obvious reasons, consistency is a

Re: charsets in debian/control

2004-12-05 Thread Thaddeus H. Black
Peter Samuelson writes, > We seem to be moving to a de facto standard of UTF-8 for non-ASCII > characters in debian/control files. This is not specified in Policy > [1], but for hopefully obvious reasons, consistency is a Good Thing, > and UTF-8 seems to be the best solution for this sort of thin

Re: charsets in debian/control

2004-12-05 Thread Bernd Eckenfels
On Sun, Dec 05, 2004 at 06:40:52PM +0100, Goswin von Brederlow wrote: > On that note, how likely is it to hit a UTF-8 character encoding that > contains a '\n'? Any non UTF-8 aware parser would assume a new line > has started and get parse errors. That's no problem. The only problem you have with U

Re: charsets in debian/control

2004-12-05 Thread Goswin von Brederlow
Bart Schuller <[EMAIL PROTECTED]> writes: > On Sun, Dec 05, 2004 at 06:40:52PM +0100, Goswin von Brederlow wrote: >> On that note, how likely is it to hit a UTF-8 character encoding that >> contains a '\n'? Any non UTF-8 aware parser would assume a new line >> has started and get parse errors. > >

Re: charsets in debian/control

2004-12-05 Thread Bart Schuller
On Sun, Dec 05, 2004 at 06:40:52PM +0100, Goswin von Brederlow wrote: > On that note, how likely is it to hit a UTF-8 character encoding that > contains a '\n'? Any non UTF-8 aware parser would assume a new line > has started and get parse errors. 0% likely, guaranteed. UTF-8 is *designed* to be
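The "0% likely" claim can be checked mechanically. The following Python sketch is an illustration, not part of the thread, and its function names are assumptions. It encodes every non-ASCII code point (skipping the surrogate range, which cannot be encoded) and looks for any byte below 0x80, the range where '\n' and every other ASCII character live.

```python
def has_ascii_byte(code_point):
    """True if the UTF-8 encoding of code_point contains a byte below 0x80."""
    encoded = chr(code_point).encode("utf-8")
    return any(byte < 0x80 for byte in encoded)


def offending_code_points():
    """List non-ASCII code points whose encoding contains an ASCII byte."""
    return [
        cp
        for cp in range(0x80, 0x110000)
        # Surrogates (U+D800..U+DFFF) are not encodable in UTF-8.
        if not 0xD800 <= cp <= 0xDFFF and has_ascii_byte(cp)
    ]


offenders = offending_code_points()
```

Since every byte of a multibyte sequence has its high bit set, a parser that splits on the byte 0x0A only ever splits at a real newline.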

Re: charsets in debian/control

2004-12-05 Thread Goswin von Brederlow
Josselin Mouette <[EMAIL PROTECTED]> writes: > On Sunday 05 December 2004 at 11:43 +0100, Andreas Barth wrote: >> I think most of us agree that non-UTF-8-characters are not a good idea >> (please note the UTF-8-characters is a superset of ASCII). For some >> places (like package names), I thi

Re: charsets in debian/control

2004-12-05 Thread Denis Barbier
[Peter Samuelson] > I suggest that the affected source packages[3] be run through the > command 'iconv -f ORIGINAL_CHARSET -t utf-8' as soon as convenient. No, as you noticed this list is short and can be processed in a more elegant manner, e.g. sympa description uses a no-break space where a norm

Re: charsets in debian/control

2004-12-05 Thread Peter Samuelson
[Marco d'Itri] > > Would people support a mass bug at minor severity? > Make it normal. Given that Policy recommends debian/changelog to be utf-8, coupled with the observation (which I had not thought of) that various tools may require a maintainer's name in debian/control and debian/changelog to

Re: charsets in debian/control

2004-12-05 Thread Peter Samuelson
[Steinar H. Gunderson] > Transliterating is somewhat of a kludge (and I think in most cases > UTF-8 is a much better solution); OTOH I'd rapidly get confused in > the list of Japanese maintainers if their names weren't > transliterated. I think it's a valid choice for a maintainer who natively sp

Re: charsets in debian/control

2004-12-05 Thread Marco d'Itri
On Dec 05, "Steinar H. Gunderson" <[EMAIL PROTECTED]> wrote: > Transliterating is somewhat of a kludge (and I think in most cases UTF-8 is a > much better solution); OTOH I'd rapidly get confused in the list of Japanese > maintainers if their names weren't transliterated. This is a different issue

Re: charsets in debian/control

2004-12-05 Thread Marco d'Itri
On Dec 05, Peter Samuelson <[EMAIL PROTECTED]> wrote: > Would people support a mass bug at minor severity? Make it normal. -- ciao, Marco

Re: charsets in debian/control

2004-12-05 Thread Steinar H. Gunderson
On Sun, Dec 05, 2004 at 01:01:16PM +0100, Josselin Mouette wrote: > Many of us have names that can't be written using ASCII. Well, they usually can be transliterated, can't they? Transliterating is somewhat of a kludge (and I think in most cases UTF-8 is a much better solution); OTOH I'd rapidly

Re: charsets in debian/control

2004-12-05 Thread Andreas Barth
* Josselin Mouette ([EMAIL PROTECTED]) [041205 13:05]: > On Sunday 05 December 2004 at 11:43 +0100, Andreas Barth wrote: > > I think most of us agree that non-UTF-8-characters are not a good idea > > (please note the UTF-8-characters is a superset of ASCII). For some > > places (like package n

Re: charsets in debian/control

2004-12-05 Thread Josselin Mouette
On Sunday 05 December 2004 at 11:43 +0100, Andreas Barth wrote: > I think most of us agree that non-UTF-8-characters are not a good idea > (please note the UTF-8-characters is a superset of ASCII). For some > places (like package names), I think most of us even agree that only > ASCII-characte

Re: charsets in debian/control

2004-12-05 Thread Andreas Barth
* Petter Reinholdtsen ([EMAIL PROTECTED]) [041205 11:30]: > [Peter Samuelson] > > We seem to be moving to a de facto standard of UTF-8 for non-ASCII > > characters in debian/control files. This is not specified in Policy > > [1], but for hopefully obvious reasons, consistency is a Good Thing, > >

Re: charsets in debian/control

2004-12-05 Thread Petter Reinholdtsen
[Peter Samuelson] > We seem to be moving to a de facto standard of UTF-8 for non-ASCII > characters in debian/control files. This is not specified in Policy > [1], but for hopefully obvious reasons, consistency is a Good Thing, > and UTF-8 seems to be the best solution for this sort of thing. Som

Re: charsets in debian/control

2004-12-05 Thread Peter Samuelson
[Peter Samuelson] > I suggest that the affected source packages[3] be run through the > command 'iconv -f ORIGINAL_CHARSET -t utf-8' as soon as convenient. Ehhh, I see I have already ruined my credibility by pasting the wrong source package list. The real list is much shorter. Apologies, Peter
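For readers unfamiliar with the quoted `iconv -f ORIGINAL_CHARSET -t utf-8` step, a rough Python equivalent might look like the sketch below. The function name `recode_to_utf8` and the sample field are assumptions, and the "already UTF-8" check is an addition that iconv itself does not perform.

```python
def recode_to_utf8(raw, original_charset):
    """Re-encode raw bytes from original_charset to UTF-8.

    Hypothetical helper: bytes that already decode as UTF-8 are returned
    unchanged, so running it twice does not double-encode.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode(original_charset).encode("utf-8")


# A Latin-1 encoded field: 0xE9 is 'é' in iso-8859-1.
latin1_field = b"Maintainer: Jos\xe9"
converted = recode_to_utf8(latin1_field, "iso-8859-1")
```

The validity check is a heuristic: a legacy-encoded file can, in rare cases, also be valid UTF-8, so the original charset still has to be known per package, as the thread assumes.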

charsets in debian/control

2004-12-05 Thread Peter Samuelson
We seem to be moving to a de facto standard of UTF-8 for non-ASCII characters in debian/control files. This is not specified in Policy [1], but for hopefully obvious reasons, consistency is a Good Thing, and UTF-8 seems to be the best solution for this sort of thing. In my sid control files, I s