Re: [Rd] latin1,utf-8...encoding and data

2006-10-25 Thread Prof Brian Ripley
This is indeed unfortunate, but expecting Chinese speakers (20% of the world's population) to write in Latin-1 was also unfortunate. What I had (and still have) some hope of doing is being able to mark character strings as UTF-8, probably via a flag bit on the CHARSXP. Then output routines co

Re: [Rd] latin1,utf-8...encoding and data

2006-10-19 Thread Martin Maechler
> "Stéphane" == Stéphane Dray <[EMAIL PROTECTED]>
> on Thu, 19 Oct 2006 09:46:49 +0200 writes:
Stéphane> Thanks a lot for this clear answer. So there is no way to preserve our
Stéphane> French cultural exception (accented characters),
I agree that there are many French cult

Re: [Rd] latin1,utf-8...encoding and data

2006-10-19 Thread Stéphane Dray
Thanks a lot for this clear answer. So there is no way to preserve our French cultural exception (accented characters) if we want to be international... I had thought that including an encoding parameter in the data function (e.g. data(mydata,encoding="latin1")), like in the function 'file'

Re: [Rd] latin1,utf-8...encoding and data

2006-10-18 Thread Prof Brian Ripley
Only ASCII letters are portable: those accented characters do not even exist in many of the encodings used for R, e.g. Russian and Japanese on Windows machines. There is no way to associate an encoding with a character string in R. We considered it, but it would have had severe back-compatibi

[Rd] latin1,utf-8...encoding and data

2006-10-18 Thread Stéphane Dray
Hello, I have some questions concerning encoding and package distribution. We develop the ade4 package. Some data sets included in the package contain accented characters (e.g. é,è...). The data sets have been saved using latin1 encoding, but some of us use utf-8 and cannot see some dat