On Tue, 22 Jan 2008, Prof Brian Ripley wrote:

> On Wed, 23 Jan 2008, David Scott wrote:
>
>> I have encountered a problem with reading a .csv file on a linux box. I
>> can read the file on my windows machine (under XP) but on the linux box it
>> gives :
>>
>>> patients <- read.csv("../Patients.csv", header = FALSE,
>> + col.names = patientsNames)
>> Error in type.convert(data[[i]], as.is = as.is[i], dec = dec,
>> na.strings = character(0)) :
>> invalid multibyte string
>> Calls: read.csv -> read.table -> type.convert
>> Execution halted
>>
>> I am running R 2.6.1 on both machines. I tried on another linux box
>> running 2.5.1 and got the same problem
>>
>> I am guessing it is something to do with the character encoding. On the
>> linux box I have
>>
>> LANG=en_US.UTF-8
>
> So what encoding is the .csv file in? Consider the example at the end of
> ?file
>
> ## examples of use of encodings
> cat(x, file = file("foo", "w", encoding="UTF-8"))
> # read a 'Windows Unicode' file including names
> A <- read.table(file("students", encoding="UCS-2LE"))
>
> and adapt accordingly (encoding = "CP1252" is the most likely value if this
> works in English-language Windows).
>
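Adapted to the read.csv call quoted above, the suggestion can be sketched
roughly as follows. This is only a minimal sketch, assuming the file really is
in CP1252: the temporary file and the two column names are made up for
illustration and stand in for ../Patients.csv and patientsNames.

```r
# Build a small CSV whose plus/minus signs are stored as the single
# CP1252 byte 0xB1 (this byte on its own is not valid UTF-8).
tmp <- tempfile(fileext = ".csv")
writeBin(c(charToRaw("1,5.0 "), as.raw(0xB1), charToRaw(" 0.2\n"),
           charToRaw("2,6.1 "), as.raw(0xB1), charToRaw(" 0.3\n")),
         tmp)

# Hypothetical column names, standing in for the real patientsNames.
patientsNames <- c("id", "result")

# Declare the file's encoding on the connection, so that R re-encodes
# the input to the session's native encoding while reading it.
patients <- read.csv(file(tmp, encoding = "CP1252"), header = FALSE,
                     col.names = patientsNames, stringsAsFactors = FALSE)
```

In a UTF-8 locale such as en_US.UTF-8 the second column should then come back
with its plus/minus signs intact. Whether that holds in other locales depends
on whether the native encoding can represent the character.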
Thanks Brian for the super-quick, super-helpful reply. The encoding you
suggested worked.

I found a workaround myself too: I guessed that some plus/minus signs might
be the problem, replaced them, and could then read in the file. That is just a
kludge, so I am using the encoding specification instead.

I am a total dunce when it comes to encodings though. How do you find the
encoding of a file?

David
_________________________________________________________________
David Scott
Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019
Auckland 1142, NEW ZEALAND
Phone: +64 9 373 7599 ext 86830
Fax: +64 9 373 7000
Email: [EMAIL PROTECTED]

Graduate Officer, Department of Statistics
Director of Consulting, Department of Statistics