Thomas Lumley <[EMAIL PROTECTED]> writes:

> On Thu, 26 Oct 2006, Henrik Bengtsson wrote:
>
> > I'm observing the following on different platforms:
> >
> >> parse(text='"\\x7F"')
> > expression("\177")
> >> parse(text='"\\x80"')
> > Error: invalid multibyte string
>
> Yes. It's an invalid multibyte string. In UTF-8 a single byte is a valid
> character string only if it is below x80, so x7F is fine but x80 is not.
> In fact x80 is not the leading byte of any valid UTF-8 character.
>
> You have to work out what the Unicode code point is for whatever character
> you were expecting to be x80 and convert that to UTF-8.
>
> I'm surprised that one of your UTF-8 machines worked -- I don't think it
> should.

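The rule quoted above can be checked directly. What follows is only a minimal sketch, assuming a version of R that provides validUTF8(), and assuming (purely for illustration) that the character intended by x80 was the Euro sign, which is what byte x80 means in Windows-1252. The variable names are made up for the example.

```r
# Build the single-byte strings from raw bytes, so the example does not
# depend on how the parser treats "\x80" in the current locale.
b7f <- rawToChar(as.raw(0x7f))
b80 <- rawToChar(as.raw(0x80))

validUTF8(b7f)   # TRUE:  bytes below x80 are complete UTF-8 characters
validUTF8(b80)   # FALSE: x80 is a continuation byte, never a leading byte

# Assumption: the intended character was the Euro sign, U+20AC.
# Create it from its Unicode code point and look at its UTF-8 bytes.
euro <- intToUtf8(0x20AC)
validUTF8(euro)  # TRUE
charToRaw(euro)  # e2 82 ac
```

If the intended character was something else, only the code point passed to intToUtf8() changes; the single byte x80 on its own is never valid UTF-8.
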
Interestingly, we can parse, but not print or deparse:

> x<-parse(text='"\\x80"')
> x
Error: invalid multibyte string
> z <- deparse(x)
Error in deparse(x) : invalid multibyte string
> cat(x[[1]])
�>

(the last line has a funny little cedilla-like symbol in pos 1)

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                  FAX: (+45) 35327907

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel