Thomas Lumley <[EMAIL PROTECTED]> writes:
> On Thu, 26 Oct 2006, Henrik Bengtsson wrote:
>
> > I'm observing the following on different platforms:
> >
> >> parse(text='"\\x7F"')
> > expression("\177")
> >> parse(text='"\\x80"')
> > Error: invalid multibyte string
>
> Yes. It's an invalid multibyte string. In UTF-8 a single byte is a valid
> character string only if it is below x80, so x7F is fine but x80 is not.
> In fact x80 is not the leading byte of any valid UTF-8 character.
>
> You have to work out what the Unicode code point is for whatever character
> you were expecting to be x80 and convert that to UTF-8.
>
> I'm surprised that one of your UTF-8 machines worked -- I don't think it
> should.

Interestingly, we can parse, but not print or deparse:

> x<-parse(text='"\\x80"')
> x
Error: invalid multibyte string
> z <- deparse(x)
Error in deparse(x) : invalid multibyte string
> cat(x[[1]])
�>

(the last line has a funny little cedilla-like symbol in pos 1)
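
A minimal sketch of the UTF-8 validity point, separate from the session
above. It assumes R >= 3.3.0, where validUTF8() is available, and it assumes
purely for illustration that the intended character is code point U+0080;
the variable names are made up for the example.

```r
# Build the strings from raw bytes so the parser is not involved.
lone  <- rawToChar(as.raw(0x80))   # a lone byte 0x80
ascii <- rawToChar(as.raw(0x7f))   # a lone byte 0x7F (ASCII DEL)

validUTF8(lone)    # FALSE: 0x80 is a continuation byte, never a leading byte
validUTF8(ascii)   # TRUE: bytes below 0x80 are valid single-byte characters

# Assumption: the character wanted is code point U+0080.
# intToUtf8() encodes a code point as UTF-8, which takes two bytes here.
u <- intToUtf8(0x80)
charToRaw(u)       # c2 80
validUTF8(u)       # TRUE
```

If the byte was really meant as a character from some other 8-bit encoding,
the code point to pass to intToUtf8() would be whatever that encoding maps
0x80 to, not necessarily U+0080.
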
--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                  FAX: (+45) 35327907
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel