Re: [Rd] String encoding problem

peter dalgaard Thu, 07 Jul 2016 09:52:39 -0700

> On 07 Jul 2016, at 18:15 , Hadley Wickham <[email protected]> wrote:
> 
> Right - I'm aware of that.  But to me, it doesn't seem correct to
> print a string that is not a valid R string. Why is an unknown
> encoding printed like UTF-8?
>


It isn't -- no valid UTF-8 string would have that stray \xbf. I may be flogging 
a dead horse, but it seems to me that there are three alternatives:

- refuse the input (x <- "\xc9\x82\xbf" gives "sorry, not a UTF-8 string" or so)
- refuse to print it (print(x) gives "cannot print non-UTF-8 string")
- what happens now

and a fourth one might be to actually allow mixing of \u0007 and \x07 and \007, 
but I suspect that there are demons down the line, which is why it is not 
happening now. (Does it ring a bell with anyone?)
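
For concreteness, here is a rough sketch of what the first two alternatives might look like at user level. It assumes R >= 3.3.0 for validUTF8(); the helpers require_utf8() and print_utf8_only() are made-up names for illustration, not anything in base R, and the bytes are built with rawToChar() so the example does not depend on how the parser treats the literal in a given locale.

```r
# The bytes from the example: a valid two-byte sequence (c9 82)
# followed by a continuation byte (bf) that has no lead byte.
x <- rawToChar(as.raw(c(0xc9, 0x82, 0xbf)))
ok <- rawToChar(as.raw(c(0xc9, 0x82)))

# Alternative 1 (hypothetical helper): refuse the input outright.
require_utf8 <- function(s) {
  if (!all(validUTF8(s))) {
    stop("sorry, not a UTF-8 string")
  }
  s
}

# Alternative 2 (hypothetical helper): accept the string, but refuse to print it.
print_utf8_only <- function(s) {
  if (!all(validUTF8(s))) {
    stop("cannot print non-UTF-8 string")
  }
  print(s)
}

# Byte-level validity checks; these do not depend on the declared encoding.
x_is_valid <- validUTF8(x)    # FALSE
ok_is_valid <- validUTF8(ok)  # TRUE

# The "fourth alternative": ask the parser about a literal that mixes \u and
# \x escapes. Wrapped in tryCatch() so the script runs whatever the parser does.
mixed <- tryCatch(
  parse(text = "\"\\u0007\\x07\""),
  error = function(e) conditionMessage(e)
)
```

Calling require_utf8(x) or print_utf8_only(x) signals an error, while both pass `ok` through unchanged. What "happens now" is simply print(x), whose output depends on the locale.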

-pd


-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: [email protected]  Priv: [email protected]

______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
