On .05.2015, at 18:43, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:

On 25/05/2015 11:37 AM, Ista Zahn wrote:
AFAIK this is the way it works on Windows. It has been discussed in several
places, e.g.
http://stackoverflow.com/questions/17715956/why-do-some-unicode-characters-display-in-matrices-but-not-data-frames-in-r
,
http://stackoverflow.com/questions/17715956/why-do-some-unicode-characters-display-in-matrices-but-not-data-frames-in-r
(both of these came up when I googled the subject line of your email).

Yes, but it is a bug, just a hard one to fix. It needs someone to dedicate a serious amount of time to deal with it.

Since most of the people who tend to do that generally use systems in UTF-8 locales where this isn't a problem, or don't use Windows, it is languishing.

Duncan Murdoch


I understand that these problems are not easy to fix but ...

I think that "most of the people who tend to do that generally use systems in UTF-8 locales" is a biased perception. Developers may mostly use Mac or Linux, but for other users Windows still is, and probably will remain, the most commonly used OS. For most of them, switching to something else is a major hurdle.

What I often witness is that those "non-existent" Windows users try to muddle through with numerous calls to Encoding(), iconv() and the like, while never being sure whether the strange behavior is due to their own lack of understanding, to Windows specifics, or to R. In the end they either succeed with their muddling or give up, but they do not change the system.
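For readers who have not met these functions, here is a minimal sketch of the kind of calls I mean. It uses only base R; the variable names are made up for illustration, and enc2utf8() is my own addition to the list, not something mentioned earlier in this thread.

```r
# Illustration only: base R functions, hypothetical variable names.
x <- "A \u222a B \u2229 C"   # \u escapes give a string marked as UTF-8
Encoding(x)                   # query the declared encoding of the string

# iconv() converts the underlying bytes from one encoding to another.
# Bytes with no equivalent in the target encoding are replaced by the
# 'sub' string instead of turning the whole result into NA.
ascii <- iconv(x, from = "UTF-8", to = "ASCII", sub = "?")
Encoding(ascii)               # pure ASCII strings carry no encoding mark

# enc2utf8() converts to UTF-8; for an input that is already marked
# as UTF-8 it leaves the string unchanged.
y <- enc2utf8(x)
identical(x, y)
```

None of this changes how print() renders a data frame on Windows; it only shows why it is hard to tell from the outside whether the string, the locale, or R is at fault.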

So whoever attempts this Herculean task will be praised by thousands ;-)

Best, Peter



Best,
Ista
On May 25, 2015 9:39 AM, "Richard Cotton" <richiero...@gmail.com> wrote:

> Here's a data frame with some Unicode symbols (set intersection and union).
>
> d <- data.frame(x = "A \u222a B \u2229 C")
>
> Printing this data frame under R 3.2.0 patched (r68378) and Windows 7, I
> see
>
> d
> ##                  x
> ## 1 A <U+222A> B n C
>
> Printing the column itself works fine.
>
> d$x
> ## [1] A ∪ B ∩ C
> ## Levels: A ∪ B ∩ C
>
> The encoding is correctly UTF-8.
>
> Encoding(as.character(d$x))
> ## [1] "UTF-8"
>
> Under Linux both forms of printing are fine for me.
>
> I'm not quite sure whether I've missed a setting or if this is a bug, so
>
> Am I doing something silly?
> Can anyone else reproduce this?
>
> --
> Regards,
> Richie
>
> Learning R
> 4dpiecharts.com
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
