Hi list! Short version: How do I convert a whole data.frame from latin1 encoding to utf8?
I receive SPSS files in latin1 encoding. My OS is GNU/Linux with the locale sv_SE.utf8, and I normally interface with R through Emacs/ESS. I have used the following hack to convert a data.frame from latin1 to utf8:

    > Sys.setlocale(category = "LC_ALL", locale = "sv_SE.iso88591")
    > foo <- read.spss("foo.sav", to.data.frame=TRUE)
    > write.table(foo, "foo.data")
    $ recode lat1..utf8 foo.data
    > Sys.setlocale(category = "LC_ALL", locale = "sv_SE.utf8")
    > foo <- read.table("foo.data")

I have now found two problems with this approach:

a) variable.labels is dropped
b) the order of unordered factors is changed

I had just worked out a hack for a) when I realised b). b) is a problem when the factors really are ordered but are not recognized as such by read.spss (and/or not defined as such in SPSS; since SPSS respects the numeric values of the factors anyway, users don't need to define them as ordered).

Rather than hack around b) too, I wonder if anyone on the list knows how to convert a whole data.frame from latin1 encoding to utf8?

TIA

-- 
Hans Ekbrand (http://sociologi.cjb.net) <h...@sociologi.cjb.net>
A. Because it breaks the logical sequence of discussion
Q. Why is top posting bad?
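
One direction that might avoid the round trip through a text file is to re-encode the strings in place with iconv(). The following is only a minimal sketch, not a tested solution for real SPSS data. It assumes the data.frame was read in a utf8 session without any re-encoding, so that its strings still hold raw latin1 bytes, and that variable.labels is stored as a character vector attribute on the data.frame. The name latin1_to_utf8 is a hypothetical helper, not part of any package.

```r
# Hypothetical helper (not from any package): re-encode the character
# columns, factor levels, column names and the "variable.labels"
# attribute of a data.frame from latin1 to UTF-8.
# Assumption: the strings still contain raw latin1 bytes.
latin1_to_utf8 <- function(df) {
  conv <- function(x) iconv(x, from = "latin1", to = "UTF-8")

  for (i in seq_along(df)) {
    col <- df[[i]]
    if (is.factor(col)) {
      # Only the level labels are rewritten; the integer codes, and
      # therefore the order of the levels, stay as they were.
      levels(col) <- conv(levels(col))
      df[[i]] <- col
    } else if (is.character(col)) {
      df[[i]] <- conv(col)
    }
  }

  names(df) <- conv(names(df))

  vl <- attr(df, "variable.labels")
  if (!is.null(vl)) {
    vl[] <- conv(vl)  # vl[] keeps the names of the label vector
    if (!is.null(names(vl))) names(vl) <- conv(names(vl))
    attr(df, "variable.labels") <- vl
  }

  df
}

# Usage, assuming foo was read with read.spss() in the utf8 locale:
# foo <- latin1_to_utf8(foo)
```

Because only the level labels are touched, problem b) should not arise with this approach, and the variable.labels attribute is carried along rather than dropped as in a).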
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.