Dear Milan,

thank you for your kind suggestion. Converting the characters using

> iconv(department, "ISO-8859-15", "UTF-8")

indeed improves the situation: all values (names of departments) are now displayed in the plot, although the special characters unfortunately appear as empty boxes.
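
A possible explanation for the empty boxes, sketched in R below. In ISO-8859-15 the byte 0x8F is a C1 control character rather than a letter, so converting from that encoding should yield an unprintable character. The same byte is "è" in Mac OS Roman, so that encoding may be worth trying. This is an assumption that is not confirmed anywhere in the thread, and the encoding name "MACINTOSH" depends on the iconv build (iconvlist() shows what is available).

```r
# Rebuild the imported string byte by byte: "Ard", 0x8F, "che".
# (Built from raw bytes so the example does not depend on how "\x8f" is parsed.)
dept_raw <- rawToChar(as.raw(c(0x41, 0x72, 0x64, 0x8f, 0x63, 0x68, 0x65)))

# Conversion suggested in the thread: 0x8F should map to U+008F, a control
# character that plotting devices typically cannot draw.
from_latin9 <- iconv(dept_raw, "ISO-8859-15", "UTF-8")

# Assumption: the .dta file was written in Mac OS Roman, where 0x8F is "è".
# The name "MACINTOSH" may not exist in every iconv build; iconv() returns NA
# if the conversion is not possible.
from_macroman <- iconv(dept_raw, "MACINTOSH", "UTF-8")

# Inspect the resulting bytes; "è" encoded in UTF-8 is c3 a8.
print(charToRaw(from_latin9))
print(charToRaw(from_macroman))
```

If the second conversion gives the bytes c3 a8 in place of 8f, applying it to the whole column (for example department <- iconv(department, "MACINTOSH", "UTF-8")) may be enough for plot() to draw the labels.
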
I have tried to see whether I could 'save' my Stata file in UTF-8 format, and although this proves to be a popular request, it does not seem to have been incorporated in Stata.

Best, and thank you for your help,

Richard

On 11 Dec 2012, at 12:11, Milan Bouchet-Valat <nalimi...@club.fr> wrote:

> On Tuesday 11 December 2012 at 01:10 +0100, Richard Zijdeman wrote:
>> Dear all,
>>
>> I have imported a dataset from Stata using the foreign package. The
>> original data contain French characters such as and .
>> After importing, string variables containing names of French
>> departments have changed. E.g. Ardèche became Ard\x8fche. I would like
>> to ask how I could plot these changed strings, since now the strings
>> with special characters fail to be printed in the plot (either using
>> plot() or ggplot2()).
>>
>> I have googled for solutions, but actually find it hard to determine
>> whether I should change my R setup or should read in the data in a
>> different way. Since I work on a Mac I changed my locale according to
>> the R for Mac OS X FAQ, chapter 9. Below is some info on my setup and
>> code and output on what works for me and what does not. Thank you in
>> advance for your comments.
> Accented characters should work fine on a machine using a UTF-8
> locale such as yours. I think the problem is that the imported data uses
> ISO8859-15 or UTF-16, not UTF-8.
>
> I have no idea whether .dta files specify an encoding or not, but I
> think you can convert them in R by calling
> iconv(department, "ISO-8859-15", "UTF-8")
> or
> iconv(department, "UTF-16", "UTF-8")
>
>> Best,
>>
>> Richard
>>
>> #--------------
>> rm(list=ls())
>> sessionInfo()
>> # R version 2.15.2 (2012-10-26)
>> # Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>> #
>> # locale:
>> # [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>
>> # creating variables
>> department <- c("Nord","Paris","Ard\x8fche")
> \x8f does not correspond to "è" AFAIK. In ISO8859-1 and -15 and UTF-16,
> it's \xE8 ("\uE8" in R).
>
> In UTF-8, it's C3 A8, "\303\250" in R.
>
>> department2 <- c("Nord", "Paris", "Ardèche")
>> n <- c(2,4,1)
>>
>> # creating dataframes
>> df <- data.frame(department,n)
>> df2 <- data.frame(department2,n)
>>
>> department
>> # [1] "Nord" "Paris" "Ard\x8fche"
>> department2
>> # [1] "Nord" "Paris" "Ardèche"
>>
>> plot(df) # fails to show the text "Ardèche"
>> plot(df2) # shows text "Ardèche"
>>
>> # EOF
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
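
The byte values Milan quotes for "è" can be checked directly in R. This is a minimal sketch using only base functions; the variable name e_grave is made up for illustration, and UTF-16 is written here as big-endian without a byte-order mark to keep the output short.

```r
# "è" is code point U+00E8; intToUtf8() returns it as a UTF-8 string.
e_grave <- intToUtf8(0xE8)

# UTF-8: two bytes, c3 a8 (written "\303\250" with octal escapes).
print(charToRaw(e_grave))

# ISO-8859-1 and ISO-8859-15: a single byte, e8.
print(charToRaw(iconv(e_grave, "UTF-8", "ISO-8859-1")))
print(charToRaw(iconv(e_grave, "UTF-8", "ISO-8859-15")))

# UTF-16 (big-endian): one 16-bit unit, bytes 00 e8.
# toRaw = TRUE returns a list of raw vectors, one per input string.
print(iconv(e_grave, "UTF-8", "UTF-16BE", toRaw = TRUE)[[1]])
```

None of these representations contains the byte 8f, which is consistent with the remark that \x8f is not "è" in the encodings discussed.
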