Re: [Rd] gsub, utf-8 replacements and the C-locale

2011-11-23 Thread Simon Urbanek
On Nov 23, 2011, at 6:48 PM, Hadley Wickham wrote:
> Hi all,
>
> I'd like to discuss an infelicity/possible bug with gsub. Take the
> following function:
>
> f <- function(x) {
>   gsub("\u{A0}", " ", gsub(" ", "\u{A0}", x))
> }
>
> As you might expect, in utf-8 locales it is idempotent:
>
> S

[Rd] gsub, utf-8 replacements and the C-locale

2011-11-23 Thread Hadley Wickham
Hi all,

I'd like to discuss an infelicity/possible bug with gsub. Take the following function:

  f <- function(x) {
    gsub("\u{A0}", " ", gsub(" ", "\u{A0}", x))
  }

As you might expect, in utf-8 locales it is idempotent:

  Sys.setlocale("LC_ALL", "UTF-8")
  f("x y")
  # [1] "x y"

But in the C locale it