Thank you, Winston, for the solution! The only workaround I have come up with is to
set options(encoding = "UTF-8"), which is generally undesirable.
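
For anyone who does use that workaround, here is a minimal sketch of limiting it to a single call rather than the whole session. The helper name with_utf8_encoding is made up for illustration; it relies only on options() and on.exit() from base R, and it makes no claim about whether the option avoids the error in every case.

```r
# Hypothetical helper: evaluate expr with options(encoding = "UTF-8"),
# then restore whatever value the option had before.
with_utf8_encoding <- function(expr) {
  old <- options(encoding = "UTF-8")  # options() returns the previous value
  on.exit(options(old), add = TRUE)   # restore on exit, even after an error
  force(expr)                         # expr is evaluated lazily, i.e. here
}

# Example: the option is changed only while the expression runs.
with_utf8_encoding(getOption("encoding"))
```
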
Is there any chance this patch will be included in a future R
version? I have been running into this problem from time to time and the
latest R
After a bit more investigation, I think I've found the cause of the bug,
and I have a patch.
This bug happens with grep() when all of the following hold:
* Running on Windows.
* The search uses fixed=TRUE.
* The search pattern is a single byte.
* The current locale has a multibyte encoding.
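
The four conditions above can be sketched as a small check. This is only an illustration: the function name may_hit_grep_bug and its arguments are made up here, and the platform and locale values default to the current session but can be passed in by hand. It uses .Platform$OS.type, l10n_info()$MBCS and nchar(type = "bytes") from base R.

```r
# Hypothetical helper: TRUE when a grep() call meets all four conditions
# listed above. is_windows and mbcs default to the running session.
may_hit_grep_bug <- function(pattern, fixed,
                             is_windows = .Platform$OS.type == "windows",
                             mbcs = isTRUE(l10n_info()$MBCS)) {
  # A single-byte pattern, e.g. "\n" or "a"
  single_byte <- nchar(pattern, type = "bytes") == 1L
  isTRUE(is_windows && fixed && single_byte && mbcs)
}

# Example: check a call against the current session's platform and locale.
may_hit_grep_bug("\n", fixed = TRUE)
```
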
===
On Windows, grep(fixed=TRUE) throws errors with some UTF-8 strings.
Here's an example (must be run on Windows to reproduce the error):
Sys.setlocale("LC_CTYPE", "chinese")
y <- rawToChar(as.raw(c(0xe6, 0xb8, 0x97)))
Encoding(y) <- "UTF-8"
y
# [1] "渗"
grep("\n", y, fixed = TRUE)
# Error in grep("\n