I am really sorry if I have misunderstood your statement :( You said: "To put a backslash in the replacement expression of sub or gsub (when fixed=FALSE) use 4 backslashes."

I understand that this works if I want to replace something with 2 backslashes. But what if I want to replace it with just 1 backslash? I tried the following, but it did not work (R keeps asking for more input):

gsub("d","\\\",my.data$animals)

You also said: "replacement expression backslash-digit means to use the digit'th parenthesized subpattern as the replacement". Would you please elaborate on this? If I use 6 backslashes (my reading of "backslash-digit = 6"), I do not see any difference in the end result:

> gsub("d","\\\\\\",my.data$animals)
[1] "\\og" "wolf" "cat"

It would be really helpful if you could elaborate on these issues.

Thanks,

On Sun, Jul 17, 2011 at 8:34 AM, William Dunlap <wdun...@tibco.com> wrote:
> To put a backslash in the replacement expression
> of sub or gsub (when fixed=FALSE) use 4 backslashes.
> The rationale is that the replacement expression
> backslash-digit means to use the digit'th parenthesized
> subpattern as the replacement and backslash-backslash means
> to put in a literal backslash. However, R parser also uses
> backslashes to signify things like unicode characters (that
> backslash is not in the string stored by R, but is just a
> signal to the parser) and it requires a doubled backslash
> to enter a backslash. 2*2 is 4 backslashes. E.g.,
>
>   > gsub("([[:digit:]]+)([[:alpha:]]+)", "alpha=<<\\2>>\\\\numeric=<<\\1>>",
>   c("12P", "34Cat"))
>   [1] "alpha=<<P>>\\numeric=<<12>>" "alpha=<<Cat>>\\numeric=<<34>>"
>   > cat(.Last.value, sep="\n") # see what is really in the strings
>   alpha=<<P>>\numeric=<<12>>
>   alpha=<<Cat>>\numeric=<<34>>
>
> I don't know about your unicode/encoding problem.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>> -----Original Message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
>> Behalf Of Sverre Stausland
>> Sent: Saturday, July 16, 2011 7:20 PM
>> To: r-help@r-project.org
>> Subject: [R] gsub() with unicode and escape character
>>
>> Dear helpers,
>>
>> I'm trying to replace a character with a unicode code inside a data
>> frame using gsub(), but unsuccessfully.
>>
>> > data.frame(animals=c("dog","wolf","cat"))->my.data
>> > gsub("o","\u0254",my.data$animals)->my.data$animals
>> > my.data$animals
>> [1] "dÉ”g" "wÉ”lf" "cat"
>>
>> It's not that a data frame cannot have unicode codes, cf. e.g.
>>
>> > data.frame(animals=c("d\u0254g","w\u0254lf","cat"))->my.data.2
>> > my.data.2$animals
>> [1] dɔg wɔlf cat
>> Levels: cat d<U+0254>g w<U+0254>lf
>>
>> I've done the best I can based on what ?gsub and ?enc2utf8 tell me,
>> but I haven't found a solution.
>>
>> Unrelated to that problem, but related to gsub() is that I can't find
>> a way for gsub() to interpret the backslash as a character. In regular
>> expression, \\ should represent "the character \", but gsub() doesn't:
>>
>> > data.frame(animals=c("dog","wolf","cat"))->my.data
>> > gsub("d","\\",my.data$animals)
>> [1] "og" "wolf" "cat"
>>
>> Thank you
>> Sverre
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
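For what it is worth, here is a small sketch of the two points discussed in this thread, offered as an illustration and not as an authoritative answer. The variable names (animals, one_slash, one_slash_fixed, swapped) are made up for this example. It assumes the point of confusion is that print() shows a single backslash in escaped form (doubled), whereas cat() and nchar() reveal what the string actually contains; under that reading, four backslashes in the source already produce one backslash in the result.

```r
animals <- c("dog", "wolf", "cat")

# Four backslashes in the source become two in the stored string,
# which gsub() reads as one literal backslash in the replacement.
one_slash <- gsub("d", "\\\\", animals)
print(one_slash)            # print() escapes the backslash, so it is displayed doubled
cat(one_slash, sep = "\n")  # cat() writes the characters actually stored
nchar(one_slash[1])         # counts backslash, "o", "g"

# With fixed = TRUE the replacement is taken literally, so "\\"
# (a single backslash once parsed) is enough.
one_slash_fixed <- gsub("d", "\\", animals, fixed = TRUE)
identical(one_slash, one_slash_fixed)

# "Backslash-digit" refers to a parenthesized group in the pattern:
# "\\1" is the first group, "\\2" the second.
swapped <- gsub("(d)(o)", "\\2\\1", animals)
print(swapped)
```

Note that the original attempt with three backslashes cannot work in any case: the parser reads the last backslash together with the closing quote as an escaped quote, so the string is never terminated and R waits for more input.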