On 17.07.2011 15:18, Nipesh Bajaj wrote:
I am really sorry if I did not understand your statement correctly :(

You said:
" To put a backslash in the replacement expression of sub or gsub
(when fixed=FALSE) use 4 backslashes"

I understand that this works if I want to replace something with 2
backslashes. But what if I want to replace it with just 1 backslash? I
have tried the following, but it didn't work (R asks for more input):

gsub("d","\\\",my.data$animals)

You said:
"replacement expression backslash-digit means to use the digit'th
parenthesized subpattern as the replacement"

Would you please elaborate on this phenomenon?  If I use "backslash-digit
= 6" then I don't see any difference in the end result:
gsub("d","\\\\\\",my.data$animals)
[1] "\\og" "wolf" "cat"

It would be really helpful if you could elaborate more on these issues.


Yes, because that translates (after R's processing) to "\\\" and ends up, after the actual replacement, as the string "\\\og".

If you interpret that, it means 1 backslash (coming from the first two), an (escaped) "o", which is the same as a regular "o", and finally the "g".

Uwe Ligges
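
A minimal sketch of the backslash counting discussed above, using a plain character vector `x` as a stand-in for `my.data$animals` (the names `x`, `one`, `two` and `six` are illustrative and not from the thread). It uses `nchar()` and `cat()` to show what is actually stored, since `print()` doubles every backslash on display. The six-backslash case reflects the output quoted earlier in this thread; how a trailing lone backslash is handled is best treated as an implementation detail rather than something to rely on.

```r
x <- c("dog", "wolf", "cat")

# Four typed backslashes: the parser stores two, and gsub() turns
# those two into a single literal backslash in the result.
one <- gsub("d", "\\\\", x)
print(nchar(one[1]))   # counts the stored characters: backslash, "o", "g"
cat(one[1], "\n")      # writes the string as stored, without print()'s escaping

# Eight typed backslashes: four stored, two literal backslashes in the result.
two <- gsub("d", "\\\\\\\\", x)
print(nchar(two[1]))

# Six typed backslashes: three stored. The first two become one literal
# backslash; the third is left dangling at the end of the replacement.
six <- gsub("d", "\\\\\\", x)
cat(six[1], "\n")
```

So the answer to "what if I want just 1 backslash" is the four-backslash form: the string printed as "\\og" contains only one backslash.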



Thanks,

On Sun, Jul 17, 2011 at 8:34 AM, William Dunlap<wdun...@tibco.com>  wrote:
To put a backslash in the replacement expression
of sub or gsub (when fixed=FALSE) use 4 backslashes.
The rationale is that the replacement expression
backslash-digit means to use the digit'th parenthesized
subpattern as the replacement and backslash-backslash means
to put in a literal backslash.  However, the R parser also uses
backslashes to signify things like unicode characters (that
backslash is not in the string stored by R, but is just a
signal to the parser) and it requires a doubled backslash
to enter a backslash.  2*2 is 4 backslashes.  E.g.,

  >  gsub("([[:digit:]]+)([[:alpha:]]+)", "alpha=<<\\2>>\\\\numeric=<<\\1>>", c("12P", "34Cat"))
  [1] "alpha=<<P>>\\numeric=<<12>>"   "alpha=<<Cat>>\\numeric=<<34>>"
  >  cat(.Last.value, sep="\n") # see what is really in the strings
  alpha=<<P>>\numeric=<<12>>
  alpha=<<Cat>>\numeric=<<34>>
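
A smaller sketch of the same two rules, separated out, may make the 2*2 counting easier to follow. The variable names (`x`, `swapped`, `joined`, `literal`) are illustrative only. The last call uses `fixed = TRUE`, where the replacement is taken literally, so two typed backslashes are enough in that mode.

```r
x <- c("12P", "34Cat")
pat <- "([[:digit:]]+)([[:alpha:]]+)"

# "\\2\\1" is stored as backslash-2 backslash-1: two backreferences,
# so the letters and the digits swap places.
swapped <- gsub(pat, "\\2\\1", x)
print(swapped)

# "\\1\\\\\\2" is stored as backslash-1, two backslashes, backslash-2:
# group 1, one literal backslash, group 2.
joined <- gsub(pat, "\\1\\\\\\2", x)
cat(joined, sep = "\n")

# With fixed = TRUE there are no backreferences, so the single stored
# backslash in "\\" is inserted as-is.
literal <- gsub("P", "\\", "12P", fixed = TRUE)
cat(literal, "\n")
```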

I don't know about your unicode/encoding problem.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Sverre Stausland
Sent: Saturday, July 16, 2011 7:20 PM
To: r-help@r-project.org
Subject: [R] gsub() with unicode and escape character

Dear helpers,

I'm trying to replace a character with a unicode code inside a data
frame using gsub(), but unsuccessfully.

data.frame(animals=c("dog","wolf","cat"))->my.data
gsub("o","\u0254",my.data$animals)->my.data$animals
my.data$animals
[1] "dɔg"  "wɔlf" "cat"

It's not that a data frame cannot have unicode codes, cf. e.g.

data.frame(animals=c("d\u0254g","w\u0254lf","cat"))->my.data.2
my.data.2$animals
[1] dɔg  wɔlf cat
Levels: cat d<U+0254>g w<U+0254>lf

I've done the best I can based on what ?gsub and ?enc2utf8 tell me,
but I haven't found a solution.
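
One way to check whether the replacement itself worked, independently of how the console renders the result, is to look at the code points rather than the printed string. This is only a sketch under the assumption that the `<U+0254>` form is a display matter in the session's native encoding; the thread does not confirm that. `stringsAsFactors = FALSE` is added here so that the column stays a character vector.

```r
my.data <- data.frame(animals = c("dog", "wolf", "cat"),
                      stringsAsFactors = FALSE)
my.data$animals <- gsub("o", "\u0254", my.data$animals)

# Code points of the first element; U+0254 is 596 in decimal.
cp <- utf8ToInt(my.data$animals[1])
print(cp)

# Declared encoding of each element, as recorded by R.
print(Encoding(my.data$animals))
```

If the code points include 596, the string holds the intended character and only its display differs.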

Unrelated to that problem, but related to gsub(): I can't find
a way for gsub() to interpret the backslash as a character. In regular
expressions, \\ should represent "the character \", but gsub() doesn't treat it that way:

data.frame(animals=c("dog","wolf","cat"))->my.data
gsub("d","\\",my.data$animals)
[1] "og"   "wolf" "cat"

Thank you
Sverre

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.