On 25.01.2012 02:16, R. Michael Weylandt wrote:
I think you are getting stuck on the same regexp problem as before
(i.e., once again the dollar sign is being interpreted as the
beginning

  You meant "end".
  Uwe

of the line rather than an actual dollar sign).
If I understand your question, might I suggest something much easier?
x = data.frame(a = c("$1034.23","1,230"), b = c(4,5))
sapply(x, function(x) as.numeric(gsub("[\\$,]","",x)))
That is, go through each column of the data frame, replace anything
that's either a literal dollar sign or a comma with an empty string (i.e.,
remove it), and then convert the result to numeric. If it's already
numeric, this will simply return it unaltered so I think it's safe to
apply to each column.
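A minimal sketch of restricting that cleanup to the columns that actually contain a dollar sign, so that genuinely character columns are left alone. The sample values are hypothetical, modelled on the data quoted further down the thread, and has_dollar is a name introduced here for illustration. The all-NA result reported below most likely comes from passing a whole data frame to gsub(): it is coerced with as.character(), which turns each column into one long deparsed string that as.numeric() cannot parse. Going column by column with lapply() avoids that.

```r
# Hypothetical sample data modelled on the columns quoted in this thread.
cand2 <- data.frame(
  can_sta     = c("AL", "AR"),
  can_zip     = c("36106", "72404"),
  ind_ite_con = c("$251,895.80", "$186,918.00"),
  ind_uni_con = c("$22,874.43", "$25,391.00"),
  stringsAsFactors = FALSE
)

# TRUE for columns in which at least one value contains a literal dollar sign.
has_dollar <- sapply(cand2, function(col) any(grepl("$", col, fixed = TRUE)))

# Clean each flagged column separately; lapply() returns one vector per
# column, so the assignment keeps the data frame's shape.
cand2[has_dollar] <- lapply(
  cand2[has_dollar],
  function(col) as.numeric(gsub("[$,]", "", col))
)

str(cand2)
```

Only the flagged columns are touched, so can_sta and can_zip stay character while the two amount columns become numeric.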
M
On Tue, Jan 24, 2012 at 11:07 AM, Dan Abner<dan.abne...@gmail.com> wrote:
Hi everyone,
I am using Michael's approach (grepl()) to identify which columns
contain $ signs. I was hoping to incorporate this into a line of
code that would automatically 1) find which columns contain $ signs,
2) strip the $ and commas, and 3) convert the result to a numeric
vector.
I have the following:
col.id<-function(x) any(grepl("\\$",x))
cand2[which(sapply(cand2,col.id))]<-
as.numeric(gsub("[$,]","",cand2[which(sapply(cand2,col.id))]))
However, I am doing something wrong: while the code correctly
identifies the columns containing $ signs, it also returns ALL NA for
those columns.
See my initial message for this thread for example data.
Any assistance is appreciated.
Thanks!
Dan
On Tue, Jan 24, 2012 at 9:04 AM, R. Michael Weylandt
<michael.weyla...@gmail.com> wrote:
Either
any(grepl("$", x, fixed = TRUE)) # You probably want grepl not grep
or
any(grepl("\\$", x))
?regex # $ has a special meaning in a regular expression (end of line)
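A short illustration of the difference, using made-up values: an unescaped "$" is an anchor that matches the (empty) end of every string, which is why the original test came back TRUE for every column.

```r
x <- c("AL", "36106", "$251,895.80")  # hypothetical values

grepl("$", x)                # TRUE TRUE TRUE: "$" anchors the end of each string
grepl("\\$", x)              # FALSE FALSE TRUE: escaped, a literal dollar sign
grepl("$", x, fixed = TRUE)  # FALSE FALSE TRUE: pattern taken as a plain string
grepl("[$]", x)              # FALSE FALSE TRUE: "$" is literal inside [ ]
```

With plain grep() the unescaped pattern returns the indices of all elements, and any() then coerces those non-zero integers to TRUE.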
Michael
PS -- Stop with HTML postings (seriously, it actually does mess up
what the rest of us see and I think it causes trouble for the archives
as well)
On Tue, Jan 24, 2012 at 8:49 AM, Dan Abner<dan.abne...@gmail.com> wrote:
Hello everyone,
I am writing my own function to return the column index of all variables
(these are currently character vectors) in a data frame that contain a
dollar sign ($). A small piece of the data looks like this:
can_sta  can_zip   ind_ite_con  ind_uni_con
AL       36106     $251,895.80  $22,874.43
AL       35802     $141,373.60  $7,100.00
AL       35201     $273,208.50  $18,193.66
AR       72404     $186,918.00  $25,391.00
AR       72217     $451,127.00  $27,255.23
AR       7.28E+08  $58,336.22   $5,293.82
So far I have:
col.id<-function(x) any(grep("$",x))
sapply(cand2,col.id)
However, this returns TRUE for all columns (even those that do not contain
the $).
Any assistance is appreciated.
Thank you,
Dan
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.