This can be further simplified by combining the two sub calls into a single call: gsub('[$,]','',as.character(x)).
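For illustration, here is a minimal sketch of the combined call applied to Kevin's vector from the quoted message below. The names first_only and cost are just for this example, and the sub line is included only to show what the g in gsub changes.

```r
# Kevin's example values, including one with two thousands separators.
x <- c("$135,359.00", "$135359.00", "$1,135,359.00")

# sub() removes only the first character matched by the class [$,],
# so the commas are still there after the leading "$" is dropped.
first_only <- sub("[$,]", "", as.character(x))

# gsub() removes every "$" and every ",". Inside a bracket expression
# "$" is a literal character, so no escaping is needed.
cost <- as.numeric(gsub("[$,]", "", as.character(x)))

print(first_only)
print(cost)
```

The as.character call is a no-op here but matters when x is a factor, as it would be after read.csv on older R versions.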
Note that this gsub call will also convert "$123$35,24,,$1$$2,,3.4" into a number, when you may have wanted input like that to give a warning and/or an NA value. The g in gsub stands for global (meaning replace every '$' and ',', not just the first one) rather than greedy (which has a different meaning in regular expressions).

This discussion brings up a related issue that I have thought about for a while. The help for read.table, in the section on colClasses, says that you can specify other conversions from character as long as there is a method for as corresponding to what you put in. This suggests to me the approach of writing a conversion function called something like "as.dollar", then setting colClasses=c('numeric','dollar','dollar','factor') or something like that, and having the middle 2 columns run through the function. However, my first quick attempt failed (the doc says the method needs to be in the methods package, and my quick attempt with setMethod created a local copy). There is also the possible problem that this would create a column with class dollar when I want a simple numeric.

So this brings up 2 questions:

1. Has anyone found a way to create a method for as in the methods package such that my idea above would work (preferably without much more work than the post-processing already suggested)?

2. If the answer to 1 above is no, are others interested in this type of functionality, and should we move the discussion to r-devel as a feature request?

Even nicer would be a simple way to go from a single character vector to multiple columns in the data frame. I remember working with a file once where the first 3 columns were comma separated (no spaces), but everything after that was whitespace separated. I read it in as whitespace separated, then had to post-process the first column into 3. But getting all the semantics of one-to-multiple right could be tricky.
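One way the colClasses idea is commonly sketched is with setClass plus setAs rather than setMethod. What follows is only a sketch under assumptions: the class name "dollar", the function as_dollar, the validation pattern, and the sample data are all made up for this example, and the pattern reflects my guess at what a well-formed dollar amount looks like (optional leading minus, optional "$", digits with or without properly grouped commas, optional decimals). As far as I understand read.table, the column ends up holding whatever the conversion function returns, so it should come out as a plain numeric.

```r
library(methods)

# Hypothetical strict converter: malformed strings become NA with a
# warning instead of being silently squeezed into a number.
as_dollar <- function(from) {
  from <- as.character(from)
  # Assumed definition of "well formed", e.g. "$1,135,359.00" or "-12.50".
  pattern <- "^-?\\$?([0-9]{1,3}(,[0-9]{3})+|[0-9]+)(\\.[0-9]+)?$"
  ok <- grepl(pattern, from)
  out <- rep(NA_real_, length(from))
  out[ok] <- as.numeric(gsub("[$,]", "", from[ok]))
  bad <- !ok & !is.na(from)
  if (any(bad)) {
    warning("NAs introduced for malformed dollar values: ",
            paste(unique(from[bad]), collapse = " "))
  }
  out
}

# Register a virtual class and a coercion from character to it, so that
# colClasses = "dollar" has an as() method to call.
setClass("dollar")
setAs("character", "dollar", function(from) as_dollar(from))

# Made-up data; values containing commas must be quoted in the CSV.
csv_text <- 'id,cost,price,grp
1,"$135,359.00",$12.50,a
2,$135359.00,"$1,000",b
3,"$1,135,359.00",$0.99,a'

df <- read.csv(text = csv_text,
               colClasses = c("numeric", "dollar", "dollar", "factor"))
str(df)
```

The design choice here is to validate before stripping characters, which addresses the concern about strings like "$123$35,24,,$1$$2,,3.4"; if that strictness is not wanted, the setAs function can simply wrap the one-line gsub conversion.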
The mixed-separator case could also have been easier if the sep argument to read.table could be a regular expression, but that would probably slow things down for the simple cases.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius
> Sent: Thursday, May 06, 2010 4:47 AM
> To: Wang, Kevin (SYD)
> Cc: r-help@r-project.org; Phil Spector
> Subject: Re: [R] Converting dollar value (factors) to numeric
>
> On May 5, 2010, at 11:31 PM, Wang, Kevin (SYD) wrote:
>
> > Hi Phil and all those who replied,
> >
> > Thanks heaps! Yes it worked to a certain extent. However, if I have the
> > following case:
> >> x <- c("$135,359.00", "$135359.00", "$1,135,359.00")
> >> y <- sub('\\$','',as.character(x))
> >> cost <- as.numeric(sub('\\,','',as.character(y)))
>
> Try gsub, it seems to be more "greedy":
>
> cost <- as.numeric(gsub('\\,','',as.character(y)))
>
> -- 
> David
>
> > Warning message:
> > NAs introduced by coercion
> >> cost
> > [1] 135359 135359 NA
> >
> > Then the third value becomes NA -- though I suspect it probably has
> > more to do with regular expressions (which I'm not sure how to fix)
> > than R?
> >
> > Thanks again for the help!
> >
> > Cheers
> > Kev
> >
> > -----Original Message-----
> > From: Phil Spector [mailto:spec...@stat.berkeley.edu]
> > Sent: Wednesday, 5 May 2010 6:14 PM
> > To: Wang, Kevin (SYD)
> > Cc: r-help@r-project.org
> > Subject: Re: [R] Converting dollar value (factors) to numeric
> >
> > Kev-
> > The most reliable way to do the conversion is as follows:
> >
> >> x = factor(c('$112.11','$119.15','$121.32'))
> >> as.numeric(sub('\\$','',as.character(x)))
> > [1] 112.11 119.15 121.32
> >
> > This way negative quantities and numbers without dollar signs are
> > handled correctly. There's certainly no need to create a new input
> > file.
> >
> > It may be easier to understand as
> >
> > as.numeric(sub('$','',as.character(x),fixed=TRUE))
> >
> > which gives the same result.
> > - Phil Spector
> >   Statistical Computing Facility
> >   Department of Statistics
> >   UC Berkeley
> >   spec...@stat.berkeley.edu
> >
> >
> > On Wed, 5 May 2010, Wang, Kevin (SYD) wrote:
> >
> >> Hi,
> >>
> >> I'm trying to read in a bunch of CSV files into R where many columns
> >> are coded like $111.11. When reading them in they are treated as factors.
> >>
> >> I'm wondering if there is an easy way to convert them into numeric in
> >> R (as I don't want to modify the source data)? I've done some
> >> searches and can't seem to find an easy way to do this.
> >>
> >> I apologise if this is a trivial question, I haven't been using R for
> >> a while.
> >>
> >> Many thanks in advance!
> >>
> >> Cheers
> >>
> >> Kev
> >>
> >> Kevin Wang
> >>> Senior Advisor, Health and Human Services Practice Government
> >>> Advisory Services
> >>>
> >>> KPMG
> >>> 10 Shelley Street
> >>> Sydney NSW 2000 Australia
> >>>
> >>> Tel +61 2 9335 8282
> >>> Fax +61 2 9335 7001
> >>>
> >> kevinw...@kpmg.com.au
> >>
> >>> Protect the environment: think before you print
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> ______________________________________________
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> West Hartford, CT