Hello,
Inline.
On 20/08/2018 01:08, Daniel Nordlund wrote:
See comment inline below:
On 8/18/2018 10:06 PM, Rui Barradas wrote:
Hello,
It also works with class "factor":
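# stringsAsFactors = TRUE is needed in R >= 4.0.0, where the default became FALSE
df <- data.frame(variable = c("12.6%", "30.9%", "61.4%"), stringsAsFactors = TRUE)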
class(df$variable)
#[1] "factor"
as.numeric(gsub(pattern = "%", "", df$variable))
#[1] 12.6 30.9 61.4
This works because sub() and gsub() coerce their input to character and
return a character vector, so the instruction is equivalent to
as.numeric(as.character(f)), which the help page ?factor discusses in
its Warning section:
To transform a factor f to approximately its original numeric values,
as.numeric(levels(f))[f] is recommended and slightly more efficient
than as.numeric(as.character(f)).
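To make the quoted warning concrete, here is a small sketch. The factor f below is made up for illustration and is not from the original post. It compares the two idioms from ?factor with a plain as.numeric() call, which returns the internal integer codes rather than the values:

```r
# Hypothetical factor whose levels look like numbers
f <- factor(c("30.9", "12.6", "61.4"))

# as.numeric() on the factor itself returns the integer codes
# (positions in the sorted levels), not the original values
codes <- as.numeric(f)

# Both idioms mentioned in ?factor recover the values
via_levels    <- as.numeric(levels(f))[f]
via_character <- as.numeric(as.character(f))

print(codes)
#[1] 2 1 3
print(via_levels)
#[1] 30.9 12.6 61.4
print(via_character)
#[1] 30.9 12.6 61.4
```

The gsub() call above behaves like the as.character() route, since the factor is converted to character before the pattern is applied.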
Also, I would still prefer
as.numeric(sub(pattern = "%$","",df$variable))
#[1] 12.6 30.9 61.4
The pattern is stricter, and there is no need to search and replace
multiple occurrences of '%'.
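As a rough illustration of what "stricter" means here, consider an input with a stray '%' in the middle of a value. The vector x is a hypothetical example, not the OP's data. gsub() silently cleans it up, while the anchored sub() leaves it malformed so that it shows up as NA:

```r
# Hypothetical input: the second value has a stray '%' in the middle
x <- c("12.6%", "3%0.9%", "61.4%")

# gsub() removes every '%', so the malformed value is silently "repaired"
loose <- as.numeric(gsub("%", "", x))

# sub() with the anchored pattern removes only a trailing '%';
# the malformed value becomes NA (the coercion warning is suppressed here)
strict <- suppressWarnings(as.numeric(sub("%$", "", x)))

print(loose)
#[1] 12.6 30.9 61.4
print(strict)
#[1] 12.6   NA 61.4
```

Whether the silent repair or the NA is preferable depends on how much the input can be trusted.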
The pattern is more strict, and that could cause the conversion to fail
if the process that created the strings resulted in trailing spaces.
That's true, and I had thought of that but it wasn't in the OP's problem
description.
The '$' could still be used with something like "%\\s*$":
as.numeric(sub('%\\s*$', '', df$variable))
#[1] 12.6 30.9 61.4
Rui Barradas
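Without the '$' the conversion succeeds, because as.numeric() ignores
trailing whitespace. With the '$', a trailing space leaves the '%' in
place and the value becomes NA: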
df <- data.frame(variable = c("12.6% ", "30.9%", "61.4%"))
as.numeric(sub('%$', '', df$variable))
[1] NA 30.9 61.4
Warning message:
NAs introduced by coercion
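For comparison, a sketch of how the whitespace-tolerant pattern "%\\s*$" suggested above behaves on this same trailing-space input. The trimws() variant is an alternative that was not proposed in the thread, shown only as one possible approach:

```r
# Same values as in the failing example: the first one has a trailing space
x <- c("12.6% ", "30.9%", "61.4%")

# Allow optional whitespace between '%' and the end of the string
anchored <- as.numeric(sub("%\\s*$", "", x))

# Alternative: strip surrounding whitespace first, then use the strict pattern
trimmed <- as.numeric(sub("%$", "", trimws(x)))

print(anchored)
#[1] 12.6 30.9 61.4
print(trimmed)
#[1] 12.6 30.9 61.4
```

Both keep the end-of-string anchor, so a '%' elsewhere in the string would still be left in place and surface as NA.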
<<<snip>>>
Dan