On 02/07/2013 13:31, Jeremy Ng wrote:
Hi guys,
I was wondering if anyone could help me with a problem that I have been stuck
on for a long time. It involves replacing character strings with
numbers. Each character string can take only 3 possible values, for
instance:
AA
AT
TT
I want R to replace AT with 0. For AA and TT, I want to compare
their frequencies, then replace the one that occurs more often with 1
and the other with -1. Using the same
example, say I have these counts:
AA - 50
AT - 34
TT - 57
I would want R to substitute it in this way:
AA = -1
AT = 0
TT = 1
The strings are not necessarily AA, AT, or TT.
If not, how are we to know which one is to be replaced by 0? And does
'more' mean 'greater than' or 'greater than or equal to'?
Adapt the following depending on your answers:
> set.seed(1)
> x <- sample(c(rep("AA", 2), "AT", rep("TT", 3)))
> fr <- table(x)
> recode <- if(fr[1] < fr[3]) c(-1, 0, 1) else c(1, 0, -1) # or <=
> x
[1] "AA" "TT" "AT" "TT" "AA" "TT"
> recode[match(x, names(fr))] # or however the strings are arranged.
[1] -1 1 0 1 -1 1
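If the strings are not fixed in advance, the same idea can be wrapped in a small function. This is only a sketch under assumptions that the original question leaves open: the helper name recode_by_frequency and its argument zero are hypothetical, the caller must say which string maps to 0, exactly two other distinct strings are expected, and a tie goes to whichever of the two appears first in the data (an arbitrary choice, matching neither '<' nor '<=' in particular).

```r
# Hypothetical helper, not from the original thread.
# 'zero' names the string that should be recoded as 0.
recode_by_frequency <- function(x, zero) {
  # The two remaining distinct strings, ignoring missing values.
  others <- setdiff(unique(x[!is.na(x)]), zero)
  stopifnot(length(others) == 2)

  # Count only those two strings.
  counts <- table(factor(x, levels = others))

  # which.max() returns the first maximum, so ties go to others[1].
  major <- others[which.max(counts)]

  # 0 for 'zero', 1 for the more frequent string, -1 for the other.
  ifelse(x == zero, 0, ifelse(x == major, 1, -1))
}

x <- c("AA", "TT", "AT", "TT", "AA", "TT")
recode_by_frequency(x, zero = "AT")
# [1] -1  1  0  1 -1  1
```

Missing values in x stay NA in the result. If the string for 0 should be detected automatically rather than supplied, that rule would have to be stated first.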
Any ideas?
Thanks!
Jeremy
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595