Ian Dworkin wrote:
# This is more about trying to find a more efficient way to code some
simple vectorized computations using ifelse().
# Say you have some vector representing a factor with a number of
levels (6 in this case), representing the location that samples were
collected.
Population <- gl( n=6, k=5,length=120, labels =c("CO", "CN","Ga","KO",
"Mw", "Ng"))
# You would like to assign a particular value to each level of
population (in this case the elevation at which they were collected).
In a vectorized approach (for speed... pretend this was a big data
set..)
elevation <- ifelse(Population=="CO", 2169,
ifelse(Population=="CN", 1121,
ifelse(Population=="Ga", 500,
ifelse(Population=="KO", 2500,
ifelse(Population=="Mw", 625,
ifelse(Population=="Ng", 300, NA ))))))
# Which is fine, but is a pain to write...
# So I was trying to think about how to vectorize directly, i.e., use
vectors within the test, and for the return values for TRUE and FALSE
elevation.take.2 <- ifelse(Population==c("CO", "CN", "Ga", "KO",
"Mw", "Ng"), c(2169, 1121, 500, 2500, 625, 300), c(NA, NA, NA, NA, NA,
NA))
# It makes sense to me why this does not work (elevation.take.2), but
I am not sure how to get it to work. Any suggestions? I suspect it
involves a trick using "any" or "||" or something, but I can't seem to
work it out.
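In a case like this, indexing is often clearer than ifelse(). For example,
results <- c(CO = 2169, CN = 1121, Ga = 500, KO = 2500, Mw = 625, Ng = 300)
# as.character() is needed: a factor index is used as integer codes, not labels
elevation <- results[as.character(Population)]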
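One thing to watch with this approach: when the index is a factor, R indexes by the factor's integer codes rather than by its labels, so the lookup only matches on names if the factor is converted with as.character() first. A small sketch of the difference follows; the names pop and lookup are made up for illustration, and the lookup vector is deliberately not in level order so the mismatch shows.

```r
# Hypothetical toy data; `pop` and `lookup` are illustrative names only.
pop <- factor(c("Ng", "CO", "Ga", "CO"), levels = c("CO", "Ga", "Ng"))
lookup <- c(Ng = 300, CO = 2169, Ga = 500)  # not in the order of levels(pop)

# Character index: matches on the names of `lookup`.
by_name <- unname(lookup[as.character(pop)])

# Factor index: treated as the integer codes 3, 1, 2, 1 (positions in `lookup`).
by_code <- unname(lookup[pop])

print(by_name)
print(by_code)
```

If the lookup vector happens to be in exactly the same order as the factor levels, both forms give the same answer, which is why the problem is easy to miss.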
Generally vector indexing of atomic vectors and matrices is very fast;
indexing of data frames is much slower, so if speed is an issue, avoid them.
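To check the speed difference on your own machine, something along these lines may help. It is only a rough sketch: the data size, the names keys, elev, vec and df, and the use of row names for the data frame lookup are all assumptions made for illustration, and timings will vary.

```r
# Rough timing sketch; all object names here are hypothetical.
set.seed(1)
keys <- c("CO", "CN", "Ga", "KO", "Mw", "Ng")
elev <- c(2169, 1121, 500, 2500, 625, 300)
x <- sample(keys, 1e4, replace = TRUE)    # pretend this is the big data set

vec <- setNames(elev, keys)                           # named atomic vector
df  <- data.frame(elevation = elev, row.names = keys) # same data as a data frame

t_vec <- system.time(a <- unname(vec[x]))        # atomic vector indexing
t_df  <- system.time(b <- df[x, "elevation"])    # data frame row indexing

stopifnot(identical(a, b))  # both lookups should agree
print(rbind(vector = t_vec, data.frame = t_df))
```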
Duncan Murdoch