[R] which(df$name=="A") takes ~1 second! (df is very large), but can it be speeded up?

Emmanuel Levy Tue, 12 Aug 2008 16:37:19 -0700

Dear All,

I have a large data frame (2,700,000 rows and 14 columns), and I would like to
extract the information in the particular way illustrated below:



Given a data frame "df":

> col1=sample(c(0,1),10, rep=T)
> names = factor(c(rep("A",5),rep("B",5)))
> df = data.frame(names,col1)
> df
   names col1
1      A    1
2      A    0
3      A    1
4      A    0
5      A    1
6      B    0
7      B    0
8      B    1
9      B    0
10     B    0

I would like to transform it into the form:

> index = c("A","B")
> col1 = list()   # col1 must be a list to hold one vector per name
> col1[[1]]=df$col1[which(df$name=="A")]
> col1[[2]]=df$col1[which(df$name=="B")]
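
As a minimal sketch of the target structure above, assuming the goal is simply one vector of col1 values per name: base R's split() groups a vector by a factor in a single call, so there is no need for one which() per name. The object name by_name and the set.seed() call are illustrative only and not part of the original data.

```r
# Small example mirroring the data frame shown above (seed is arbitrary).
set.seed(1)
df <- data.frame(
  names = factor(c(rep("A", 5), rep("B", 5))),
  col1  = sample(c(0, 1), 10, replace = TRUE)
)

# split() groups col1 by the levels of df$names in one call and
# returns a named list with one element per level.
by_name <- split(df$col1, df$names)

index <- names(by_name)   # "A" "B"
by_name[["A"]]            # same values as df$col1[which(df$names == "A")]
```

Elements can then be accessed by name (by_name[["A"]]) or by position (by_name[[1]]). Whether this is fast enough on 2,700,000 rows would have to be timed on the real data, for example with system.time().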

My problem is that the command:  *** which(df$name=="A") ***
takes about 1 second because df is so big.

Since the column is a factor, I was thinking that the rows belonging to a given
level could perhaps be looked up directly, but I am not sure how to do it.
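
To illustrate what "accessing a level" could mean, here is a small sketch, assuming the column is stored as a factor. A factor keeps one integer code per row plus a table of level labels, so the code for "A" can be looked up once and the rows compared as integers. The names f, codes, code_A and rows_A are made up for the example, and this is not claimed to be faster than the original command; it would need to be timed on the real data.

```r
# A factor like the names column above.
f <- factor(c(rep("A", 5), rep("B", 5)))

levels(f)                # the distinct labels, stored once
codes <- as.integer(f)   # one integer code per row, indexing into levels(f)

# Look up the code for "A" once, then compare integers instead of labels.
code_A <- match("A", levels(f))
rows_A <- which(codes == code_A)
```

rows_A holds the same row numbers as which(f == "A").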

I would be very grateful for any advice that would allow me to speed this up.

Best wishes,

Emmanuel

______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
