[R] rle on large data . . . without a for loop!

Justin Haynes Fri, 17 Jun 2011 15:56:50 -0700

I think I need to do something like this:

# example data: 5 ids with 200 rows each, and a state drawn from 1:3
dat<-data.frame(id=rep(1:5,each=200),
                state=sample(1:3,1000,replace=TRUE,prob=c(0.7,0.05,0.25)),
                V1=runif(1000,10,1000),V2=rnorm(1000))
rle.dat<-rle(dat$state)
temp<-1   # first row of the current run
out<-data.frame(id=seq_along(rle.dat$lengths))
for(i in seq_along(rle.dat$lengths)){
        temp2<-temp+rle.dat$lengths[[i]]-1   # last row of the current run
        out$V1[i]<-mean(dat$V1[temp:temp2])
        out$V2[i]<-sum(dat$V2[temp:temp2])
        out$state[i]<-rle.dat$values[[i]]
        temp<-temp2+1   # the next run starts on the following row
}


to a very large dataset.  I want to apply a few summary functions to
some variables within a data.frame for each run of a given state. To
complicate things, I'd like to use plyr and split on the id variable
before I do any of this...
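One way to avoid the explicit loop is to turn the rle() result into a per-row run label and summarise by that label. The sketch below is only an illustration under assumptions: the simulated data mirror the example above, the names run and n are made up here, and tapply() stands in for whatever grouping function suits the real data.

```r
# Simulated data in the shape of the example above (assumed layout)
set.seed(1)
n_rows <- 1000
dat <- data.frame(
  id    = rep(1:5, each = 200),
  state = sample(1:3, n_rows, replace = TRUE, prob = c(0.7, 0.05, 0.25)),
  V1    = runif(n_rows, 10, 1000),
  V2    = rnorm(n_rows)
)

r <- rle(dat$state)

# One integer label per row (1,1,2,3,3,3,...) saying which run the row is in
run <- rep(seq_along(r$lengths), r$lengths)

# Summarise each run; tapply() returns the groups in increasing run order
out <- data.frame(
  run   = seq_along(r$lengths),
  state = r$values,
  n     = r$lengths,
  V1    = as.vector(tapply(dat$V1, run, mean)),
  V2    = as.vector(tapply(dat$V2, run, sum))
)
head(out)
```

On very large data, rowsum(dat$V2, run) is usually a faster way to get the per-run sums, and dividing a rowsum() of V1 by r$lengths gives the per-run means.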

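library(plyr)

loop.func<-function(dat){
  rle.dat<-rle(dat$state)
  temp<-1   # first row of the current run
  # named run, not id, so it does not clash with the id split variable
  out<-data.frame(run=seq_along(rle.dat$lengths))
  for(i in seq_along(rle.dat$lengths)){
        temp2<-temp+rle.dat$lengths[[i]]-1   # last row of the current run
        out$V1[i]<-mean(dat$V1[temp:temp2])
        out$V2[i]<-sum(dat$V2[temp:temp2])
        out$state[i]<-rle.dat$values[[i]]
        temp<-temp2+1   # the next run starts on the following row
  }
  return(out)
}
out<-ddply(dat,.(id),loop.func)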

Mostly, I just don't understand how to use a list (especially in this
instance) in a plyr/apply statement...
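As a rough illustration of the list question, the split/apply/combine pattern can be written in base R: split() produces a list of per-id data frames, lapply() runs a function on each element, and do.call(rbind, ...) stacks the results. This is a sketch under assumptions: run_summary is a hypothetical helper name, and the simulated data only mirror the example above.

```r
set.seed(1)
n_rows <- 1000
dat <- data.frame(
  id    = rep(1:5, each = 200),
  state = sample(1:3, n_rows, replace = TRUE, prob = c(0.7, 0.05, 0.25)),
  V1    = runif(n_rows, 10, 1000),
  V2    = rnorm(n_rows)
)

# Hypothetical helper: summarise the runs of `state` within one id's rows
run_summary <- function(d) {
  r   <- rle(d$state)
  run <- rep(seq_along(r$lengths), r$lengths)  # run label for every row
  data.frame(
    id    = d$id[1],
    run   = seq_along(r$lengths),
    state = r$values,
    V1    = as.vector(tapply(d$V1, run, mean)),
    V2    = as.vector(tapply(d$V2, run, sum))
  )
}

pieces    <- split(dat, dat$id)           # a list with one data frame per id
results   <- lapply(pieces, run_summary)  # a list with one summary per id
out_by_id <- do.call(rbind, results)      # stack the summaries
rownames(out_by_id) <- NULL
head(out_by_id)
```

With plyr loaded, ddply(dat, .(id), run_summary) should do the same split, apply and combine in a single call, so the list never has to be handled by hand.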


Thanks,

Justin

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
