I think I need to do something like this:

    dat <- data.frame(id = rep(1:5, each = 200),
                      state = sample(1:3, 1000, replace = TRUE,
                                     prob = c(0.7, 0.05, 0.25)),
                      V1 = runif(1000, 1, 10),
                      V2 = rnorm(1000))

    rle.dat <- rle(dat$state)
    n.runs <- length(rle.dat$lengths)
    out <- data.frame(run = seq_len(n.runs), V1 = NA_real_,
                      V2 = NA_real_, state = NA_integer_)
    first <- 1
    for (i in seq_len(n.runs)) {
      last <- first + rle.dat$lengths[i] - 1   # last row of run i
      out$V1[i] <- mean(dat$V1[first:last])
      out$V2[i] <- sum(dat$V2[first:last])
      out$state[i] <- rle.dat$values[i]
      first <- last + 1                        # first row of the next run
    }

to a very large dataset. I want to apply a few summary functions (here mean and sum) to some variables within a data.frame, once for each run of a given state. To complicate things, I'd like to use plyr and split on the id variable before I do any of this:

    library(plyr)

    loop.func <- function(dat) {
      rle.dat <- rle(dat$state)
      n.runs <- length(rle.dat$lengths)
      # 'run' rather than 'id', so it does not clash with the split variable
      out <- data.frame(run = seq_len(n.runs), V1 = NA_real_,
                        V2 = NA_real_, state = NA_integer_)
      first <- 1
      for (i in seq_len(n.runs)) {
        last <- first + rle.dat$lengths[i] - 1   # last row of run i
        out$V1[i] <- mean(dat$V1[first:last])
        out$V2[i] <- sum(dat$V2[first:last])
        out$state[i] <- rle.dat$values[i]
        first <- last + 1                        # first row of the next run
      }
      out
    }

    out <- ddply(dat, .(id), loop.func)

Mostly, I just don't understand how to use a list (especially in this instance) in a plyr/apply statement.

Thanks,
Justin

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.