On Mon, Aug 8, 2011 at 6:44 PM, Gabor Grothendieck <ggrothendi...@gmail.com> wrote:
> On Mon, Aug 8, 2011 at 9:16 AM, Johannes Egner <johannes.eg...@gmail.com> wrote:
>> Hi,
>>
>> I'm removing non-unique time indices in a zoo time series by means of
>> aggregate. The time series is bivariate, and the row to be kept only depends
>> on the maximum of one of the two columns. Here's an example:
>>
>> x <- zoo(rbind( c(1,1), c(1.1, 0.9), c(1.1, 1.1), c(1,1) ),
>>          order.by=c(1,1,2,2))
>>
>> The eventual aggregated result should be
>>
>> 1 1.1 0.9
>> 2 1.1 1.1
>>
>> that is, in each slice of the underlying data (a slice being all rows with
>> the same time stamp), we take the row that has maximum value in the first
>> column. (For the moment, let's not worry about several rows within the same
>> slice having the same maximum value in the first column.)
>>
>> I have tried subsetting x by
>>
>> slices <- aggregate(x[,1], by=identity, FUN=which.max)
>>
>> but ended up with something as ugly as:
>>
>> T <- length( unique(time(x)) )
>> result <- zoo( matrix(NA, ncol=2, nrow=T), order.by=unique(time(x)) )
>>
>> for(t in seq(length.out=T))
>> {
>>   result[t,] <- x[ time(x)==time(slices[t]) ][coredata(slices[t]),]
>> }
>>
>> There must be a better way of doing this -- maybe using tapply or the plyr
>> package, but possibly something much simpler. Any pointers are very welcome.
>
> Where does the data come from in the first place? Is it being read
> in? or is it in a data frame that is converted to a zoo object?

We can assume the most convenient choice, really. Technically, I'm reading
three equally sized vectors (timestamps, first column, second column) from
their respective rdata-files, cbind-ing them together, and then turning the
result into a zoo object ordered by the timestamps. (Hence my example, which
mimics the situation.)

Incidentally, after some thought, I have found a neater (and much faster)
way. Via aggregate, each slice reports back both its length and the position
of its maximum entry within the slice. Adding the number of rows in all
earlier slices to that position gives the row number in x, and we then
subset accordingly:

#####################################
library(zoo)

x <- zoo(rbind( c(1,1), c(1.1, 0.9), c(1.1, 1.1), c(1,1) ),
         order.by=c(1,1,2,2))

# per slice: position of the maximum of column 1, and the slice length
indices.prelim <- aggregate(x[, 1], by=identity,
                            FUN=function(x) c(which.max(x), length(x)))

# number of rows preceding each slice
cumShift <- cumsum( coredata(indices.prelim[,2]) )
cumShift <- c(0, cumShift[-length(cumShift)])

# position within the slice plus offset = row number in x
shift <- coredata(indices.prelim[,1])
indices <- shift+cumShift
result <- x[indices, ]
#####################################

Suggestions nonetheless welcome. And Gabor -- is there any way to turn off
the warning message for zoo objects when the 'order.by' indices are not
unique?

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
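
For anyone who wants to check the offset arithmetic without zoo, here is a minimal base-R sketch of the same idea on the example data. It is only an illustration, not code from the thread: the names tt, m, pos, len, offset, rows and rows_alt are made up for this sketch, and it assumes the rows are already sorted by timestamp (as they are once zoo has ordered them). Ties within a slice go to the first maximum, because that is what which.max returns.

```r
# Timestamps and the two value columns, mirroring the example above.
tt <- c(1, 1, 2, 2)
m  <- rbind(c(1, 1), c(1.1, 0.9), c(1.1, 1.1), c(1, 1))

# Per slice: position of the maximum of column 1, and the slice length.
pos <- tapply(m[, 1], tt, which.max)
len <- tapply(m[, 1], tt, length)

# Offset of each slice = number of rows in all earlier slices.
offset <- c(0, head(cumsum(len), -1))

# Row numbers in m: position within the slice plus the slice offset.
rows <- as.vector(pos + offset)
picked <- m[rows, ]

# Cross-check that avoids the offsets: split the row numbers by timestamp
# and pick, within each group, the row with the largest first column.
rows_alt <- unname(sapply(split(seq_along(tt), tt),
                          function(i) i[which.max(m[i, 1])]))
```

On the example data both rows and rows_alt select rows 2 and 3, i.e. (1.1, 0.9) and (1.1, 1.1). The split-based variant does not depend on the rows being sorted by timestamp, which may make it the more robust choice if the data are not passed through zoo first.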