On Tue, Aug 9, 2011 at 9:57 AM, Johannes Egner <johannes.eg...@gmail.com> wrote:
> On Mon, Aug 8, 2011 at 6:44 PM, Gabor Grothendieck
> <ggrothendi...@gmail.com> wrote:
>> On Mon, Aug 8, 2011 at 9:16 AM, Johannes Egner <johannes.eg...@gmail.com>
>> wrote:
>>> Hi,
>>>
>>> I'm removing non-unique time indices in a zoo time series by means of
>>> aggregate. The time series is bivariate, and the row to be kept only depends
>>> on the maximum of one of the two columns. Here's an example:
>>>
>>> x <- zoo(rbind( c(1,1), c(1.1, 0.9), c(1.1, 1.1), c(1,1) ),
>>> order.by=c(1,1,2,2))
>>>
>>> The eventual aggregated result should be
>>>
>>> 1 1.1 0.9
>>> 2 1.1 1.1
>>>
>>> that is, in each slice of the underlying data (a slice being all rows with
>>> the same time stamp), we take the row that has maximum value in the first
>>> column. (For the moment, let's not worry about several rows within the same
>>> slice having the same maximum value in the first column.)
>>>
>>> I have tried subsetting x by
>>>
>>> slices <- aggregate(x[,1], by=identity, FUN=which.max)
>>>
>>> but ended up with something as ugly as:
>>>
>>> T <- length( unique(time(x)) )
>>> result <- zoo( matrix(NA, ncol=2, nrow=T), order.by=unique(time(x)) )
>>>
>>> for(t in seq(length.out=T))
>>> {
>>> result[t,] <- x[ time(x)==time(slices[t]) ][coredata(slices[t]),]
>>>
>>> }
>>>
>>> There must be a better way of doing this -- maybe using tapply or the plyr
>>> package, but possibly something much simpler. Any pointers are very welcome.
>>
>> Where does the data come from in the first place? Is it being read
>> in? or is it in a data frame that is converted to a zoo object?
>
> We can assume the most convenient choice, really. Technically, I'm
> reading three equi-sized vectors (timestamps, first column, second
> column) from respective rdata-files, cbind the data together, and then
> make them a zoo object by ordering with the timestamps. (Hence my
> example, which mimics the situation.)
>
The reason I ask is that this is usually done when importing the data
into zoo (rather than importing the data with duplicates and then
removing them later). In this case, suppose we start with the data
frame DF shown below (built here from your x object). Then the
following read.zoo call performs the required import and the
aggregation all at once:

  DF <- data.frame(time = time(x), coredata(x))
  z <- read.zoo(DF[order(DF$time, DF$X1), ],
                aggregate = function(x) tail(x, 1))

Sorting by time and then by X1 puts the row with the largest X1 last
within each time stamp, so tail(x, 1) returns that row's value in
every column.

Regarding suppressing the warnings on duplicates: as long as the
duplicates are removed at the time of import, it's not an issue, since
the situation leading to such warnings would never arise.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
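For reference, here is a self-contained sketch of the read.zoo approach above that can be pasted into a fresh session. It assumes the data arrive as a data frame whose columns are named time, X1 and X2 (the names that data.frame() gives the unnamed x example); the literal values are just a stand-in for the real import. The second part, using duplicated(), is not from the thread. It is only an explicit alternative that shows what "keep the last row per time stamp" amounts to once the rows are sorted.

```r
library(zoo)

# Hypothetical stand-in for the imported data: same values as the x example,
# with the column names X1 and X2 used in the read.zoo call above.
DF <- data.frame(time = c(1, 1, 2, 2),
                 X1   = c(1, 1.1, 1.1, 1),
                 X2   = c(1, 0.9, 1.1, 1))

# Sort by time, then by X1, so the max-X1 row comes last in each time slice.
DF <- DF[order(DF$time, DF$X1), ]

# Approach from the post: aggregate on import, keeping the last value
# of each column within every time slice.
z <- read.zoo(DF, aggregate = function(x) tail(x, 1))

# Explicit alternative (an assumption, not from the post): drop every row
# except the last one for each time stamp, then build the zoo object.
keep <- !duplicated(DF$time, fromLast = TRUE)
z2 <- zoo(as.matrix(DF[keep, c("X1", "X2")]), order.by = DF$time[keep])

print(z)
print(z2)
```

Both objects should hold the rows (1.1, 0.9) at time 1 and (1.1, 1.1) at time 2, i.e. the result asked for in the original question. Ties in X1 within a slice are not handled specially in either variant; whichever tied row sorts last is kept.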