Re: [R] aggregate.zoo on bivariate data

Gabor Grothendieck Tue, 09 Aug 2011 07:11:21 -0700

On Tue, Aug 9, 2011 at 9:57 AM, Johannes Egner <johannes.eg...@gmail.com> wrote:
> On Mon, Aug 8, 2011 at 6:44 PM, Gabor Grothendieck
> <ggrothendi...@gmail.com> wrote:
>> On Mon, Aug 8, 2011 at 9:16 AM, Johannes Egner <johannes.eg...@gmail.com> wrote:
>>> Hi,
>>>
>>> I'm removing non-unique time indices in a zoo time series by means of
>>> aggregate. The time series is bivariate, and the row to be kept only depends
>>> on the maximum of one of the two columns. Here's an example:
>>>
>>> x <- zoo(rbind( c(1,1), c(1.1, 0.9), c(1.1, 1.1), c(1,1) ),
>>>        order.by=c(1,1,2,2))
>>>
>>> The eventual aggregated result should be
>>>
>>> 1   1.1   0.9
>>> 2   1.1   1.1
>>>
>>> that is, in each slice of the underlying data (a slice being all rows with
>>> the same time stamp), we take the row that has maximum value in the first
>>> column. (For the moment, let's not worry about several rows within the same
>>> slice having the same maximum value in the first column.)
>>>
>>> I have tried subsetting x by
>>>
>>> slices <- aggregate(x[,1], by=identity, FUN=which.max)
>>>
>>> but ended up with something as ugly as:
>>>
>>> T <- length( unique(time(x)) )
>>> result <- zoo( matrix(NA, ncol=2, nrow=T), order.by=unique(time(x)) )
>>>
>>> for(t in seq(length.out=T))
>>> {
>>>    result[t,] <- x[ time(x)==time(slices[t]) ][coredata(slices[t]),]
>>>
>>> }
>>>
>>> There must be a better way of doing this -- maybe using tapply or the plyr
>>> package, but possibly something much simpler. Any pointers are very welcome.
>>
>> Where does the data come from in the first place?  Is it being read
>> in?  or is it in a data frame that is converted to a zoo object?
>
> We can assume the most convenient choice, really. Technically, I'm
> reading three equi-sized vectors (timestamps, first column, second
> column) from respective rdata-files, cbind the data together, and then
> make them a zoo object by ordering with the timestamps. (Hence my
> example, which mimics the situation.)
>


The reason I ask is that this is usually done when importing the data
into zoo (rather than importing the data with duplicates and then
removing them later).  In this case, suppose we start with the DF shown
below (built from your x object).  Sorting DF by time and then by X1
puts the row with the largest first-column value last within each
time stamp, so tail(x, 1) keeps exactly that row.  The following
read.zoo call then performs the required import and the aggregation
in one step:

DF <- data.frame(time = time(x), coredata(x))
z <- read.zoo(DF[order(DF$time, DF$X1), ], aggregate = function(x) tail(x, 1))

Regarding suppressing the warnings on duplicates: as long as the
duplicates are removed at the time of import, it's not an issue, since
the situation leading to such warnings would never arise.
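
For illustration, here is a minimal sketch of the same idea starting
directly from the three raw vectors, so that no zoo object with
duplicate time stamps is built first. The names tt, a and b are
placeholders for the vectors loaded from the rdata files (they are
assumptions, not names from the original post); their values mimic the
example x above.

```r
library(zoo)

# Placeholder vectors standing in for the data loaded from the rdata
# files (hypothetical names; values mimic the example x).
tt <- c(1, 1, 2, 2)      # time stamps, with duplicates
a  <- c(1, 1.1, 1.1, 1)  # first column: decides which row is kept
b  <- c(1, 0.9, 1.1, 1)  # second column

# Plain data frame, sorted by time and then by the first column, so
# that the row with the largest 'a' comes last within each time stamp.
DF <- data.frame(time = tt, a = a, b = b)
DF <- DF[order(DF$time, DF$a), ]

# Import and collapse duplicates in one step: within each time stamp
# keep the last value of every column.
z <- read.zoo(DF, aggregate = function(v) tail(v, 1))
z
```

The aggregate function is applied to each column separately, so rows
stay intact only because tail(v, 1) picks the same position in every
column. A function such as max would instead combine values from
different rows.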

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
