Folks,

I have some very long vectors, typically 1 million long, which are indexed by another vector of the same length with values from 1 to a few thousand, so each sub-part of the vector may be a few hundred values long.
I want to calculate the cumulative maximum of each sub-part of the main vector, grouped by the index, in an efficient manner. This can obviously be done in a loop, but the whole calculation is embedded within many other calculations, which would make everything very slow indeed. All the other sums are vectorised already.

For example,

  A=c(1,2,1, -3,5,6,7,4, 6,3,7,6,9, ...)
  i=c(1,1,1, 2,2,2,2,2, 3,3,3,3,3, ...)

where the three sub-parts of A shown are not the same length, but the index levels themselves are all monotonic non-decreasing.

I want the answer to be a vector of the same length:

  R=c(1,2,2, -3,5,6,7,7, 6,6,7,7,9, ...)

If I could reset the cumulative maximum to -1e6 (say) at each change of index, a simple cummax would do, but I can't see how to do this.

The best way I have found so far is to use the aggregate command:

  as.vector(unlist(aggregate(A,list(i),cummax)[[2]]))

but on rare occasions this fails, returning a shorter vector than expected, and it seems rather ugly: it converts to and from lists, which may well be an unnecessary overhead.

I have been trying other approaches using apply() methods, but either it can't be done using them or I can't get my head round them!

Any ideas?

Best wishes

John

John Logsdon
Quantex Research Ltd
+44 161 445 4951/+44 7717758675
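
For reference, here is a minimal sketch of the "reset at each change of index" idea described in the post, using only base R. It is not a tuned or definitive solution. Option 1 relies on ave(), which applies a function within each level of the index and returns a vector of the same length as its input. Option 2 is a plain offset trick and rests on two assumptions that the post suggests but does not guarantee: that i is non-decreasing and that it takes integer values. The names R1, R2 and step are made up for illustration.

```r
# Example data from the post (the three groups shown, without the "...").
A <- c(1, 2, 1,  -3, 5, 6, 7, 4,  6, 3, 7, 6, 9)
i <- c(1, 1, 1,   2, 2, 2, 2, 2,  3, 3, 3, 3, 3)

# Option 1: ave() applies FUN within each level of i and puts the results
# back in the original positions, so the output has the same length as A.
R1 <- ave(A, i, FUN = cummax)

# Option 2 (ASSUMPTION: i is non-decreasing and integer-valued):
# shift every group up by more than the total range of A, so a single
# cummax() can never carry a maximum across a group boundary, then shift
# the result back down again.
step <- diff(range(A)) + 1
R2 <- cummax(A + i * step) - i * step

print(R1)
print(R2)
```

Both results should agree with the vector R given in the post for this example. Option 2 stays fully vectorised, but adding and subtracting large offsets can introduce small rounding errors when A is not integer-valued, so it may be worth comparing it against Option 1 on real data before relying on it.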