[R] Vectorised operations

John Logsdon Wed, 18 May 2016 06:35:16 -0700

Folks

I have some very long vectors - typically 1 million long - which are
indexed by another vector, same length, with values from 1 to a few
thousand, so each sub part of the vector may be a few hundred values long.


I want to calculate the cumulative maximum of each sub part of the main
vector by the index in an efficient manner.  This can obviously be done in
a loop but the whole calculation is embedded within many other
calculations which would make everything very slow indeed.  All the other
sums are vectorised already.

For example,

A=c(1,2,1,  -3,5,6,7,4,  6,3,7,6,9, ...)
i=c(1,1,1,   2,2,2,2,2,  3,3,3,3,3, ...)

where A has three sub parts (one per level of i) that are not all the same
length, and the index values in i are monotonic non-decreasing.

I want the answer to be a vector of the same length:

R=c(1,2,2,  -3,5,6,7,7,  6,6,7,7,9, ...)

If I could reset the cumulative maximum to -1e6 (eg) at each change of
index, a simple cummax would do but I can't see how to do this.
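One way to get that "reset" effect without a loop is sketched below. It is only a sketch under stated assumptions: i is integer-valued and non-decreasing (as in the example), A contains no NA values, and the helper name grouped_cummax_offset is made up for illustration. The idea is to lift each sub part by an offset larger than the whole range of A, so that a single cummax can never carry a maximum across a change of index, and then to subtract the offset again.

```r
# Hypothetical helper (not from the original post): grouped cumulative
# maximum using one call to cummax.
# Assumptions: i is non-decreasing and integer-valued; A has no NAs.
grouped_cummax_offset <- function(A, i) {
  big   <- diff(range(A)) + 1  # anything larger than max(A) - min(A)
  shift <- i * big             # every value in group k+1 now exceeds group k
  cummax(A + shift) - shift    # running max, then undo the offset
}

A <- c(1, 2, 1,  -3, 5, 6, 7, 4,  6, 3, 7, 6, 9)
i <- c(1, 1, 1,   2, 2, 2, 2, 2,  3, 3, 3, 3, 3)

R <- grouped_cummax_offset(A, i)
print(R)
```

For non-integer data the add-then-subtract step can introduce small floating-point rounding differences when the offsets are large compared with the values in A, so the result may need checking against a slower reference method.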

The best way I have found so far is to use the aggregate command:

as.vector(unlist(aggregate(A,list(i),cummax)[[2]]))

but occasionally this fails, returning a shorter vector than expected, and
it seems rather ugly, converting to and from lists, which may well be an
unnecessary overhead.
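A possible alternative to the aggregate call, shown as a hedged sketch rather than a tested solution for the full-size data: base R's ave() applies a function within each level of a grouping vector and returns a vector of the same length, in the original order. It still calls cummax once per level internally, so it is not vectorised in the strict sense, but it avoids the manual unlist step. The name R_ave is just for illustration.

```r
# Same example data as in the post.
A <- c(1, 2, 1,  -3, 5, 6, 7, 4,  6, 3, 7, 6, 9)
i <- c(1, 1, 1,   2, 2, 2, 2, 2,  3, 3, 3, 3, 3)

# ave() splits A by i, applies cummax to each piece, and puts the
# results back in the positions the values came from.
R_ave <- ave(A, i, FUN = cummax)
print(R_ave)
```

Because the result always has the same length as A, this form should not return a shorter vector than expected.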

I have been trying other approaches using apply() methods but either it
can't be done using them or I can't get my head round them!

Any ideas?

Best wishes

John

John Logsdon
Quantex Research Ltd
+44 161 445 4951/+44 7717758675

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
