On 1/26/2012 10:33 AM, Berend Hasselman wrote:
On 26-01-2012, at 19:10, Berend Hasselman wrote:
On 26-01-2012, at 17:58, Brian Diggs wrote:
On 1/25/2012 10:09 AM, patzoul wrote:
I have 2 series of data a,b and I would like to calculate a new series which
is z[t] = z[t-1]*a[t] + b[t] , z[1] = b[1].
How can I do that without using a loop?
........
I don't think so.
a<- c(2,4,3,5)
b<- c(1,3,5,7)
z<- rep(0,length(a))
z[1]<- b[1]
for( t in 2:length(a) ) { z[t]<- a[t] * z[t-1] + b[t] }
z
gives
[1] 1 7 26 137
and this agrees with a manual calculation.
You get a vector of length 5 as result. It should be of length 4 with your data.
If you change the Reduce expression to this
u<- Reduce(function(zm1, coef) {coef[1] * zm1 + coef[2]},
Map(c, a[-1], b[-1]),
init = b[1], accumulate = TRUE)
then you get the correct result.
u
[1] 1 7 26 137
You are correct; I had an off-by-one error. It agreed with my manual
calculation, which also had the same error.
And the loop especially if byte compiled with cmpfun appears to be quite a bit
quicker.
Nrep<- 1000
tfrml.loop<- function(a,b) {
z<- rep(0,length(a))
z[1]<- b[1]
for( t in 2:length(a) ) {
z[t]<- a[t] * z[t-1] + b[t]
}
z
}
tfrml.rdce<- function(a,b) {
u<- Reduce(function(zm1, coef) {coef[1] * zm1 + coef[2]},
Map(c, a[-1], b[-1]),
init = b[1], accumulate = TRUE)
u
}
library(compiler)
tfrml.loop.c<- cmpfun(tfrml.loop)
tfrml.rdce.c<- cmpfun(tfrml.rdce)
z.loop<- tfrml.loop(a,b)
z.rdce<- tfrml.rdce(a,b)
all.equal(z.loop, z.rdce)
library(rbenchmark)
N<- 500
set.seed(1)
a<- runif(N)
b<- runif(N)
benchmark(tfrml.loop(a,b), tfrml.rdce(a,b), tfrml.loop.c(a,b),
tfrml.rdce.c(a,b),
replications=Nrep, columns=c("test", "replications",
"elapsed"))
test replications elapsed
1 tfrml.loop(a, b) 1000 2.665
3 tfrml.loop.c(a, b) 1000 0.554
2 tfrml.rdce(a, b) 1000 4.082
4 tfrml.rdce.c(a, b) 1000 3.143
Berend
R-2.14.1 (32-bits), Mac OS X 10.6.8.
The timings are interesting; I would not have expected the loop to have
outperformed Reduce, or at least not by that much. The loop also
benefits much more from compiling, which is not as surprising since
Reduce and Map are already compiled. I would guess the difference is due
to overhead changing the format of the a/b data and being able to
specialize the code.
I also ran timings for comparison and got qualitatively the same thing:
benchmark(tfrml.loop(a,b), tfrml.rdce(a,b), tfrml.loop.c(a,b),
tfrml.rdce.c(a,b),
replications=Nrep,
columns=c("test", "replications", "elapsed", "relative"),
order="relative")
test replications elapsed relative
3 tfrml.loop.c(a, b) 1000 0.34 1.000000
1 tfrml.loop(a, b) 1000 1.89 5.558824
4 tfrml.rdce.c(a, b) 1000 2.12 6.235294
2 tfrml.rdce(a, b) 1000 2.79 8.205882
R version 2.14.1 (2011-12-22)
Platform: x86_64-pc-mingw32/x64 (64-bit)
(Windows 7)
--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.