On Thu, Feb 17, 2011 at 10:02 AM, Alex F. Bokov
<[email protected]> wrote:
> Motivation: during each iteration, my code needs to collect tabular data (and
> use it only during that iteration), but the rows of data may vary. I thought
> I would speed it up by preinitializing the matrix that collects the data with
> zeros to what I know to be the maximum number of rows. I was surprised by
> what I found...
>
> # set up (not the puzzling part)
> x<-matrix(runif(20),nrow=4); y<-matrix(0,nrow=12,ncol=5); foo<-c();
There is no need to initialize foo here. The assignment in the second
version replaces whatever value foo held before.
> # this is what surprises me... what the?
>> system.time(for(i in 1:100000){n<-sample(1:4,1);y[1:n,]<-x[1:n,];});
> user system elapsed
> 1.510 0.000 1.514
This version extracts rows from x and then assigns them into a
submatrix of y. The second version performs only the extraction and
then binds the result to a name in the evaluation environment, which
is a much faster operation than subassignment.
>> system.time(for(i in 1:100000){n<-sample(1:4,1);foo<-x[1:n,];});
> user system elapsed
> 1.090 0.000 1.085
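To see where the time goes, the pieces can be timed separately. The
following is only a sketch: it reuses x and y from your setup, while
reps, the t_* names, and elapsed are names introduced here for
illustration. It compares extraction alone, extraction plus binding to a
name, and extraction plus assignment into a submatrix of y. Absolute
timings will differ between machines and R versions, so no expected
numbers are shown.

```r
# Same setup as in the original post (foo is not initialized).
set.seed(1)
x <- matrix(runif(20), nrow = 4)
y <- matrix(0, nrow = 12, ncol = 5)
reps <- 100000  # same iteration count as the original loops

# 1. Extraction only: the extracted rows are discarded.
t_extract <- system.time(for (i in seq_len(reps)) {
  n <- sample(1:4, 1)
  x[1:n, ]
})

# 2. Extraction plus binding the result to a name.
t_bind <- system.time(for (i in seq_len(reps)) {
  n <- sample(1:4, 1)
  foo <- x[1:n, ]
})

# 3. Extraction plus assignment into a submatrix of y.
t_subassign <- system.time(for (i in seq_len(reps)) {
  n <- sample(1:4, 1)
  y[1:n, ] <- x[1:n, ]
})

# Collect the elapsed times side by side.
elapsed <- c(extract   = t_extract[["elapsed"]],
             bind      = t_bind[["elapsed"]],
             subassign = t_subassign[["elapsed"]])
print(elapsed)
```

If the explanation above holds, the first two timings should be close to
each other and the third should account for the gap you measured.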
>
> These results are very repeatable. So, if I'm interpreting them correctly,
> dynamically allocating 'foo' each time to whatever the current output size is
> runs faster than writing to a subset of a preallocated 'y'? How is that
> possible?
>
> And, more generally, I'm sure other people have encountered this type of
> situation. Am I reinventing the wheel? Is there a best practice for storing
> temporary loop-specific data?
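One possible pattern, offered as a sketch rather than an established
best practice: since binding to a name is cheap, simply assign the
extracted rows to a fresh name in each iteration. Here row_total is a
hypothetical per-iteration result added only so the loop does something
with foo. The drop = FALSE argument keeps foo a matrix even when n is 1,
so the code that consumes it does not need a special case for a single
row.

```r
set.seed(1)
x <- matrix(runif(20), nrow = 4)

# Hypothetical per-iteration result, used only for illustration.
row_total <- numeric(10)

for (i in seq_along(row_total)) {
  n <- sample(1:4, 1)
  # Bind the current rows to a name; drop = FALSE keeps a matrix
  # even when only one row is selected.
  foo <- x[seq_len(n), , drop = FALSE]
  row_total[i] <- sum(foo)
}
```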
>
> Thanks.
>
> PS: By the way, though I cannot write to foo[,] because the size is
> different each time, I tried writing to foo[] and the runtime was worse than
> either of the above examples.
>
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>