On Tue, Jul 19, 2011 at 3:35 PM, Carlos Becker <carlosbec...@gmail.com> wrote:
> Thanks Chad for the explanation on those details. I am new to python and I
> However, if I don't, I obtain this 4x penalty with numpy, even with the
> 8092x8092 array. Would it be possible to do k = m - 0.5 and pre-allocate k
> such that python does not have to waste time on that?

I suspect the 4x penalty is related to the expression evaluation overhead
(temporaries and copying), so hopefully numexpr will help, or just
remembering to use the in-place operators whenever appropriate.

To answer your question, though: you can allocate an array without
initializing it with the empty() function. Note that if you aren't
absolutely sure you are going to overwrite every single element of the
array, this can leave you with uninitialized values in your array. I'd just
go ahead and use the zeros() function instead, to be safe (it gets
initialized in the timeit() setup, outside the timed statement):

% python
>>> import timeit
>>> import numpy as np
>>> t = timeit.Timer('k = m - 0.5', setup='import numpy as np; m = np.ones([8092,8092],float); k = np.zeros(m.shape, m.dtype)')
>>> np.mean(t.repeat(repeat=10, number=1))
0.58557529449462886
>>> t = timeit.Timer('k = m - 0.5', setup='import numpy as np; m = np.ones([8092,8092],float)')
>>> np.mean(t.repeat(repeat=10, number=1))
0.53153839111328127
>>> t = timeit.Timer('m -= 0.5', setup='import numpy as np; m = np.ones([8092,8092],float)')
>>> np.mean(t.repeat(repeat=10, number=1))
0.038648796081542966

As you can see, preallocation doesn't seem to affect the results much; it's
the overhead of creating a temporary and then copying it into the result
that seems to matter here. (Note that 'k = m - 0.5' always allocates a
brand-new result array and rebinds k to it, so the preallocated k is never
actually reused.) The in-place operation was much faster.

Here we see that just copying m to k takes more time than the 'k = m - 0.5'
operation:

>>> t = timeit.Timer('k = m.copy()', setup='import numpy as np; m = np.ones([8092,8092],float)')
>>> np.mean(t.repeat(repeat=10, number=1))
0.63301105499267574

Possibly that is because 8K*8K matrices are a bit too big for this kind of
benchmark; I recommend also trying 4K*4K, and your original 2K*2K, to see
whether the results are consistent.

Remember that the timeit() setup hides the initial allocation of m from the
results, but that allocation still exists and should be accounted for when
judging the overall execution time of the in-place approach.

Also, with these large array sizes, make sure the tests run in a fresh
python instance, so that the process address space isn't cluttered with old
object allocations (which may cause your OS to swap the now-unused memory
and ruin your timing values).

-Chad
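P.S. One more note on reusing a pre-allocated k: a plain 'k = m - 0.5'
always allocates a fresh result array and rebinds the name k to it, so
pre-allocating doesn't buy you anything there. To actually write into an
existing array you have to hand it to the ufunc yourself. Here's a rough,
untested sketch of the options (the last part assumes you have the numexpr
package installed):

    import numpy as np

    m = np.ones([8092, 8092], float)
    k = np.zeros(m.shape, m.dtype)   # pre-allocated output array

    # 1) Plain expression: allocates a brand-new result array and
    #    rebinds the name k to it; the pre-allocated array is unused.
    k = m - 0.5

    # 2) Explicit output argument: the ufunc writes the result directly
    #    into an existing array, no fresh allocation for the result.
    out = np.zeros(m.shape, m.dtype)
    np.subtract(m, 0.5, out)

    # 3) In-place operator: overwrites m itself, no extra allocation.
    m -= 0.5

    # 4) numexpr evaluates the whole expression in one pass, which helps
    #    most with longer expressions (optional dependency):
    # import numexpr as ne
    # k = ne.evaluate("m - 0.5")

Option 2 is the one that actually answers your question about pre-allocating
k; options 3 and 4 are what I'd reach for when the expression gets longer.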