On Tue, Jul 19, 2011 at 4:05 AM, Carlos Becker <carlosbec...@gmail.com> wrote:
> Hi, I started with numpy a few days ago. I was timing some array operations
> and found that numpy takes 3 or 4 times longer than Matlab on a simple
> array-minus-scalar operation.
Doing these kinds of timings correctly is tricky, and the method you used is at fault. It tests more than just the vectorized array-minus-scalar operation: it also times the range() call and loop bookkeeping, as well as the creation and destruction of the result array on every pass, all of which add constant overhead to a result that is itself rather small and therefore susceptible to overhead bias. The Matlab loop-range equivalent, by contrast, is part of the syntax itself and can be optimized better. And depending on the kind of garbage collection Matlab uses, it may defer destruction of the temporaries until after the timing is done (i.e. when it exits), whereas Python has to destroy the result object on each iteration the way you've written it.

First of all, use the 'timeit' module for timing:

    >>> import timeit
    >>> t = timeit.Timer('k = m - 0.5',
    ...                  setup='import numpy as np; m = np.ones([2000,2000], float)')
    >>> np.mean(t.repeat(repeat=100, number=1))
    0.022081942558288575

That will at least give you a more accurate timing of just the subtraction expression itself, and not the loop overhead. Furthermore, you can reuse the m array for the result, rather than allocating a new one, which will give you a better idea of just the vectorized subtraction time:

    >>> t = timeit.Timer('m -= 0.5',
    ...                  setup='import numpy as np; m = np.ones([2000,2000], float)')
    >>> np.mean(t.repeat(repeat=100, number=1))
    0.015955450534820555

Note that the value has dropped considerably. In the end, what you are attempting to time is fairly simple, so any extra overhead you add that is not actually the vectorized subtraction will bias your results. You have to be extremely careful with these timing comparisons, since you may be comparing apples to oranges. At the least, try to give the vectorized code much more work to do; for example, you are operating on only about 32 megs.
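To make the overhead point concrete, here is a self-contained sketch contrasting a naive hand-rolled timing loop with timeit on the same statement (the array shape and repeat counts are just illustrative, not from the original post):

```python
import time
import timeit

import numpy as np

m = np.ones([2000, 2000], float)

# Naive loop timing: includes range() iteration, loop bookkeeping, and
# the creation/destruction of a fresh result array on every pass.
start = time.time()
for i in range(20):
    k = m - 0.5
naive = (time.time() - start) / 20

# timeit with number=1 and several repeats times just the statement;
# taking the minimum suppresses scheduler and GC noise.
t = timeit.Timer('k = m - 0.5',
                 setup='import numpy as np; m = np.ones([2000,2000], float)')
best = min(t.repeat(repeat=20, number=1))

print('naive loop : %.6f s per run' % naive)
print('timeit best: %.6f s per run' % best)
```

The gap between the two numbers is exactly the constant overhead described above, and it shrinks as the array gets larger.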
Try about half a gig instead, and compare that with Matlab, in order to reduce the fraction of overhead in your timings:

    >>> t = timeit.Timer('m -= 0.5',
    ...                  setup='import numpy as np; m = np.ones([8092,8092], float)')
    >>> np.mean(t.repeat(repeat=100, number=1))
    0.26796033143997194

Try comparing these examples to your existing Matlab timings, and you should find Python with numpy comparing favorably (or even beating Matlab). Of course, you could then improve your Matlab timings as well; in the end they should be almost the same when done properly. If not, by all means let us know.

-Chad

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion