Hi,

On Fri, Jun 22, 2012 at 6:05 PM, Benjamin Root <ben.r...@ou.edu> wrote:
> On Fri, Jun 22, 2012 at 10:25 AM, Travis Oliphant <tra...@continuum.io> wrote:
>
>> Accessing individual elements of NumPy arrays is slower than accessing
>> individual elements of lists --- around 2.5x-3x slower. NumPy has to do
>> more work to figure out what kind of indexing you are trying to do because
>> of its flexibility. It also has to create the Python object to return.
>> In contrast, the list approach already has the Python objects created;
>> you are just returning pointers to them, and there is much less flexibility
>> in the kinds of indexing you can do.
>>
>> Simple timings show that a.item(i, j) is about 2x slower than list element
>> access (but faster than a[i, j], which is about 2.5x to 3x slower). The
>> slowness of a.item is due to the need to create the Python object to return
>> (there are just raw bytes there), so it gives some idea of the relative
>> cost of each part of the slowness of a[i, j].
>>
>> Also, math on the array scalars returned from NumPy will be slower than
>> math on integers and floats --- because NumPy re-uses the ufunc machinery,
>> which is not optimized at all for scalars.
>>
>> The take-away is that NumPy is built for doing vectorized operations on
>> blocks of data. It is not optimized for doing element-by-element
>> individual access. The right approach there is to just use lists (or use
>> a version specialized for the kind of data in the lists that removes the
>> boxing and unboxing).
>> Here are my timings using IPython for NumPy indexing:
>>
>> 1-D:
>>
>> In [2]: a = arange(100)
>>
>> In [3]: %timeit [a.item(i) for i in xrange(100)]
>> 10000 loops, best of 3: 25.6 us per loop
>>
>> In [4]: %timeit [a[i] for i in xrange(100)]
>> 10000 loops, best of 3: 31.8 us per loop
>>
>> In [5]: al = a.tolist()
>>
>> In [6]: %timeit [al[i] for i in xrange(100)]
>> 100000 loops, best of 3: 10.6 us per loop
>>
>> 2-D:
>>
>> In [7]: a = arange(100).reshape(10,10)
>>
>> In [8]: al = a.tolist()
>>
>> In [9]: %timeit [al[i][j] for i in xrange(10) for j in xrange(10)]
>> 10000 loops, best of 3: 18.6 us per loop
>>
>> In [10]: %timeit [a[i,j] for i in xrange(10) for j in xrange(10)]
>> 10000 loops, best of 3: 44.4 us per loop
>>
>> In [11]: %timeit [a.item(i,j) for i in xrange(10) for j in xrange(10)]
>> 10000 loops, best of 3: 34.2 us per loop
>>
>> -Travis
>
> However, what is the timing/memory cost of converting a large NumPy array
> that already exists into a Python list of lists? If all my processing
> before the munkres step is using NumPy, converting it into Python lists
> has a cost. Also, your timings indicate only a ~2x slowdown, while the
> timing tests done by eat show an order-of-magnitude difference. I suspect
> there is great room for improvement before even starting to worry about
> the array access issues.

Creating a list of lists from an array is quite fast, e.g.:

In []: A = rand(500, 500)

In []: %timeit A.tolist()
100 loops, best of 3: 10.8 ms per loop

Regards,
-eat

> Cheers!
> Ben Root
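For reference, eat's one-off conversion cost can be reproduced with plain `timeit` (a sketch, assuming a recent NumPy; the 500x500 size matches the example above, and absolute times will differ across machines):

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((500, 500))

# One-off cost of converting the whole array into nested Python lists.
t_convert = timeit.timeit("A.tolist()", globals={"A": A}, number=10)
print(f"A.tolist() on 500x500: {t_convert / 10 * 1e3:.2f} ms per call")

# The result is plain lists of Python floats -- elements are already
# boxed, so later per-element access pays no unboxing cost.
AL = A.tolist()
print(type(AL).__name__, type(AL[0]).__name__, type(AL[0][0]).__name__)
```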
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion