> On 28 Dec 2015, at 19:58, Chris Barker <chris.bar...@noaa.gov> wrote:
> 
> On Wed, Dec 23, 2015 at 4:01 AM, Nicolas P. Rougier 
> <nicolas.roug...@inria.fr> wrote:
> But my implementation is quite slow, especially when you add one item at a 
> time:
> 
> >>> python benchmark.py
> Python list, append 100000 items: 0.01161
> Array list, append 100000 items: 0.46854
> 
> are you pre-allocating any extra space? if not -- it's going to be really, 
> really pokey when adding a little bit at a time.


Yes, I’m preallocating but it might not be optimal at all given your 
implementation is much faster.
I’ll try to adapt your code. Thanks.


> 
> With my Accumulator class:
> 
> https://github.com/PythonCHB/NumpyExtras/blob/master/numpy_extras/accumulator.py
> 
> I pre-allocate a larger numpy array to start, and it gets re-allocated, with 
> some extra, when filled, using ndarray.resize()
> 
> this is quite fast.
> 
> These are settable parameters in the class:
> 
> DEFAULT_BUFFER_SIZE = 128 # original buffer created.
> BUFFER_EXTEND_SIZE = 1.25 # array.array uses 1+1/16 -- that seems small to me.
> 
> 
> I looked at the code in array.array (and list, I think), and it does stuff to 
> optimize very small arrays, which I figured wasn't the use-case here :-)
> 
> But I did a bunch of experimentation, and as long as you pre-allocate _some_ 
> it doesn't make much difference how much :-)
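
For reference, the pre-allocate-and-resize idea described above can be sketched roughly as follows. This is only a minimal illustration, not the Accumulator class from the linked repository: the GrowableArray name and its methods are hypothetical, and only the two constants and the use of ndarray.resize() come from the message. It assumes a 1-D buffer and passes refcheck=False, which is safe here only because no views of the internal buffer are ever handed out.

```python
import numpy as np

DEFAULT_BUFFER_SIZE = 128   # original buffer created (value quoted above)
BUFFER_EXTEND_SIZE = 1.25   # growth factor (value quoted above)


class GrowableArray:
    """Hypothetical 1-D growable array; an illustration only."""

    def __init__(self, dtype=np.float64, capacity=DEFAULT_BUFFER_SIZE):
        self._buffer = np.empty(capacity, dtype=dtype)
        self._length = 0  # number of slots actually in use

    def __len__(self):
        return self._length

    def _grow(self, needed):
        # Grow by the extend factor, but always by at least enough to fit.
        new_capacity = max(needed, int(self._buffer.size * BUFFER_EXTEND_SIZE) + 1)
        # In-place resize; refcheck=False assumes nobody holds a view of _buffer.
        self._buffer.resize(new_capacity, refcheck=False)

    def append(self, value):
        if self._length == self._buffer.size:
            self._grow(self._length + 1)
        self._buffer[self._length] = value
        self._length += 1

    def extend(self, values):
        values = np.asarray(values, dtype=self._buffer.dtype).ravel()
        end = self._length + values.size
        if end > self._buffer.size:
            self._grow(end)
        self._buffer[self._length:end] = values
        self._length = end

    def to_array(self):
        # Return a copy so later resizes cannot invalidate the result.
        return self._buffer[:self._length].copy()
```

Returning a copy from to_array() costs one extra allocation, but it keeps the in-place resize safe; handing out a view instead would need the reference check left on.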
> 
> BTW,
> 
> I just went in and updated and tested the Accumulator class code -- it needed 
> some tweaks, but it's working now.
> 
> The cython version is in an unknown state...
> 
> some profiling:
> 
> In [11]: run profile_accumulator.py
> 
> 
> In [12]: timeit accum1(10000)
> 
> 100 loops, best of 3: 3.91 ms per loop
> 
> In [13]: timeit list1(10000)
> 
> 1000 loops, best of 3: 1.15 ms per loop
> 
> These are simply appending 10,000 integers in a loop -- with the list, the 
> list is turned into a numpy array at the end. So it's still faster to 
> accumulate in a list, then make an array, but only by about a factor of 3 -- I 
> think this is because you are starting with a python integer -- with the 
> accumulator function, you need to be checking type and pulling a native 
> integer out with each append. but a list can append a python object with no 
> type checking or anything.
> 
> Then the conversion from list to array is all in C.
> 
> Note that the accumulator version is still more memory efficient...
> 
> In [14]: timeit accum2(10000)
> 
> 100 loops, best of 3: 3.84 ms per loop
> 
> this version pre-allocated the whole internal buffer -- not much faster, so the 
> buffer re-allocation isn't a big deal (thanks to ndarray.resize using 
> realloc(), and not creating a new numpy array)
> 
> In [24]: timeit list_extend1(100000)
> 
> 100 loops, best of 3: 4.15 ms per loop
> 
> In [25]: timeit accum_extend1(100000)
> 
> 1000 loops, best of 3: 1.37 ms per loop
> 
> This time, the stuff is added in chunks 100 elements at a time -- the chunks 
> being created ahead of time -- a list with range() the first time, and an 
> array with arange() the second. much faster to extend with arrays...
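
The chunked comparison above could be reproduced along these lines. This is a rough sketch under assumptions: the code of list_extend1 and accum_extend1 is not shown in the message, so the two functions below (list_extend and array_extend) are hypothetical stand-ins, and the array version grows a plain numpy buffer with ndarray.resize() rather than using the Accumulator class. Timings will differ by machine and numpy version, so none are given here.

```python
import timeit

import numpy as np


def list_extend(n, chunk=100):
    """Extend a Python list with a pre-built list chunk, then convert once."""
    piece = list(range(chunk))          # chunk created ahead of time
    items = []
    for _ in range(n // chunk):
        items.extend(piece)
    return np.array(items)


def array_extend(n, chunk=100):
    """Copy a pre-built array chunk into a buffer that grows when full."""
    piece = np.arange(chunk)            # chunk created ahead of time
    buf = np.empty(128, dtype=piece.dtype)
    used = 0
    for _ in range(n // chunk):
        end = used + chunk
        if end > buf.size:
            # Grow in place by 1.25x, or to the needed size if that is larger.
            buf.resize(max(end, int(buf.size * 1.25)), refcheck=False)
        buf[used:end] = piece
        used = end
    return buf[:used].copy()


if __name__ == "__main__":
    for fn in (list_extend, array_extend):
        seconds = min(timeit.repeat(lambda: fn(100000), number=10, repeat=3)) / 10
        print(f"{fn.__name__}: {seconds * 1e3:.2f} ms per call")
```

The array version copies each chunk with a single slice assignment, while the list version has to handle the elements as individual Python objects and convert them all at the end.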
> 
> -CHB
> 
> 
> 
> -- 
> 
> Christopher Barker, Ph.D.
> Oceanographer
> 
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
> 
> chris.bar...@noaa.gov
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
