Antoine Pitrou <solipsis <at> pitrou.net> writes:

>
> On Thu, 12 May 2016 06:27:43 +0000 (UTC)
> Sturla Molden <sturla.molden <at> gmail.com> wrote:
>
> > Allan Haldane <allanhaldane <at> gmail.com> wrote:
> >
> > > You probably already know this, but I just wanted to note that the
> > > mpi4py module has worked around pickle too. They discuss how they
> > > efficiently transfer numpy arrays in mpi messages here:
> > > http://pythonhosted.org/mpi4py/usrman/overview.html#communicating-python-objects-and-array-data
> >
> > Unless I am mistaken, they use the PEP 3118 buffer interface to support
> > NumPy as well as a number of other Python objects. However, this protocol
> > makes buffer acquisition an expensive operation.
>
> Can you define "expensive"?
>
> > You can see this in Cython
> > if you use typed memory views. Assigning a NumPy array to a typed
> > memoryview (i.e., buffer acquisition) is slow.
>
> You're assuming this is the cost of "buffer acquisition", while most
> likely it's the cost of creating the memoryview object itself.
>
> Buffer acquisition itself only calls a single C callback and uses a
> stack-allocated C structure. It shouldn't be "expensive".
>
> Regards
>
> Antoine.
>
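
For reference, here is a minimal pure-Python sketch of what acquiring a buffer through the PEP 3118 interface exposes for a NumPy array. It uses the built-in memoryview rather than a Cython typed memoryview, so it only illustrates the protocol and says nothing about Cython's overhead. The helper name describe_buffer is hypothetical, chosen for this example.

```python
import numpy as np


def describe_buffer(arr):
    """Acquire the PEP 3118 buffer exported by `arr` and report its layout.

    `describe_buffer` is a hypothetical helper for illustration only.
    """
    # memoryview(arr) asks the array for its buffer; leaving the `with`
    # block releases it again.
    with memoryview(arr) as view:
        return {
            "format": view.format,      # struct-style type code
            "itemsize": view.itemsize,  # bytes per element
            "shape": view.shape,
            "strides": view.strides,
            "readonly": view.readonly,
        }


x = np.arange(4, dtype=np.float64)
info = describe_buffer(x)
print(info)

# The view shares memory with the array: no copy is made, so a write
# through the view is visible in the array.
view = memoryview(x)
view[0] = 42.0
view.release()
print(x[0])
```

Because the buffer is shared rather than copied, any cost being discussed is in setting up the view, not in moving the array data.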
When I looked at it, using a typed memoryview was between 7 and 50 times
slower than using numpy directly:

http://thread.gmane.org/gmane.comp.python.cython.devel/14626

It looks like there has been some improvement since then:

https://github.com/numpy/numpy/pull/3779

...and repeating my experiment shows the deficit is now down to roughly
3 to 11 times slower.

In [5]: x = randn(10000)

In [6]: %timeit echo_memview(x)
The slowest run took 14.98 times longer than the fastest. This could mean
that an intermediate result is being cached.
100000 loops, best of 3: 5.31 µs per loop

In [7]: %timeit echo_memview_nocast(x)
The slowest run took 10.80 times longer than the fastest. This could mean
that an intermediate result is being cached.
1000000 loops, best of 3: 1.58 µs per loop

In [8]: %timeit echo_numpy(x)
The slowest run took 58.81 times longer than the fastest. This could mean
that an intermediate result is being cached.
1000000 loops, best of 3: 474 ns per loop

-Dave
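
A rough pure-Python analogue of this kind of measurement can be sketched with timeit. This is not the Cython benchmark from the linked thread: passthrough_array, acquire_buffer and best_time are hypothetical names, and the built-in memoryview stands in for a Cython typed memoryview, so absolute numbers and ratios should not be expected to match the timings above.

```python
import timeit

import numpy as np


def passthrough_array(arr):
    """Baseline: hand the array back without touching its buffer."""
    return np.asarray(arr)


def acquire_buffer(arr):
    """Create a memoryview (PEP 3118 buffer acquisition), then release it."""
    view = memoryview(arr)
    view.release()
    return arr


def best_time(func, arg, number=10000, repeat=5):
    """Return the best per-call time in seconds over `repeat` runs."""
    timer = timeit.Timer(lambda: func(arg))
    return min(timer.repeat(repeat=repeat, number=number)) / number


x = np.random.randn(10000)
t_array = best_time(passthrough_array, x)
t_buffer = best_time(acquire_buffer, x)

print("passthrough_array: %.0f ns per call" % (t_array * 1e9))
print("acquire_buffer:    %.0f ns per call" % (t_buffer * 1e9))
print("ratio: %.2f" % (t_buffer / t_array))
```

Taking the minimum over several repeats reduces the effect of the run-to-run variation that %timeit warns about in the session above.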