On 05/11/2016 04:29 AM, Sturla Molden wrote:
> 4. The reason IPC appears expensive with NumPy is because multiprocessing
> pickles the arrays. It is pickle that is slow, not the IPC. Some would say
> that the pickle overhead is an integral part of the IPC overhead, but I
> will argue that it is not. The slowness of pickle is a separate problem
> altogether.
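
One way to look at the pickle cost on its own, with no process boundary involved, is to time a dumps/loads round trip directly. This is only a rough sketch and not something from the thread: it assumes Python 3.6+ and NumPy, and the helper name pickle_roundtrip_seconds, the array size, and the repeat count are my own arbitrary choices.

```python
import pickle
import timeit

import numpy as np


def pickle_roundtrip_seconds(arr, protocol, repeats=200):
    """Average seconds for one dumps + loads cycle at the given protocol.

    Hypothetical helper for illustration; no IPC is involved here.
    """
    def roundtrip():
        return pickle.loads(pickle.dumps(arr, protocol=protocol))

    return timeit.timeit(roundtrip, number=repeats) / repeats


a = np.arange(40 * 40)

# Compare the oldest text protocol, protocol 2, and the newest available one.
timings = {
    p: pickle_roundtrip_seconds(a, p)
    for p in (0, 2, pickle.HIGHEST_PROTOCOL)
}

for p, t in timings.items():
    print(f"protocol {p}: {t * 1e6:.1f} us per round trip")
```

The absolute numbers will vary by machine and NumPy version, so this says nothing definitive about how much of the total IPC time is pickle; it only isolates the serialization step.
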

That's interesting. I've also used multiprocessing with NumPy and didn't
realize that. Is this true in Python 3 too?

In Python 2 it appears that multiprocessing uses pickle protocol 0, which
must cause a big slowdown (roughly a factor of 100 in the timings below)
relative to protocol 2, and that it uses pickle instead of cPickle.

    # Python 2 session
    import pickle, cPickle
    import numpy as np

    a = np.arange(40*40)

    %timeit pickle.dumps(a)
    1000 loops, best of 3: 1.63 ms per loop

    %timeit cPickle.dumps(a)
    1000 loops, best of 3: 1.56 ms per loop

    %timeit cPickle.dumps(a, protocol=2)
    100000 loops, best of 3: 18.9 µs per loop

Python 3 uses protocol 3 by default:

    %timeit pickle.dumps(a)
    10000 loops, best of 3: 20 µs per loop

> 5. Shared memory does not improve on the pickle overhead because NumPy
> arrays with shared memory must also be pickled. Multiprocessing can bypass
> pickling the RawArray object, but the rest of the NumPy array is pickled.
> Using shared memory arrays has no speed advantage over normal NumPy arrays
> when we use multiprocessing.
>
> 6. It is much easier to write concurrent code that uses queues for message
> passing than anything else. That is why using a Queue object has been the
> popular Pythonic approach to both multithreading and multiprocessing. I
> would like this to continue.
>
> I am therefore focusing my effort on the multiprocessing.Queue object. If
> you understand the six points I listed you will see where this is going:
> What we really need is a specialized queue that has knowledge about NumPy
> arrays and can bypass pickle. I am therefore focusing my efforts on
> creating a NumPy aware queue object.
>
> We are not doing the users a favor by encouraging the use of shared memory
> arrays. They help with nothing.
>
>
> Sturla Molden

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion