Re: [Numpy-discussion] About the npz format

2014-04-17 Thread onefire
I found this GitHub issue (https://github.com/numpy/numpy/pull/3465) where someone mentions the idea of forking the zip library.

Gilberto

On Thu, Apr 17, 2014 at 8:09 PM, onefire wrote:
> Interesting! Using sync() as you suggested makes every write slower, and
> it decreases th

Re: [Numpy-discussion] About the npz format

2014-04-17 Thread onefire
Interesting! Using sync() as you suggested makes every write slower, and it decreases the time difference between save and savez, so maybe I was observing the tenfold difference because the file system buffers were being flushed immediately after a call to savez, but not right after a call to np.
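
The buffering effect described in this message can be illustrated with a minimal sketch using only the standard library. It is not the benchmark from the thread: the helper name `timed_write`, the 8 MiB payload, and the use of `os.fsync` on a single file (rather than a system-wide sync()) are all assumptions made for illustration.

```python
import os
import tempfile
import time


def timed_write(payload: bytes, force_sync: bool) -> float:
    """Return the seconds spent writing payload to a temporary file.

    Hypothetical helper: with force_sync=True the timing includes the
    flush of the OS cache to disk, otherwise it may only measure the
    copy into the file system buffers.
    """
    with tempfile.NamedTemporaryFile(delete=False) as fh:
        path = fh.name
        start = time.perf_counter()
        fh.write(payload)
        fh.flush()                  # push Python's own buffer to the OS
        if force_sync:
            os.fsync(fh.fileno())   # ask the OS to write its cache to disk
        elapsed = time.perf_counter() - start
    os.remove(path)
    return elapsed


payload = os.urandom(8 * 1024 * 1024)  # 8 MiB of arbitrary bytes
buffered = timed_write(payload, force_sync=False)
synced = timed_write(payload, force_sync=True)
print(f"buffered: {buffered:.4f} s, synced: {synced:.4f} s")
```

The absolute numbers depend on the disk, the file system, and whatever else the machine is doing, so a comparison between two writers is only meaningful if both are timed the same way (both synced or both buffered).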

Re: [Numpy-discussion] About the npz format

2014-04-17 Thread onefire
rsion (maybe this explains David Palao's results?). However, they all point out that a significant amount of time is spent computing the crc32. Notice that prun reports that it takes 0.379 seconds to compute the crc32 of an array that takes 0.2 seconds to save to an npy file. I believe this is t
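
To make the crc32 point concrete, here is a small sketch, again with the standard library only. It shows that a zip archive records a CRC-32 for a member even when the member is stored without compression, and it times the checksum on its own. The payload (16 MiB of zero bytes) and the member name `arr_0.npy` are placeholders, not data from the thread, and the timing will not reproduce the prun figures quoted above.

```python
import io
import time
import zipfile
import zlib

data = bytes(16 * 1024 * 1024)  # 16 MiB placeholder payload (all zeros)

# Time the checksum by itself.
start = time.perf_counter()
expected_crc = zlib.crc32(data)
crc_seconds = time.perf_counter() - start

# Write the same bytes as an uncompressed (stored) zip member.
archive = io.BytesIO()
with zipfile.ZipFile(archive, mode="w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("arr_0.npy", data)  # illustrative member name

# Read back the CRC that the zip writer recorded for the member.
with zipfile.ZipFile(archive) as zf:
    stored_crc = zf.getinfo("arr_0.npy").CRC

print(f"crc32 took {crc_seconds:.4f} s")
print("zip member CRC matches zlib.crc32:", stored_crc == expected_crc)
```

Since the checksum is part of the zip member header, skipping compression does not skip the crc32 pass over the data.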

Re: [Numpy-discussion] About the npz format

2014-04-16 Thread onefire
> the 'just a little bit of compression' formats like blosc or lzo1. If
> your npz files are compressed then this is certainly the culprit.
>
> The zip format supports storing files without compression. Maybe what you
> want is an option to use this with .npz?
>
> -n
>
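
The stored-versus-compressed distinction mentioned in the quoted reply can be seen directly with Python's zipfile module. The sketch below is illustrative only: `archive_member` and the member name `payload.bin` are hypothetical, and the payload is deliberately compressible. For reference, np.savez writes its members uncompressed, while np.savez_compressed uses deflate.

```python
import io
import zipfile


def archive_member(data: bytes, compression: int) -> tuple:
    """Write data as one zip member; return (file_size, compress_size).

    Hypothetical helper for illustration; compression is one of the
    zipfile constants such as ZIP_STORED or ZIP_DEFLATED.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, mode="w", compression=compression) as zf:
        zf.writestr("payload.bin", data)
    with zipfile.ZipFile(buf) as zf:
        info = zf.getinfo("payload.bin")
        return info.file_size, info.compress_size


data = b"0123456789" * 100_000  # 1,000,000 highly compressible bytes

stored = archive_member(data, zipfile.ZIP_STORED)
deflated = archive_member(data, zipfile.ZIP_DEFLATED)
print("stored   (file_size, compress_size):", stored)
print("deflated (file_size, compress_size):", deflated)
```

With ZIP_STORED the two sizes are equal, which confirms that the bytes are written as-is; any remaining overhead relative to a plain file write has to come from elsewhere in the zip machinery.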

[Numpy-discussion] About the npz format

2014-04-16 Thread onefire
Hi all, I have been playing with the idea of using NumPy's binary format as a lightweight alternative to HDF5 (which I believe is the "right" way to do it if one does not have a problem with the dependency). I am pretty happy with the npy format, but the npz format seems to be broken as far as perfo