Re: [Numpy-discussion] About the npz format

2014-07-06 Thread Sturla Molden
There is no os.mkfifo on Windows. Sturla Valentin Haenel wrote: > sorry, for the top-post, but should we add this as an issue on the > github tracker? I'd like to revisit it this summer. > > V- > > * Julian Taylor [2014-04-18]: >> On 18.04.2014 18:29, Valentin Haenel wrote: >>> Hi, >>> >>> *
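The portability point above can be checked at runtime. The following is a minimal sketch, not code from the thread: the helper name `make_pipe_or_file` and the file name `npz_stream` are hypothetical, and falling back to a regular file is only one possible assumption about what a caller might do where named pipes are unavailable.

```python
import os
import tempfile


def make_pipe_or_file(directory):
    """Create a named pipe where the platform supports it, else a plain file.

    Returns (path, is_fifo). The name "npz_stream" is purely illustrative.
    """
    path = os.path.join(directory, "npz_stream")
    if hasattr(os, "mkfifo"):
        # POSIX systems expose os.mkfifo.
        os.mkfifo(path)
        return path, True
    # os.mkfifo is absent on Windows, so fall back to an empty regular file.
    with open(path, "wb"):
        pass
    return path, False


with tempfile.TemporaryDirectory() as tmp:
    path, is_fifo = make_pipe_or_file(tmp)
    print(path, "named pipe" if is_fifo else "regular file")
```

Any design that depends on a FIFO therefore needs a separate code path for Windows.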

Re: [Numpy-discussion] About the npz format

2014-07-04 Thread Valentin Haenel
Sorry for the top-post, but should we add this as an issue on the github tracker? I'd like to revisit it this summer. V- * Julian Taylor [2014-04-18]: > On 18.04.2014 18:29, Valentin Haenel wrote: > > Hi, > > > > * Valentin Haenel [2014-04-17]: > >> * Valentin Haenel [2014-04-17]: > >>> * Ju

Re: [Numpy-discussion] About the npz format

2014-04-18 Thread Julian Taylor
On 18.04.2014 18:29, Valentin Haenel wrote: > Hi, > > * Valentin Haenel [2014-04-17]: >> * Valentin Haenel [2014-04-17]: >>> * Julian Taylor [2014-04-17]: >>>> On 17.04.2014 21:30, onefire wrote: >>>>> Thanks for the suggestion. I did profile the program before, just not >>>>> using Python. >>>

Re: [Numpy-discussion] About the npz format

2014-04-18 Thread Valentin Haenel
Hi, * Valentin Haenel [2014-04-17]: > * Valentin Haenel [2014-04-17]: > > * Julian Taylor [2014-04-17]: > > > On 17.04.2014 21:30, onefire wrote: > > > > Thanks for the suggestion. I did profile the program before, just not > > > > using Python. > > > > > > one problem of npz is that the zipfi

Re: [Numpy-discussion] About the npz format

2014-04-18 Thread Francesc Alted
El 18/04/14 13:01, Valentin Haenel ha escrit: > Hi again, > > * onefire [2014-04-18]: >> I think your workaround might help, but a better solution would be to not >> use Python's zipfile module at all. This would make it possible to, say, >> let the user choose the checksum algorithm or to turn th

Re: [Numpy-discussion] About the npz format

2014-04-18 Thread Valentin Haenel
Hi again, * onefire [2014-04-18]: > I think your workaround might help, but a better solution would be to not > use Python's zipfile module at all. This would make it possible to, say, > let the user choose the checksum algorithm or to turn that off. > Or maybe the compression stuff makes this ro

Re: [Numpy-discussion] About the npz format

2014-04-18 Thread Valentin Haenel
Hi Gilberto, * onefire [2014-04-18]: > Interesting! Using sync() as you suggested makes every write slower, and > it decreases the time difference between save and savez, > so maybe I was observing the 10 times difference because the file system > buffers were being flushed immediately after a c

Re: [Numpy-discussion] About the npz format

2014-04-17 Thread onefire
I found this github issue (https://github.com/numpy/numpy/pull/3465) where someone mentions the idea of forking the zip library. Gilberto On Thu, Apr 17, 2014 at 8:09 PM, onefire wrote: > Interesting! Using sync() as you suggested makes every write slower, and > it decreases the time differen

Re: [Numpy-discussion] About the npz format

2014-04-17 Thread onefire
Interesting! Using sync() as you suggested makes every write slower, and it decreases the time difference between save and savez, so maybe I was observing the 10 times difference because the file system buffers were being flushed immediately after a call to savez, but not right after a call to np.

Re: [Numpy-discussion] About the npz format

2014-04-17 Thread Valentin Haenel
Hello, * Valentin Haenel [2014-04-17]: > As part of bloscpack.sysutil I have wrapped this to be available from > Python (needs root though). So, to re-run the benchmarks, doing each > one twice: Actually, I just realized that doing a ``sync`` doesn't require root. My bad, V-
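For readers who want to reproduce this kind of measurement, here is a minimal sketch of timing a write with and without forcing the data to disk. It is not the bloscpack.sysutil wrapper mentioned above: it swaps in the standard library's per-file `os.fsync` for a global `sync`, and the helper name `timed_write` and the payload size are assumptions. On Unix, Python 3.3+ also offers `os.sync()`, which needs no root privileges.

```python
import os
import tempfile
import time


def timed_write(path, payload, flush_to_disk):
    """Write payload to path and return the elapsed time in seconds.

    With flush_to_disk=True the timing includes pushing the data out of
    the OS buffers, so it is not flattered by write-back caching.
    """
    start = time.perf_counter()
    with open(path, "wb") as fh:
        fh.write(payload)
        if flush_to_disk:
            fh.flush()             # empty Python's own buffer first
            os.fsync(fh.fileno())  # then ask the OS to write this file out
    return time.perf_counter() - start


payload = b"\x00" * (8 * 1024 * 1024)  # 8 MiB of illustrative data
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "bench.bin")
    print("buffered:", timed_write(target, payload, flush_to_disk=False))
    print("synced:  ", timed_write(target, payload, flush_to_disk=True))
```

The numbers depend heavily on the file system and hardware, so each variant is best run several times, as was done in the thread.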

Re: [Numpy-discussion] About the npz format

2014-04-17 Thread Valentin Haenel
Hi, * Valentin Haenel [2014-04-17]: > * Valentin Haenel [2014-04-17]: > > * Valentin Haenel [2014-04-17]: > > > Hi, > > > > > > * Julian Taylor [2014-04-17]: > > > > On 17.04.2014 21:30, onefire wrote: > > > > > Hi Nathaniel, > > > > > > > > > > Thanks for the suggestion. I did profile the pro

Re: [Numpy-discussion] About the npz format

2014-04-17 Thread Valentin Haenel
* Valentin Haenel [2014-04-17]: > * Valentin Haenel [2014-04-17]: > > Hi, > > > > * Julian Taylor [2014-04-17]: > > > On 17.04.2014 21:30, onefire wrote: > > > > Hi Nathaniel, > > > > > > > > Thanks for the suggestion. I did profile the program before, just not > > > > using Python. > > > > >

Re: [Numpy-discussion] About the npz format

2014-04-17 Thread Valentin Haenel
* Valentin Haenel [2014-04-17]: > Hi, > > * Julian Taylor [2014-04-17]: > > On 17.04.2014 21:30, onefire wrote: > > > Hi Nathaniel, > > > > > > Thanks for the suggestion. I did profile the program before, just not > > > using Python. > > > > one problem of npz is that the zipfile module does n

Re: [Numpy-discussion] About the npz format

2014-04-17 Thread Valentin Haenel
Hi, * Julian Taylor [2014-04-17]: > On 17.04.2014 21:30, onefire wrote: > > Hi Nathaniel, > > > > Thanks for the suggestion. I did profile the program before, just not > > using Python. > > one problem of npz is that the zipfile module does not support streaming > data in (or if it does now we

Re: [Numpy-discussion] About the npz format

2014-04-17 Thread Valentin Haenel
Hi again, * David Palao [2014-04-17]: > 2014-04-16 20:26 GMT+02:00 onefire : > > Hi all, > > > > I have been playing with the idea of using Numpy's binary format as a > > lightweight alternative to HDF5 (which I believe is the "right" way to do if > > one does not have a problem with the dependen

Re: [Numpy-discussion] About the npz format

2014-04-17 Thread Julian Taylor
On 17.04.2014 21:30, onefire wrote: > Hi Nathaniel, > > Thanks for the suggestion. I did profile the program before, just not > using Python. one problem of npz is that the zipfile module does not support streaming data in (or if it does now we aren't using it). So numpy writes the file uncompres
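The snippet above is cut off, so the rest of the description is not reproduced here. As a side note rather than anything proposed in the thread: since Python 3.6, `zipfile.ZipFile.open` accepts `mode="w"`, which allows writing an archive member chunk by chunk. The sketch below only illustrates that call; the helper name `stream_into_zip`, the member name, and the chunk contents are hypothetical, and it is not how NumPy's `savez` is implemented.

```python
import io
import zipfile


def stream_into_zip(archive, member_name, chunks):
    """Write an iterable of bytes chunks into one member of a zip archive.

    Requires Python 3.6+, where ZipFile.open() supports mode="w".
    """
    with zipfile.ZipFile(archive, mode="w", compression=zipfile.ZIP_STORED) as zf:
        with zf.open(member_name, mode="w") as member:
            for chunk in chunks:
                member.write(chunk)  # no intermediate temporary file needed


buffer = io.BytesIO()
stream_into_zip(buffer, "arr_0.npy", (bytes([i]) * 1024 for i in range(4)))

with zipfile.ZipFile(buffer) as zf:
    print(zf.namelist(), len(zf.read("arr_0.npy")))
```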

Re: [Numpy-discussion] About the npz format

2014-04-17 Thread onefire
Hi Nathaniel, Thanks for the suggestion. I did profile the program before, just not using Python. But following your suggestion, I used %prun. Here's (part of) the output (when I use savez): 195503 function calls in 4.466 seconds Ordered by: internal time ncalls tottime percall cumti

Re: [Numpy-discussion] About the npz format

2014-04-17 Thread Nathaniel Smith
On 17 Apr 2014 01:57, "onefire" wrote: > > What I cannot understand is why savez takes more than 10 times longer than saving the data to a npy file. The only reason that I could come up with was the computation of the crc32. We can all make guesses but the solution is just to profile it :-). %pru
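`%prun` is an IPython magic; outside IPython, the standard library's `cProfile` and `pstats` give the same kind of report. The sketch below is only an illustration: `workload` is a hypothetical stand-in built from `zlib` calls, not the `savez` call discussed in the thread, and would be replaced by the function actually under investigation.

```python
import cProfile
import io
import pstats
import zlib


def workload():
    """Hypothetical stand-in for the code being investigated."""
    data = bytes(range(256)) * 4096
    return zlib.crc32(data), len(zlib.compress(data, 1))


def profile(func, limit=10):
    """Run func under cProfile and return a report sorted by internal time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("tottime").print_stats(limit)
    return out.getvalue()


report = profile(workload)
print(report)
```

Sorting by `tottime` puts the functions that spend the most time in their own bodies at the top, which is usually the quickest way to confirm or rule out a suspect such as the checksum.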

Re: [Numpy-discussion] About the npz format

2014-04-17 Thread David Palao
2014-04-16 20:26 GMT+02:00 onefire : > Hi all, > > I have been playing with the idea of using Numpy's binary format as a > lightweight alternative to HDF5 (which I believe is the "right" way to do if > one does not have a problem with the dependency). > > I am pretty happy with the npy format, but

Re: [Numpy-discussion] About the npz format

2014-04-16 Thread onefire
Valentin Haenel, Bloscpack definitely looks interesting but I need to take a careful look first. I will let you know if I like it. Thanks for the suggestion! I think you and Nathaniel Smith misunderstood my questions (my fault, since I did not explain myself well!). First, Numpy's savez will not d

Re: [Numpy-discussion] About the npz format

2014-04-16 Thread Nathaniel Smith
crc32 is extremely fast, and I think zip might use adler32 instead, which is even faster. OTOH compression is incredibly slow, unless you're using one of the 'just a little bit of compression' formats like blosc or lzo1. If your npz files are compressed then this is certainly the culprit. The zip form
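The relative cost of checksumming and compression is easy to measure with the standard library. The following is a rough sketch: the helper name `time_call`, the payload, and the compression level are assumptions, and the timings will vary with the machine and with how compressible the data is. For reference, the ZIP format records a CRC-32 for each member; Adler-32 is the checksum used by the zlib stream format.

```python
import time
import zlib


def time_call(func, data):
    """Return (result, elapsed seconds) for a single call of func(data)."""
    start = time.perf_counter()
    result = func(data)
    return result, time.perf_counter() - start


data = bytes(range(256)) * 40_000  # roughly 10 MB of illustrative data

_, crc_time = time_call(zlib.crc32, data)
_, adler_time = time_call(zlib.adler32, data)
_, deflate_time = time_call(zlib.compress, data)  # default compression level

print(f"crc32:    {crc_time:.4f} s")
print(f"adler32:  {adler_time:.4f} s")
print(f"compress: {deflate_time:.4f} s")
```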

Re: [Numpy-discussion] About the npz format

2014-04-16 Thread Valentin Haenel
Hi Gilberto, * onefire [2014-04-16]: > I have been playing with the idea of using Numpy's binary format as a > lightweight alternative to HDF5 (which I believe is the "right" way to do > if one does not have a problem with the dependency). > > I am pretty happy with the npy format, but the npz

[Numpy-discussion] About the npz format

2014-04-16 Thread onefire
Hi all, I have been playing with the idea of using Numpy's binary format as a lightweight alternative to HDF5 (which I believe is the "right" way to go if one does not have a problem with the dependency). I am pretty happy with the npy format, but the npz format seems to be broken as far as perfo