Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution

2010-04-23 Thread Antoine Pitrou
Dan Gindikin writes:
> Antoine Pitrou writes:
> > Does cPickle bytecode have some kind of NOP instruction? You could keep track of which PUTs weren't necessary and zero them out at the end. It would be much cheaper than writing a whole other "optimized" stre

Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution

2010-04-23 Thread Dan Gindikin
Antoine Pitrou writes:
> Does cPickle bytecode have some kind of NOP instruction? You could keep track of which PUTs weren't necessary and zero them out at the end. It would be much cheaper than writing a whole other "optimized" stream.
For a large file, I'm not sure it is much fa

Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution

2010-04-23 Thread Dan Gindikin
Collin Winter writes:
> I don't think it's possible in general to remove any PUTs if the pickle is being written to a file-like object. It is possible to reuse a single Pickler to pickle multiple objects: this causes the Pickler's memo dict to be shared between the objects being
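Collin Winter's point about a reused Pickler can be illustrated with a minimal sketch. It assumes Python 3's `pickle` and `pickletools` modules rather than the Python 2 `cPickle` discussed in the thread, and the names `payload`, `buf`, and `opcode_names` are made up for the example. The second `dump()` emits only a GET that refers back to a PUT written by the first, so a pass that inspects one pickle in isolation cannot tell whether a PUT is unused.

```python
import io
import pickle
import pickletools

payload = ["spam", "eggs"]  # hypothetical object, pickled twice

buf = io.BytesIO()
# Protocol 2 is chosen here because it uses explicit PUT/GET opcodes.
pickler = pickle.Pickler(buf, protocol=2)
pickler.dump(payload)
first_len = buf.tell()
pickler.dump(payload)  # same Pickler, so the memo from the first dump is reused

stream = buf.getvalue()
first, second = stream[:first_len], stream[first_len:]


def opcode_names(data):
    """Return the opcode names of the first pickle found in data."""
    return [op.name for op, arg, pos in pickletools.genops(data)]


first_ops = opcode_names(first)
second_ops = opcode_names(second)
print(second_ops)

# Reading both pickles back requires a single Unpickler, whose memo
# carries over from the first load() to the second.
unpickler = pickle.Unpickler(io.BytesIO(stream))
a = unpickler.load()
b = unpickler.load()
print(a == payload, b is a)
```

Stripping the PUT for the list from the first pickle would leave the second pickle's GET pointing at a memo entry that was never stored.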

Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution

2010-04-23 Thread Antoine Pitrou
Collin Winter writes:
> I don't think it's possible in general to remove any PUTs if the pickle is being written to a file-like object.
Does cPickle bytecode have some kind of NOP instruction? You could keep track of which PUTs weren't necessary and zero them out at the end. It w

Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution

2010-04-23 Thread Collin Winter
On Fri, Apr 23, 2010 at 1:53 PM, Alexandre Vassalotti wrote:
> On Fri, Apr 23, 2010 at 3:57 PM, Dan Gindikin wrote:
>> This wouldn't help our use case, your code needs the entire pickle stream to be in memory, which in our case would be about 475mb, this is on top of the 300mb+ data structu

Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution

2010-04-23 Thread Dan Gindikin
Alexandre Vassalotti writes:
> On Fri, Apr 23, 2010 at 3:57 PM, Dan Gindikin wrote:
> > This wouldn't help our use case, your code needs the entire pickle stream to be in memory, which in our case would be about 475mb, this is on top of the 300mb+ data struc

Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution

2010-04-23 Thread Alexandre Vassalotti
On Fri, Apr 23, 2010 at 3:57 PM, Dan Gindikin wrote:
> This wouldn't help our use case, your code needs the entire pickle stream to be in memory, which in our case would be about 475mb, this is on top of the 300mb+ data structures that generated the pickle stream.
In that case, the best w

Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution

2010-04-23 Thread Dan Gindikin
Collin Winter writes:
> I should add that, adding the necessary bookkeeping to remove only unused PUTs (instead of the current all-or-nothing scheme) should not be hard. I'd watch out for a further performance/memory hit; the pickling benchmarks in the benchmark suite should help

Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution

2010-04-23 Thread Alexandre Vassalotti
On Fri, Apr 23, 2010 at 3:07 PM, Collin Winter wrote:
> I should add that, adding the necessary bookkeeping to remove only unused PUTs (instead of the current all-or-nothing scheme) should not be hard. I'd watch out for a further performance/memory hit; the pickling benchmarks in the benchma

Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution

2010-04-23 Thread Collin Winter
On Fri, Apr 23, 2010 at 11:53 AM, Collin Winter wrote:
> On Fri, Apr 23, 2010 at 11:49 AM, Alexandre Vassalotti wrote:
>> On Fri, Apr 23, 2010 at 2:38 PM, Alexandre Vassalotti wrote:
>>> Collin Winter wrote a simple optimization pass for cPickle in Unladen Swallow [1]. The code reads th

Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution

2010-04-23 Thread Collin Winter
On Fri, Apr 23, 2010 at 11:49 AM, Alexandre Vassalotti wrote:
> On Fri, Apr 23, 2010 at 2:38 PM, Alexandre Vassalotti wrote:
>> Collin Winter wrote a simple optimization pass for cPickle in Unladen Swallow [1]. The code reads through the stream and remove all the unnecessary PUTs in-place

Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution

2010-04-23 Thread Dan Gindikin
Alexandre Vassalotti writes:
> Just put your code on bugs.python.org and I will take a look.
Thanks, I'll put it in there.

Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution

2010-04-23 Thread Alexandre Vassalotti
On Fri, Apr 23, 2010 at 2:38 PM, Alexandre Vassalotti wrote:
> Collin Winter wrote a simple optimization pass for cPickle in Unladen Swallow [1]. The code reads through the stream and remove all the unnecessary PUTs in-place.
I just noticed the code removes *all* PUT opcodes, regardless if

Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution

2010-04-23 Thread Alexandre Vassalotti
On Fri, Apr 23, 2010 at 2:11 PM, Dan Gindikin wrote:
> We were having performance problems unpickling a large pickle file, we were getting 170s running time (which was fine), but 1100mb memory usage. Memory usage ought to have been about 300mb, this was happening because of memory fragmentat

Re: [Python-Dev] Unpickling memory usage problem, and a proposed solution

2010-04-23 Thread Brett Cannon
On Fri, Apr 23, 2010 at 11:11, Dan Gindikin wrote:
> We were having performance problems unpickling a large pickle file, we were getting 170s running time (which was fine), but 1100mb memory usage. Memory usage ought to have been about 300mb, this was happening because of memory fragmentati

[Python-Dev] Unpickling memory usage problem, and a proposed solution

2010-04-23 Thread Dan Gindikin
We were having performance problems unpickling a large pickle file, we were getting 170s running time (which was fine), but 1100mb memory usage. Memory usage ought to have been about 300mb, this was happening because of memory fragmentation, due to many unnecessary "puts" in the pickle stream. We
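For readers unfamiliar with the opcodes under discussion, here is a minimal sketch of what an "unnecessary put" looks like. It is not the code proposed in the thread: it assumes Python 3 and uses the standard library's `pickletools.optimize`, which returns an equivalent pickle with unused PUT opcodes removed. The sample data and the helper `count_ops` are hypothetical.

```python
import pickle
import pickletools

shared = ["x"]
# Hypothetical data: only `shared` is referenced more than once.
data = [shared, shared, ["y"], "z"]

# Protocol 2 is chosen here because it uses explicit PUT/GET opcodes.
raw = pickle.dumps(data, protocol=2)
optimized = pickletools.optimize(raw)  # drops PUTs that no GET refers to


def count_ops(pickled, suffix):
    """Count opcodes whose name ends with suffix, e.g. 'PUT' or 'GET'."""
    return sum(
        1
        for op, arg, pos in pickletools.genops(pickled)
        if op.name.endswith(suffix)
    )


puts_before = count_ops(raw, "PUT")
puts_after = count_ops(optimized, "PUT")
gets_after = count_ops(optimized, "GET")
print(puts_before, puts_after, gets_after)

restored = pickle.loads(optimized)
print(restored == data, restored[0] is restored[1])
```

Every memoized object gets a PUT when it is first written, but only objects that are referenced again need one. Note that `pickletools.optimize` takes the whole pickle as an in-memory bytes object, so it does not by itself address a case where the stream is too large to hold in memory.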