Re: [Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Alexandre Vassalotti
On Tue, Jan 27, 2009 at 5:16 PM, Jake McGuire wrote: > Another vaguely related change would be to store string and unicode objects > in the pickler memo keyed as themselves rather than their object ids. That wouldn't be difficult to do--i.e., simply add a type check in Pickler.memoize and another
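
The idea of keying the memo by value for strings can be sketched roughly as follows. This is a toy illustration only, not the actual Pickler.memoize code: the ToyMemo class and its _key helper are hypothetical names, and Python 3's str and bytes stand in for the string and unicode types discussed in the thread.

```python
class ToyMemo:
    """Toy memo table: strings are keyed by value, everything else by id()."""

    def __init__(self):
        # Maps memo key -> (memo index, object). Holding a reference to the
        # object keeps its id() from being reused while the memo is alive.
        self._memo = {}

    def _key(self, obj):
        # The "type check" step: immutable string types are keyed as themselves.
        if type(obj) in (str, bytes):
            return (type(obj), obj)
        return id(obj)

    def memoize(self, obj):
        """Return the memo index for obj, assigning a new one if needed."""
        key = self._key(obj)
        if key not in self._memo:
            self._memo[key] = (len(self._memo), obj)
        return self._memo[key][0]
```

With this keying, two equal strings that are distinct objects share one memo slot, while two equal lists still get separate slots because mutable objects must stay keyed by identity.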

Re: [Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Jake McGuire
On Jan 27, 2009, at 12:39 PM, Martin v. Löwis wrote: >> I may have misunderstood how unpickling works > Perhaps I have misunderstood your patch. Posting it to Rietveld might also be useful. It is not immediately clear to me how Rietveld works. But I have created an issue on tracker: http://bu

Re: [Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Martin v. Löwis
> I may have misunderstood how unpickling works Perhaps I have misunderstood your patch. Posting it to Rietveld might also be useful. Regards, Martin

Re: [Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Jake McGuire
On Jan 27, 2009, at 11:40 AM, Martin v. Löwis wrote: >> Hm. This would change the pickling format though. Wouldn't just interning (short) strings on unpickling be simpler? > Sure - that's what Jake had proposed. However, it is always difficult to select which strings to intern - his heuristics (IIUC

Re: [Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Martin v. Löwis
> Just set a size limit, e.g. 30 or 100. It's just a heuristic. I > believe somewhere in Python itself I intern string literals if they > are reasonably short and fit the pattern of an identifier; I'd worry > that the pattern matching would slow down unpickling more than the > expected benefit thou
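
A size-limit heuristic of the kind suggested above might look like the sketch below. The maybe_intern name and the default of 30 are assumptions chosen for illustration (30 is one of the example limits mentioned in the quote); this is not code from any proposed patch.

```python
import sys

# Hypothetical cutoff; the thread mentions 30 or 100 as possible values.
MAX_INTERN_LEN = 30


def maybe_intern(value, max_len=MAX_INTERN_LEN):
    """Intern value if it is a short str; otherwise return it unchanged."""
    # Exact type check: sys.intern() rejects str subclasses.
    if type(value) is str and len(value) <= max_len:
        return sys.intern(value)
    return value
```

The check is a single length comparison, so it avoids the identifier pattern matching that the quoted message worries could slow down unpickling.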

Re: [Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Guido van Rossum
On Tue, Jan 27, 2009 at 11:40 AM, "Martin v. Löwis" wrote: >> Hm. This would change the pickling format though. Wouldn't just >> interning (short) strings on unpickling be simpler? > > Sure - that's what Jake had proposed. However, it is always difficult > to select which strings to intern - his h

Re: [Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Martin v. Löwis
> Hm. This would change the pickling format though. Wouldn't just > interning (short) strings on unpickling be simpler? Sure - that's what Jake had proposed. However, it is always difficult to select which strings to intern - his heuristics (IIUC) is to intern all strings that appear as dictionary

Re: [Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Guido van Rossum
On Tue, Jan 27, 2009 at 10:43 AM, "Martin v. Löwis" wrote: >> Interning the strings on unpickling makes the pickles smaller, and at >> least for cPickle actually makes unpickling sequences of many objects >> slightly faster. I have included proposed patches to cPickle.c and >> pickle.py, and woul

Re: [Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Martin v. Löwis
> Interning the strings on unpickling makes the pickles smaller, and at > least for cPickle actually makes unpickling sequences of many objects > slightly faster. I have included proposed patches to cPickle.c and > pickle.py, and would appreciate any feedback. Please submit patches always to the

Re: [Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Guido van Rossum
On Tue, Jan 27, 2009 at 6:23 AM, Jesse Noller wrote: > On Tue, Jan 27, 2009 at 4:49 AM, Jake McGuire wrote: >> Instance attribute names are normally interned - this is done in >> PyObject_SetAttr (among other places). Unpickling (in pickle and cPickle) >> directly updates __dict__ on the instanc

Re: [Python-Dev] undesireable unpickle behavior, proposed fix

2009-01-27 Thread Jesse Noller
On Tue, Jan 27, 2009 at 4:49 AM, Jake McGuire wrote: > Instance attribute names are normally interned - this is done in > PyObject_SetAttr (among other places). Unpickling (in pickle and cPickle) > directly updates __dict__ on the instance object. This bypasses the > interning so you end up with
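
The effect described here, and the interning fix under discussion, can be sketched in Python 3 terms as follows. The intern_state_keys helper is a hypothetical name used for illustration; it is not the patch itself, only a minimal sketch of interning attribute-name keys before they are stored in an instance __dict__.

```python
import sys


def intern_state_keys(state):
    """Return a copy of a state dict whose str keys are interned.

    Without this step, every unpickled instance could carry its own copy
    of each attribute-name string.
    """
    result = {}
    for key, value in state.items():
        # Exact type check: sys.intern() only accepts real str objects.
        if type(key) is str:
            key = sys.intern(key)
        result[key] = value
    return result


class Point:
    pass


def restore(obj, state):
    """Update obj.__dict__ directly, as unpickling does, with interned keys."""
    obj.__dict__.update(intern_state_keys(state))
    return obj
```

After restoring many instances this way, all of them share a single string object per attribute name instead of holding one copy each.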