Antoine Pitrou wrote:
> Michael Haggerty <mhagger <at> alum.mit.edu> writes:
>> It is easy to optimize the pickling of instances by giving them
>> __getstate__() and __setstate__() methods.  But the pickler still
>> records the type of each object (essentially, the name of its class) in
>> each record.  The space for these strings constituted a large fraction
>> of the database size.
>
> If these strings are not interned, then perhaps they should be.
> There is a similar optimization proposal (w/ patch) for attribute names:
> http://bugs.python.org/issue5084

If I understand correctly, this would not help:

- on writing, the strings are identical anyway, because they are read
  out of the class's __name__ and __module__ fields.  Therefore the
  Pickler's usual memoizing behavior will prevent the strings from
  being written more than once.

- on reading, the strings are only used to look up the class.
  Therefore they are garbage collected almost immediately.

This is a different situation than that of attribute names, which are
stored persistently as the keys in the instance's __dict__.

Michael
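
A minimal sketch of the writing side, under some assumptions: it uses
fractions.Fraction from the standard library as a stand-in for the
application's own classes, runs on a current CPython 3, and uses variable
names that are purely illustrative.  It counts how often the class name
appears when many instances go through one pickle.dumps() call, compared
with one call per record.

```python
import pickle
from fractions import Fraction

# Stand-in data: 100 instances of the same class.
values = [Fraction(1, n) for n in range(1, 101)]

# One Pickler for the whole list: the class reference is written once,
# and later instances refer back to it through the pickler's memo.
single = pickle.dumps(values)

# One pickle per record, as when each record is stored separately.
# Each call starts with an empty memo.
separate = [pickle.dumps(v) for v in values]

name = b"Fraction"
single_count = single.count(name)
separate_count = sum(rec.count(name) for rec in separate)

print("class name occurrences, one dump: ", single_count)
print("class name occurrences, per record:", separate_count)
```

Within one dump the memo already removes the duplicates, so interning
would have nothing left to save there.  The repetition only shows up
across records that are pickled independently.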