On Thu, Oct 18, 2012 at 1:16 AM, Dave Angel <d...@davea.name> wrote:
>
> may decide to reuse existing objects.  This includes, ints, floats,
> strings, byte strings, tuples, etc.  In the particular case of CPython,
> small integers are cached in this way, and so are short strings with no
> whitespace.  How small, and exactly which strings is irrelevant to me,
> and hopefully to you.  The point is you cannot be sure whether equal
> immutable objects are really just a single one, or not.

In case anyone is curious about implementation trivia, CPython (2.7.3
and 3.2.3) caches integers in the closed range -5 to 256. In 2.x the
cache covers only int; it doesn't apply to the long type.

Strings for attributes, and various other cases, are interned in a
dict. Code objects intern a limited set of string constants
(alphanumeric/underscore ASCII) and also all referenced names
(co_names, co_varnames, co_freevars, and co_cellvars). The built-in
function intern() (moved to sys.intern in 3.x) manually interns a
string, e.g. for a small efficiency gain with dict lookups:

http://docs.python.org/py3k/library/sys#sys.intern
http://docs.python.org/library/functions.html#intern

To give a feel for how much string interning is used, here's the
summary after importing ctypes in a fresh interpreter and releasing the
interned strings:

3.2.3:

    >>> import ctypes
    >>> ctypes.pythonapi._Py_ReleaseInternedUnicodeStrings()
    releasing 3996 interned strings
    total size of all interned strings: 34193/0 mortal/immortal

2.7.3:

    >>> import ctypes
    >>> ctypes.pythonapi._Py_ReleaseInternedStrings()
    releasing 2875 interned strings
    total size of all interned strings: 26389/0 mortal/immortal
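
If you want to poke at this yourself, here is a rough sketch that assumes Python 3 (so sys.intern rather than the 2.x built-in intern). The helpers build_int and build_str are made-up names for illustration; they create the values at run time so the compiler can't merge equal constants for us. The identity results for the ints are CPython implementation details and may differ on other interpreters or versions, so the script prints them rather than promising an outcome.

```python
import sys


def build_int(digits):
    # Hypothetical helper: parse at run time so equal constants
    # are not merged by the compiler.
    return int(digits)


def build_str(parts):
    # Hypothetical helper: join at run time; the result is a new
    # string object that has not been explicitly interned.
    return "_".join(parts)


# 100 lies inside the -5..256 range mentioned above.
small_a, small_b = build_int("100"), build_int("100")

# A value far outside that range.
large_a = build_int("100000000000000000000")
large_b = build_int("100000000000000000000")

str_a = build_str(["tutor", "list"])
str_b = build_str(["tutor", "list"])

# sys.intern returns the canonical copy kept in the intern table,
# so equal strings map to one shared object.
interned_a = sys.intern(str_a)
interned_b = sys.intern(str_b)

print("small ints  equal:", small_a == small_b,
      " same object:", small_a is small_b)
print("large ints  equal:", large_a == large_b,
      " same object:", large_a is large_b)
print("raw strs    equal:", str_a == str_b,
      " same object:", str_a is str_b)
print("interned    equal:", interned_a == interned_b,
      " same object:", interned_a is interned_b)
```

Equality holds in every case; only the "same object" column depends on what the interpreter chooses to reuse. That is Dave's point: use == to compare values and keep "is" for identity checks such as "is None".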