[issue12086] Tutorial doesn't discourage name mangling
New submission from Radomir Dopieralski:

In the tutorial, at http://docs.python.org/tutorial/classes.html#private-variables you can read:

9.6. Private Variables

“Private” instance variables that cannot be accessed except from inside an object don’t exist in Python. However, there is a convention that is followed by most Python code: a name prefixed with an underscore (e.g. _spam) should be treated as a non-public part of the API (whether it is a function, a method or a data member). It should be considered an implementation detail and subject to change without notice. Since there is a valid use-case for class-private members (namely to avoid name clashes of names with names defined by subclasses), there is limited support for such a mechanism, called name mangling. Any identifier of the form __spam (at least two leading underscores, at most one trailing underscore) is textually replaced with _classname__spam, where classname is the current class name with leading underscore(s) stripped. This mangling is done without regard to the syntactic position of the identifier, as long as it occurs within the definition of a class. Note that the mangling rules are designed mostly to avoid accidents; it still is possible to access or modify a variable that is considered private. This can even be useful in special circumstances, such as in the debugger. [...]

I think that this section doesn't stress enough how special the "__foo" syntax is and how rarely it should be used. If I were a programmer coming from Java to Python, I would start using "__foo" everywhere after reading this. I actually receive code written like that from programmers new to Python, and they point to that section of the documentation when I ask why they did it.

At minimum, I'd add a paragraph warning that name mangling makes code hard to reuse, difficult to test and unpleasant to debug.
--
assignee: docs@python
components: Documentation
messages: 136072
nosy: docs@python, sheep
priority: normal
severity: normal
status: open
title: Tutorial doesn't discourage name mangling
type: feature request
versions: Python 2.7

___ Python tracker <http://bugs.python.org/issue12086> ___
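The mangling rule quoted above can be seen in a short example (my own illustration; the class names are hypothetical):

```python
class Mapping:
    def __init__(self):
        self.__items = []          # stored as _Mapping__items

class SubMapping(Mapping):
    def __init__(self):
        super().__init__()
        self.__items = {}          # stored as _SubMapping__items, no clash

m = SubMapping()
# Mangling only avoids accidents -- both names are still reachable:
print(m._Mapping__items)           # []
print(m._SubMapping__items)        # {}
```

Note that outside a class body no mangling happens at all, so `m.__items` simply raises AttributeError -- exactly the kind of surprise that trips up programmers who use `__foo` as a general "private" marker.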
Radomir Dopieralski added the comment:

"In the unlikely case that you specifically need to avoid name clashes with subclasses, there is limited support..." ;)
Radomir Dopieralski added the comment:

I am reporting this specifically because I just had two independent cases (within two weeks) of people who submitted code with almost all methods name-mangled, and who then pointed to that section of the tutorial as justification. I have a hard time convincing them that it is a bad idea, since I have to argue against the official documentation.

I agree that the language and library references should explain the mechanics of the language in a neutral and empowering way. But the tutorial shouldn't tell people to write horrible code. Perhaps it would suffice if the tutorial didn't call these "private methods"? A more descriptive and accurate section name, such as "name mangling" or "avoiding name clashes", could help a lot.
[issue19859] functools.lru_cache keeps objects alive forever
New submission from Radomir Dopieralski:

Like most naïve "memoized" decorator implementations, lru_cache keeps references to all the argument values of the decorated function in its cache. That means that if we call such a decorated function with an object as a parameter, the object will be kept alive in memory forever -- that is, until the program ends. This is an obvious waste: once we no longer hold any other reference to that object, we can never call the function with the same parameter again, so the entry just takes up cache space.

This is a very common case when we decorate a method -- the first parameter is "self". One solution for this particular case is a dedicated "memoized_method" decorator that stores the cache on the "self" object itself, so that it is released together with the object. A more general solution uses weakrefs in the cache key where possible, perhaps with an additional callback that removes the cache entry when any of its parameters is dead. Obviously this adds overhead and makes the caching decorator even slower, but it can save a lot of memory, especially in long-running applications.

To better illustrate what I mean, here is an example of such an improved @memoized decorator that I wrote: https://review.openstack.org/#/c/54117/5/horizon/utils/memoized.py

It would be great to have an option to do something similar with lru_cache, and if there is interest, I would like to work on that.

--
components: Library (Lib)
messages: 204995
nosy: thesheep
priority: normal
severity: normal
status: open
title: functools.lru_cache keeps objects alive forever
type: resource usage
versions: Python 2.7, Python 3.1, Python 3.2, Python 3.3, Python 3.4, Python 3.5
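The weakref approach described above can be sketched roughly like this (my own minimal illustration, not the linked OpenStack implementation; the `memoized` name, the `cache` attribute, and the positional-arguments-only restriction are all simplifications for brevity):

```python
import functools
import weakref

def memoized(func):
    """Cache results, holding weak references to arguments where possible,
    so a cache entry is dropped once one of its arguments is collected."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        def cleanup(dead_ref, _cache=cache):
            # Drop every cache entry whose key contains the dead weakref.
            # Dead weakrefs compare by identity, so only the stored ref matches.
            for key in [k for k in _cache if dead_ref in k]:
                del _cache[key]

        key = []
        for arg in args:
            try:
                # Weakly referenceable arguments don't keep the entry alive.
                key.append(weakref.ref(arg, cleanup))
            except TypeError:
                # Ints, strings, tuples, etc. cannot be weakly referenced;
                # fall back to a strong reference for those.
                key.append(arg)
        key = tuple(key)
        if key not in cache:
            cache[key] = func(*args)
        return cache[key]

    wrapper.cache = cache  # exposed for inspection in this sketch
    return wrapper
```

Live weakrefs hash and compare like their referents, so repeated calls with the same object hit the cache; when the object dies, the callback evicts the stale entry instead of leaking it.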
Radomir Dopieralski added the comment:

I prepared a proof of concept solution at: https://bitbucket.org/thesheep/cpython-lru_cache-weakref/commits/66c1c9f3256785552224ca177ed77a8312de6bb8

--
hgrepos: +215
Radomir Dopieralski added the comment:

The method example is just the most common case where this problem is easily seen, but not the only one. We do use the @cached_property decorator on properties (similar to https://github.com/mitsuhiko/werkzeug/blob/master/werkzeug/utils.py#L35), and I've actually written a @memoized_method decorator for methods that does pretty much what Serhiy suggests (except it's not repeated all over the place, a la copy-pasta, but kept in one decorator). That is fine, and @cached_property is actually superior, as it avoids a lookup and a check once the value has been calculated.

However, this still doesn't solve the problems encountered in practice in actual code, like here: https://github.com/openstack/horizon/blob/master/openstack_dashboard/api/neutron.py#L735

Here we have a normal function, not a method, that calls a remote API over HTTP (the call is relatively slow, so we definitely want to cache it across multiple invocations). The function takes a ``request`` parameter, because it needs it for authentication with the remote service. The problem we had is that this keeps every single request in memory, because it's referenced by the cache. Somehow it feels wrong to store the cache on an arbitrary argument of the function, like the request in this case, and it's easy to imagine a function that takes two such critical arguments.

This is the code that actually made me write the weakref version of the @memoized decorator that I linked initially, and I thought it could also be useful to have that as an option in Python's caching decorator.

I can understand if you think that this is too much, and that in such tricky situations the programmer should either write their own caching or rewrite the code to avoid a memory leak. But I am sure that most programmers will not even think about this caveat. I think we should at least add a note to lru_cache's documentation warning about this scenario and advising programmers to write their own caching decorators.
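For reference, the store-the-cache-on-self approach mentioned above can be sketched like this (a hypothetical `memoized_method` of my own, assuming hashable positional arguments; this is not the actual horizon code):

```python
import functools

def memoized_method(method):
    """Keep the cache in a per-instance attribute, so it is freed
    together with the instance instead of outliving it."""
    @functools.wraps(method)
    def wrapper(self, *args):
        try:
            cache = self._memoized_cache
        except AttributeError:
            # First memoized call on this instance: create its cache.
            cache = self._memoized_cache = {}
        # Key by method name so several decorated methods can share the dict.
        key = (method.__name__,) + args
        if key not in cache:
            cache[key] = method(self, *args)
        return cache[key]
    return wrapper
```

Because the cache lives on `self`, dropping the last reference to the instance drops all of its cached values too -- but, as noted, this helps only when the short-lived object happens to be `self`.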
Radomir Dopieralski added the comment:

Actually, after looking closer, my @memoized_method decorator does something completely different from what Serhiy suggested. Still, it only solves the common case of methods, and does nothing if you pass your short-lived objects as parameters other than self.

Limiting the cache size is also not a solution in the practical example with the request that I linked to in the previous comment, because we can't know in advance how many times per request the function is going to be called; picking an arbitrary number feels wrong and may lead to unexpected behavior when other fragments of the code change (like a sudden slowdown every N calls).
Radomir Dopieralski added the comment:

Thank you for your attention. I'm actually quite happy with the solution we have; it works well. That's actually why I thought it might be worthwhile to try to push it upstream to Python. I can totally understand why you don't want to add too much to the standard library -- after all, everything you add there has to stay forever. So please consider this patch abandoned.

But I think it would still be worthwhile to add a note to lru_cache's documentation, saying something like:

"""
Warning! lru_cache keeps references to all the arguments for which it caches values, which prevents them from being freed from memory while the cache holds them. This can lead to memory leaks when you call a function decorated with lru_cache on many short-lived objects.
"""

I suppose you can come up with a nicer phrasing.
Radomir Dopieralski added the comment:

> Umm, that's part of the operational definition of a value based cache
> - it needs to keep things alive, so that if a different instance shows
> up with the same value, it will still get a cache hit.

If it only kept the return value alive, that wouldn't be a problem; it's intuitively obvious that it has to do that in order to work. But what many people fail to notice is that it also keeps alive any arguments that were passed to the function. This is not intuitive, and as I demonstrated with my patch, not even necessary, so I think it might be worthwhile to at least mention this little implementation quirk.
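The quirk is easy to demonstrate (Python 3, relying on CPython's prompt reference-counted collection; the `Request` class is just a stand-in for any short-lived argument):

```python
import functools
import gc
import weakref

class Request:
    """Stand-in for a short-lived object passed as an argument."""

@functools.lru_cache(maxsize=None)
def handle(request):
    # The return value is cached per request object.
    return 1

r = Request()
ref = weakref.ref(r)
handle(r)

del r
gc.collect()
# The cache key still references the Request, so it is not freed:
print(ref() is not None)  # True

handle.cache_clear()
gc.collect()
print(ref() is None)  # True
```

Only clearing the cache (or evicting the entry) lets the argument die, even though nothing else in the program can ever produce a cache hit for it again.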