[issue12086] Tutorial doesn't discourage name mangling

2011-05-16 Thread Radomir Dopieralski

New submission from Radomir Dopieralski :

In the tutorial, at 
http://docs.python.org/tutorial/classes.html#private-variables you can read:


9.6. Private Variables
“Private” instance variables that cannot be accessed except from inside an 
object don’t exist in Python. However, there is a convention that is followed 
by most Python code: a name prefixed with an underscore (e.g. _spam) should be 
treated as a non-public part of the API (whether it is a function, a method or 
a data member). It should be considered an implementation detail and subject to 
change without notice.

Since there is a valid use-case for class-private members (namely to avoid name 
clashes of names with names defined by subclasses), there is limited support 
for such a mechanism, called name mangling. Any identifier of the form __spam 
(at least two leading underscores, at most one trailing underscore) is 
textually replaced with _classname__spam, where classname is the current class 
name with leading underscore(s) stripped. This mangling is done without regard 
to the syntactic position of the identifier, as long as it occurs within the 
definition of a class.

Note that the mangling rules are designed mostly to avoid accidents; it still 
is possible to access or modify a variable that is considered private. This can 
even be useful in special circumstances, such as in the debugger.

[...]


I think this section doesn't stress enough how special the "__foo" syntax 
is and how rarely it should be used. If I were a programmer coming from Java to 
Python, I would start using "__foo" everywhere after reading this. I actually 
receive code written like that from programmers new to Python, and they point 
to that section of the documentation when I ask why they did it.

At minimum, I'd add a paragraph that warns about how name mangling makes the 
code hard to reuse, difficult to test and unpleasant to debug.
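
To illustrate the trap for readers coming from Java-style "private" fields, here is a minimal sketch (the class and attribute names are made up): under inheritance, a double-underscore attribute silently becomes two separate attributes, one per class, instead of one overridable field.

```python
class Base:
    def __init__(self):
        self.__token = "base"   # stored as _Base__token

    def show(self):
        return self.__token     # always looks up _Base__token

class Child(Base):
    def __init__(self):
        super().__init__()
        self.__token = "child"  # stored as _Child__token, a separate attribute

b = Child()
# The two mangled names coexist instead of one overriding the other:
print(sorted(k for k in vars(b) if "token" in k))
# ['_Base__token', '_Child__token']
print(b.show())  # base -- Base.show still sees its own mangled copy
```

This is exactly why such code is hard to subclass and awkward to test: code outside the class must spell out the mangled name by hand.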

--
assignee: docs@python
components: Documentation
messages: 136072
nosy: docs@python, sheep
priority: normal
severity: normal
status: open
title: Tutorial doesn't discourage name mangling
type: feature request
versions: Python 2.7

___
Python tracker 
<http://bugs.python.org/issue12086>
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue12086] Tutorial doesn't discourage name mangling

2011-05-16 Thread Radomir Dopieralski

Radomir Dopieralski  added the comment:

"In the unlikely case that you specifically need to avoid name clashes with 
subclasses, there is limited support..." ;)

--




[issue12086] Tutorial doesn't discourage name mangling

2011-05-17 Thread Radomir Dopieralski

Radomir Dopieralski  added the comment:

I am reporting this specifically because within two weeks I had two independent 
cases of people who submitted code with almost all methods name-mangled, and 
who then pointed to that section of the tutorial as justification. I have a 
hard time convincing them that it is a bad idea, as I have to work against the 
official documentation here.

I agree that the language and library references should explain the mechanics 
behind the language in a neutral and empowering way. But I think the tutorial 
shouldn't tell people to write horrible code.

Perhaps it would suffice if the tutorial didn't call this "private methods"? A 
more descriptive and accurate section name, such as "name mangling" or 
"avoiding name clashes", could help a lot.

--




[issue19859] functools.lru_cache keeps objects alive forever

2013-12-02 Thread Radomir Dopieralski

New submission from Radomir Dopieralski:

Like most naïve "memoized" decorator implementations, lru_cache keeps references 
in the cache to all the argument values of the decorated function. This means 
that if we call such a decorated function with an object as a parameter, that 
object is kept alive in memory forever -- that is, until the program ends. This 
is an obvious waste: once we no longer hold any other reference to that object, 
we can never call the function with the same parameter again, so the entry just 
wastes cache space.
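
A minimal sketch of the problem (the Config class and function names are made up for illustration; requires Python 3.2+ for functools.lru_cache):

```python
import functools
import gc
import weakref

class Config:
    """Stand-in for any short-lived argument object."""

@functools.lru_cache(maxsize=None)
def expensive(config):
    return 42  # placeholder for real work

cfg = Config()
alive = weakref.ref(cfg)
expensive(cfg)

del cfg        # drop our last reference...
gc.collect()
print(alive() is not None)  # True -- the cache key still pins the object
```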

This is a very common case when we decorate a method -- the first parameter is 
"self". One solution for this particular case is a dedicated "memoized_method" 
decorator that stores the cache on the "self" object itself, so that the cache 
can be released together with the object.
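
Such a decorator might look like this minimal sketch (memoized_method, the _memoized_cache attribute, and the Service class are all made-up names for illustration, not an existing API):

```python
import functools

def memoized_method(method):
    """Cache results per instance, keyed by method name and positional
    arguments.  The cache dict lives on ``self``, so it is garbage-collected
    together with the object."""
    @functools.wraps(method)
    def wrapper(self, *args):
        cache = self.__dict__.setdefault("_memoized_cache", {})
        key = (method.__name__, args)
        if key not in cache:
            cache[key] = method(self, *args)
        return cache[key]
    return wrapper

class Service:
    def __init__(self):
        self.calls = 0

    @memoized_method
    def compute(self, x):
        self.calls += 1
        return x * 2

s = Service()
print(s.compute(3), s.compute(3), s.calls)  # 6 6 1
```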

A more general solution uses weakrefs in the cache key where possible, perhaps 
with an additional callback that removes the cache entry when any of its 
parameters dies. Obviously this adds some overhead and makes the caching 
decorator even slower, but it can save a lot of memory, especially in 
long-running applications.
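
One possible shape for that idea, as a rough sketch (this is not the patch linked below; weak_memoize, the identity-based key, and the demo names are my own simplifications, and it handles only positional arguments):

```python
import gc
import weakref

def weak_memoize(func):
    """Cache keys are based on argument identity; each entry holds only weak
    references to the arguments, and a callback evicts the entry as soon as
    any weakly referenced argument is garbage-collected."""
    cache = {}

    def wrapper(*args):
        key = tuple(id(a) for a in args)  # identity-based key (a simplification)
        if key not in cache:
            result = func(*args)
            refs = []
            for arg in args:
                try:
                    # Evict the whole entry once this argument dies.
                    refs.append(weakref.ref(arg, lambda _r, key=key: cache.pop(key, None)))
                except TypeError:
                    pass  # not weakly referenceable (ints, strings, ...)
            cache[key] = (result, refs)
        return cache[key][0]

    return wrapper

class Request:
    """Stand-in for a short-lived object passed as an argument."""

calls = {"n": 0}

@weak_memoize
def fetch_data(request):
    calls["n"] += 1
    return "data"

req = Request()
fetch_data(req)
fetch_data(req)
print(calls["n"])  # 1 -- the second call was served from the cache

gone = weakref.ref(req)
del req
gc.collect()
print(gone() is None)  # True -- the cache no longer keeps the request alive
```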

To better illustrate what I mean, here is an example of such an improved 
@memoized decorator that I wrote: 
https://review.openstack.org/#/c/54117/5/horizon/utils/memoized.py

It would be great to have an option to do something similar with lru_cache, and 
if there is interest, I would like to work on that.

--
components: Library (Lib)
messages: 204995
nosy: thesheep
priority: normal
severity: normal
status: open
title: functools.lru_cache keeps objects alive forever
type: resource usage
versions: Python 2.7, Python 3.1, Python 3.2, Python 3.3, Python 3.4, Python 3.5

___
Python tracker 
<http://bugs.python.org/issue19859>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue19859] functools.lru_cache keeps objects alive forever

2013-12-03 Thread Radomir Dopieralski

Radomir Dopieralski added the comment:

I prepared a proof of concept solution at:

https://bitbucket.org/thesheep/cpython-lru_cache-weakref/commits/66c1c9f3256785552224ca177ed77a8312de6bb8

--
hgrepos: +215




[issue19859] functools.lru_cache keeps objects alive forever

2013-12-03 Thread Radomir Dopieralski

Radomir Dopieralski added the comment:

The method example is just the most common case where this problem can be 
easily seen, but not the only one. We indeed use the @cached_property decorator 
on properties (similar to 
https://github.com/mitsuhiko/werkzeug/blob/master/werkzeug/utils.py#L35), and 
I've actually written a @memoized_method decorator for methods that does pretty 
much what Serhiy suggests (except it's not repeated all over the place, a la 
copy-pasta, but kept in one decorator). That is fine, and @cached_property is 
actually superior, as it avoids a lookup and a check once the value has been 
calculated.

However, this still doesn't solve the problems encountered in actual code, like 
here: 
https://github.com/openstack/horizon/blob/master/openstack_dashboard/api/neutron.py#L735

Here we have a normal function, not a method, that calls a remote API over HTTP 
(the call is relatively slow, so we definitely want to cache it across multiple 
invocations). The function takes a ``request`` parameter because it needs it 
for authentication with the remote service. The problem we had is that this 
keeps every single request in memory, because it's referenced by the cache.

Somehow it feels wrong to store the cache on an arbitrary attribute of the 
function, like the request in this case, and it's easy to imagine a function 
that takes two such critical arguments.

This is the code that actually made me write the weakref version of the 
@memoized decorator that I linked initially, and I thought that it could also 
be useful to have that in Python's caching decorator as an option.

I can understand if you think this is too much, and that in such tricky 
situations the programmer should either write their own caching or rewrite the 
code to avoid a memory leak. But I am sure that most programmers will not even 
think about this caveat. I think we should at least add a note to lru_cache's 
documentation warning about this scenario and advising them to write their own 
caching decorators.

--




[issue19859] functools.lru_cache keeps objects alive forever

2013-12-03 Thread Radomir Dopieralski

Radomir Dopieralski added the comment:

Actually, after looking closer, my @memoize_method decorator does something 
completely different from what Serhiy suggested. Still, it only solves the 
common case of methods, and does nothing if you pass short-lived objects as 
parameters other than self.

Limiting the cache size is also not a solution in the practical example with 
the request that I linked to in the previous comment: we can't know in advance 
how many times per request the function is going to be called, and picking an 
arbitrary number feels wrong and may lead to unexpected behavior when other 
parts of the code change (like a sudden slowdown every N calls).

--




[issue19859] functools.lru_cache keeps objects alive forever

2013-12-04 Thread Radomir Dopieralski

Radomir Dopieralski added the comment:

Thank you for your attention. I'm actually quite happy with the solution we 
have; it works well. That's actually why I thought it might be worthwhile to 
try to push it upstream to Python. I can totally understand why you don't want 
to add too much to the standard library; after all, everything added there has 
to stay forever. So please consider this patch abandoned.

But I think it would still be worthwhile to add a note to lru_cache's 
documentation, saying something like:

"""
Warning! lru_cache keeps references to all the arguments for which it holds 
cached values, which prevents those arguments from being freed from memory 
while the cache entry exists. This can lead to memory leaks when you call a 
function decorated with lru_cache on a lot of short-lived objects.
"""

I suppose you can come up with a nicer phrasing.

--




[issue19859] functools.lru_cache keeps objects alive forever

2013-12-04 Thread Radomir Dopieralski

Radomir Dopieralski added the comment:

> Umm, that's part of the operational definition of a value based cache
> - it needs to keep things alive, so that if a different instance shows
> up with the same value, it will still get a cache hit.

If it only kept the return values alive, that wouldn't be a problem; it's 
intuitively obvious that it has to do that in order to work. But what many 
people fail to notice is that it also keeps alive any arguments that were 
passed to the function. This is not intuitive and, as I demonstrated with my 
patch, not even necessary, so I think it might be worthwhile to at least 
mention this little implementation quirk.

--
