I see now that my previous reply went only to Stefan, so I'm re-submitting,
this time to the list.

> -----Original Message-----
> From: Stefan Behnel
> Sent: Saturday, 04 September, 2010 04:29
>
> What about adding an intermediate namespace called "cache", so that 
> the new operations are available like this:
> 
>      print get_phone_number.cache.hits
>      get_phone_number.cache.clear()

I agree. While the function-based implementation is highly efficient,
implementing the cache purely with closures has the counter-Pythonic effect of
hiding the internal state (much as the 'private' keyword does in Java). A
class-based implementation can have its state introspected and can easily be
extended. The functional implementation, powerful as it is, fails to
generalize well. IMHO, a stdlib implementation should err on the side of
transparency and extensibility over raw performance.

That said, I've adapted Hettinger's Python 2.5 recipe into a class-based
implementation. I've tried to keep the performance optimizations in place, but
instead of instrumenting the wrapped function with lots of cache_* functions,
I simply attach the cache object itself, which then provides the interface
Stefan suggested. This technique gives access to the cache object and all of
its internal state, so it's also possible to do things like:

    get_phone_number.cache.maxsize += 100

or

    if get_phone_number.cache.store:
        do_something_interesting()

These techniques are nearly impossible in the functional implementation, as
the state is buried in the locals() of the nested functions.
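To make that concrete, here is a minimal sketch of a hypothetical closure-based memoizer (illustrative names only, not Raymond's actual code). The only way to reach its cache from outside is to dig through the wrapper's closure cells, which is exactly the opacity I'm objecting to (the `__code__`/`__closure__` attribute names below are the modern spellings; Python 2.5 called them `func_code`/`func_closure`):

```python
import functools

def memoize(user_function):
    # all cache state lives in this enclosing scope; nothing exposes it
    store = {}
    @functools.wraps(user_function)
    def wrapper(*args):
        if args not in store:
            store[args] = user_function(*args)
        return store[args]
    return wrapper

@memoize
def square(x):
    return x * x

square(3)
square(3)

# The only way in from outside is closure introspection:
free = square.__code__.co_freevars             # names wrapper closes over
cell = square.__closure__[free.index('store')]
print(cell.cell_contents)                      # {(3,): 9}
```

Compare that last stanza with `get_phone_number.cache.store` above: attaching the cache object makes the same state a plain attribute access.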

I'm most grateful to Raymond for contributing this to Python. On many
occasions I've used the ActiveState recipes for simple caches, but in almost
every case I've had to adapt the implementation to provide more transparency.
I'd prefer not to have to do the same with the stdlib.

Regards,
Jason R. Coombs

# modified from
# http://code.activestate.com/recipes/498245-lru-and-lfu-cache-decorators/

import collections
import functools
from itertools import ifilterfalse
from heapq import nsmallest
from operator import itemgetter

class Counter(dict):
        'Mapping where default values are zero'
        def __missing__(self, key):
                return 0

class LRUCache(object):
        '''
        Least-recently-used cache decorator.

        Arguments to the cached function must be hashable.
        Cache performance statistics stored in .hits and .misses.
        Clear the cache with .clear().
        Cache object can be accessed as f.cache, where f is the decorated
         function.
        http://en.wikipedia.org/wiki/Cache_algorithms#Least_Recently_Used

        '''
        
        def __init__(self, maxsize=100):
                self.maxsize = maxsize
                self.maxqueue = maxsize*10
                self.store = dict()              # mapping of args to results
                self.queue = collections.deque() # order that keys have been used
                self.refcount = Counter()        # times each key is in the queue

                self.hits = self.misses = 0

        def decorate(self, user_function,
                        len=len, iter=iter, tuple=tuple, sorted=sorted,
                        KeyError=KeyError):

                sentinel = object()     # marker for looping around the queue
                kwd_mark = object()     # separate positional and keyword args

                # lookup optimizations (ugly but fast)
                store = self.store
                queue = self.queue
                refcount = self.refcount
                queue_append, queue_popleft = queue.append, queue.popleft
                queue_appendleft, queue_pop = queue.appendleft, queue.pop

                @functools.wraps(user_function)
                def wrapper(*args, **kwds):
                        # cache key records both positional and keyword args
                        key = args
                        if kwds:
                                key += (kwd_mark,) + tuple(sorted(kwds.items()))

                        # record recent use of this key
                        queue_append(key)
                        refcount[key] += 1

                        # get cache entry or compute if not found
                        try:
                                result = store[key]
                                self.hits += 1
                        except KeyError:
                                result = user_function(*args, **kwds)
                                store[key] = result
                                self.misses += 1

                                # purge least recently used cache entry
                                if len(store) > self.maxsize:
                                        key = queue_popleft()
                                        refcount[key] -= 1
                                        while refcount[key]:
                                                key = queue_popleft()
                                                refcount[key] -= 1
                                        del store[key], refcount[key]

                        # periodically compact the queue by eliminating duplicate
                        # keys while preserving the order of most recent access
                        if len(queue) > self.maxqueue:
                                refcount.clear()
                                queue_appendleft(sentinel)
                                for key in ifilterfalse(refcount.__contains__,
                                                        iter(queue_pop, sentinel)):
                                        queue_appendleft(key)
                                        refcount[key] = 1

                        return result
                wrapper.cache = self
                return wrapper

        def clear(self):
                self.store.clear()
                self.queue.clear()
                self.refcount.clear()
                self.hits = self.misses = 0

        __call__ = decorate

class LFUCache(object):
        '''Least-frequently-used cache decorator.

        Arguments to the cached function must be hashable.
        Cache performance statistics stored in .hits and .misses.
        Clear the cache with .clear().
        Cache object can be accessed as f.cache, where f is the decorated
         function.
        http://en.wikipedia.org/wiki/Least_Frequently_Used

        '''
        
        def __init__(self, maxsize=100):
                self.maxsize = maxsize
                self.store = dict()        # mapping of args to results
                self.use_count = Counter() # times each key has been accessed
                self.hits = self.misses = 0

        def decorate(self, user_function):
                kwd_mark = object()     # separate positional and keyword args

                # lookup optimizations (ugly but fast)
                store = self.store
                use_count = self.use_count

                @functools.wraps(user_function)
                def wrapper(*args, **kwds):

                        key = args
                        if kwds:
                                key += (kwd_mark,) + tuple(sorted(kwds.items()))
                        use_count[key] += 1

                        # get cache entry or compute if not found
                        try:
                                result = store[key]
                                self.hits += 1
                        except KeyError:
                                result = user_function(*args, **kwds)
                                store[key] = result
                                self.misses += 1

                                # purge least frequently used cache entry
                                if len(store) > self.maxsize:
                                        for key, _ in nsmallest(self.maxsize // 10,
                                                        use_count.iteritems(),
                                                        key=itemgetter(1)):
                                                del store[key], use_count[key]

                        return result
                wrapper.cache = self
                return wrapper

        __call__ = decorate

        def clear(self):
                self.store.clear()
                self.use_count.clear()
                self.hits = self.misses = 0


if __name__ == '__main__':

        @LRUCache(maxsize=20)
        def f(x, y):
                return 3*x+y

        domain = range(5)
        from random import choice
        for i in range(1000):
                r = f(choice(domain), choice(domain))

        print(f.cache.hits, f.cache.misses)

        @LFUCache(maxsize=20)
        def f(x, y):
                return 3*x+y

        for i in range(1000):
                r = f(choice(domain), choice(domain))

        print(f.cache.hits, f.cache.misses)


_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev