> Emphasis on "might". Unless you have timed the code with or without a > lookup > table, you're just guessing whether it is an optimization or a > pessimization. >
See my earlier reply to Oscar's mail. I used cProfile and memoizing was almost twice as fast in the fastsest implementation. > On the other hand, my intuition is that it *will* benefit from memoization, > so my guess is the same as your guess :) > > >> On the other hand, it might complicate the code too much, so it might >> be better to Keep It Simple (KIS). Is the code below a sound approach? > > > No. You should separate the cache code from the non-cache code. Decorators > are ideal for that, but even without decorators you should split the code. > See below. Yes, it feels a lot less mentally straining to read the code where memoizing and the actual job are separated. > Also, you should have some way to stop the lookup table from growing forever. > If you are running Python 3.3, you can use functools.lru_cache, which > implements a Least Recently Used cache. Once the cache reaches a certain size, > the element which was least recently used is flushed. > I read about that. Time to upgrade! Wish I could use this in the office! <snip> > class Test(object): > > def __init__(self): > self._cache = {} > > def clear_cache(self): > self._cache.clear() > > def spss2strDate(self, gregorianDate, fmt="%Y-%m-%d", > recodeSysmisTo=""): > # cached wrapper > t = (gregorianDate, fmt, recodeSysmisTo) > if t in self._cache: > # cache hit > return self._cache[t] > # otherwise cache miss > result = self._calculate_spss2strDate(*t) # this does the real work > if len(self._cache) > 1000: > self._cache.popitem() # discard an arbitrary (key,value) pair > self._cache[t] = result > return result > > def _calculate_spss2strDate(self, gregorianDate, fmt, recodeSysmisTo): > # uncached version > return some_calculation(...) > > > You can fill in the details for the uncached version :) > > One good improvement would be to add instrumentation[1] to the cache so you > can > tell whether or not the cache is worthwhile. E.g. record how many times each > set of arguments are used. If you find that hardly any triple of arguments is > used multiple times (i.e. nearly every call is unique), then the cache is a > waste of time. It would be cool if the memoizer "turned off" itself off if hardly items are retrieved from it (ie, the calls are almost always unique). stop_memoizing = (n_cached_version / float(n_newly_calculated_version)) < some_threshold and len(cache) > some_value _______________________________________________ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor