On 02/10/2013 09:32 AM, Walter Prins wrote:
Hello,

I have a program where I'm overriding the retrieval of items from a list.
  As background: The data held by the lists are calculated but then read
potentially many times thereafter, so in order to avoid needlessly
re-calculating the same value over and over, and to remove checking/caching
code from the calculation logic code, I created a subclass of
list that will automatically calculate the value in a given slot
if it has not yet been calculated. (So differently put, I've implemented a
kind of list-specific caching/memoization with the intent that it should be
transparent to the client code.)

The way I've implemented this so far was to simply override
list.__getitem__(self, key) to check if the value needs to be calculated or
not and call a calculation method if required, after which the value is
returned as normal.  On subsequent calls __getitem__ then directly returns
the value without calculating it again.

This worked mostly fine, however yesterday I ran into a slightly unexpected
problem: when the list contents are iterated over and the values
retrieved that way rather than via [], then __getitem__ is in fact
*not* called on the list to read the item values from the list, and
consequently I get back the "not yet calculated" entries in the list,
without the calculation routine being automatically called as is intended.

Here's a test application that demonstrates the issue:

class NotYetCalculated:
     pass

class CalcList(list):
     def __init__(self, calcitem):
         super(CalcList, self).__init__()
         self.calcitem = calcitem

     def __getitem__(self, key):
         """Override __getitem__ to call self.calcitem() if needed"""
         print "CalcList.__getitem__(): Enter"
         value = super(CalcList, self).__getitem__(key)
         if value is NotYetCalculated:
             print "CalcList.__getitem__(): calculating"
             value = self.calcitem(key)
             self[key] = value
         print "CalcList.__getitem__(): return"
         return value

def calcitem(key):
     # Demo: return square of index
     return key*key


def main():
     # Create a list that calculates its contents via a given
     # method/fn once only
     l = CalcList(calcitem)
     # Extend with a few entries to demonstrate issue:
     l.extend([NotYetCalculated, NotYetCalculated, NotYetCalculated,
               NotYetCalculated])

     print "1) Directly getting values from list works as expected: __getitem__ is called:"
     print "Retrieving value [2]:\n", l[2]
     print
     print "Retrieving value [3]:\n", l[3]
     print
     print "Retrieving value [2] again (no calculation this time):\n", l[2]
     print

     print "2) Retrieving values via an iterator doesn't work as expected:"
     print "(__getitem__ is not called and the code returns "
     print " NotYetCalculated entries without calling __getitem__. How do I fix this?)"
     print "List contents:"
     for x in l: print x


if __name__ == "__main__":
     main()

To reiterate:

What should happen:  In test 2) above all entries should be automatically
calculated and output should be numbers only.

What actually happens: In test 2) above the first 2 list entries
corresponding to list indexes 0 and 1 are output as "NotYetCalculated" and
calcitem is not called when required.

What's the best way to fix this problem?  Do I need to maybe override
another method, perhaps provide my own iterator implementation?  For that
matter, why doesn't iterating over the list contents fall back to calling
__getitem__?


Implement your own __iter__() special method.
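
As for why iteration does not fall back to __getitem__: list defines its own __iter__, which reads the stored items directly, and the __getitem__ fallback is only used for classes that have no __iter__ at all. A minimal sketch of the override, reusing the names from the test application (the variable name calc_list is only for illustration, and print() is used with a single argument so the snippet runs on both Python 2 and 3):

```python
class NotYetCalculated(object):
    """Sentinel marking a slot whose value has not been calculated yet."""


class CalcList(list):
    def __init__(self, calcitem):
        super(CalcList, self).__init__()
        self.calcitem = calcitem

    def __getitem__(self, key):
        """Calculate and store the value on first access."""
        value = super(CalcList, self).__getitem__(key)
        if value is NotYetCalculated:
            value = self.calcitem(key)
            self[key] = value
        return value

    def __iter__(self):
        # Route iteration through __getitem__ so that every slot is
        # calculated the first time it is visited.
        for index in range(len(self)):
            yield self[index]


def calcitem(key):
    # Demo: return square of index
    return key * key


calc_list = CalcList(calcitem)
calc_list.extend([NotYetCalculated] * 4)
print([x for x in calc_list])   # prints [0, 1, 4, 9]
```

Other inherited methods (sort(), count(), index(), slicing and so on) still see the raw sentinel entries, which is the reason for the next suggestion.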

And consider whether you might need __setitem__(), __len__(), __setslice__(), __getslice__() and others.

Maybe you'd be better off not inheriting from list at all, and just having an attribute that's a list. It doesn't sound like you're defining a very big subset of list, and overriding the methods you *don't* want seems to be more work than just implementing the ones you do.
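
A rough sketch of that approach, under some assumptions: the class name CalcStore, the fixed size argument, and the private _MISSING sentinel are all made up for illustration, and only integer indexes are supported. Only the operations the client code needs are exposed, so nothing can reach the uncalculated entries by accident.

```python
class CalcStore(object):
    """Hypothetical wrapper that holds a plain list as an attribute."""

    _MISSING = object()   # private sentinel for "not yet calculated"

    def __init__(self, calcitem, size):
        self._calcitem = calcitem
        self._items = [self._MISSING] * size

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        if isinstance(index, slice):
            raise TypeError("slices are not supported in this sketch")
        if index < 0:
            # Normalise so calcitem always receives the real position.
            index += len(self._items)
        value = self._items[index]   # IndexError if out of range
        if value is self._MISSING:
            value = self._calcitem(index)
            self._items[index] = value
        return value

    def __iter__(self):
        for index in range(len(self._items)):
            yield self[index]


store = CalcStore(lambda index: index * index, 4)
print(store[2])      # calculated on first access, prints 4
print(list(store))   # prints [0, 1, 4, 9]
```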

A separate question: is this likely to be a sparse list? If it's very sparse, perhaps you'd consider using a dict, rather than a list attribute.
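
If it is sparse, something along these lines might do. Again this is only a sketch: SparseCalcStore and calculated_count are hypothetical names, and the class assumes any hashable index is valid, so there is no length and no bounds checking.

```python
class SparseCalcStore(object):
    """Hypothetical cache that stores only the values calculated so far."""

    def __init__(self, calcitem):
        self._calcitem = calcitem
        self._cache = {}

    def __getitem__(self, index):
        try:
            return self._cache[index]
        except KeyError:
            # First access: calculate, remember, and return the value.
            value = self._cache[index] = self._calcitem(index)
            return value

    def calculated_count(self):
        """Number of entries that have actually been calculated."""
        return len(self._cache)


sparse = SparseCalcStore(lambda index: index * index)
print(sparse[1000])                 # prints 1000000
print(sparse.calculated_count())    # prints 1
```

Note that this class has no __iter__ and never raises IndexError, so a plain "for x in sparse" would fall back to calling __getitem__ with 0, 1, 2, ... and never stop. If iteration is needed, define __iter__ explicitly over whatever range makes sense.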



--
DaveA