Russell Keith-Magee wrote:
However, rather than having cached() as a factory/proxy method on the
manager, wouldn't a better approach be to make cached() a 'cache enabling'
clone method on QuerySet?

Er, yeah. That was kind of assumed. I'm still working from the assumption that we want to make .objects a queryset.

i.e., cached() works the same way as filter():
it returns a clone, but with caching enabled. Example usage:

- Article.objects is the only way to get a clean query set, caching disabled
by default
- Article.objects.cached() takes the base query set, and returns a clone
with caching enabled
- Article.objects.filter(...) returns a filtered, uncached query,
- Article.objects.filter(...).cached() is the same query, but with a cache.

Yes, this is exactly what I intended to convey... sorry if I didn't make myself clear...

- q = Article.objects.filter(...).cached(); q.cached() returns a second
version of the filter query, with a clean cache

Hm, not really sure if that is obvious. I was assuming cached would be something like:

def cached(self):
    if not self._cached:
        other = self.clone()
        other._cached = True
        return other
    else:
        return self

I.e., it just returns itself if it will "do". Query sets can be passed around, and a caller using cached() is saying "make sure this is cached", not "give me a fresh cache".

For your intention, I imagine
p = q.clone()
p.reset_cache()

conveys the meaning a bit better. Does that make sense?
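
To make the difference concrete, here is a minimal sketch of both behaviours. ToyQuerySet, its fetch callable and its results() method are hypothetical stand-ins invented for illustration; this is not the real QuerySet class, and it assumes clone() carries the existing result cache across.

```python
class ToyQuerySet:
    """Hypothetical stand-in for a query set; not Django's real class."""

    def __init__(self, fetch, cached=False):
        self._fetch = fetch          # callable standing in for a database hit
        self._cached = cached        # whether results are retained
        self._result_cache = None    # retained results, if any

    def clone(self):
        # Assumption: a clone shares the query, the flag and any retained results.
        other = ToyQuerySet(self._fetch, self._cached)
        other._result_cache = self._result_cache
        return other

    def cached(self):
        # Return self if it will "do"; otherwise a caching clone.
        if self._cached:
            return self
        other = self.clone()
        other._cached = True
        return other

    def reset_cache(self):
        # Harmless on an uncached query set: there is nothing to drop.
        self._result_cache = None

    def results(self):
        if self._cached and self._result_cache is not None:
            return self._result_cache
        rows = self._fetch()
        if self._cached:
            self._result_cache = rows
        return rows


hits = []

def fetch():
    hits.append(1)               # record each simulated database hit
    return ["first", "second"]

q = ToyQuerySet(fetch).cached()
same = q.cached()                # already cached, so this is q itself
q.results()
q.results()                      # served from q's cache, no new hit

p = q.clone()
p.reset_cache()                  # p starts clean; q keeps its results
p.results()                      # causes a second hit
```

With this shape, calling cached() twice is harmless, and asking for a clean cache is an explicit, separate step.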

This also removes the need to proxy the QuerySet methods through the
manager. This really appeals to me, because the Article.objects.filter()
notation in the current implementation bugs me - to me, it should be
Article.objects.all().filter().

And then the .all() really starts to look meaningless.

It might also be a good idea to add a 'reset_cache()' method to allow
developers to reset the cache on a QuerySet (no-op if the query is not
cached), and maybe even a 'non_cached()' method to return a cloned QuerySet
with the cache disabled;

Yes, I think these are necessary.

Externalized caching is an interesting idea, but worries me slightly.
Caching is one of those areas where the exact behaviour that you want can be
highly application and situation dependent. I am more comfortable with
making the developer take responsibility for what they want to
cache, when they want to cache it, and the lifespan of that cache.
Particularly if we can make the default behaviour 'don't cache', with
caching being an opt-in behaviour (which the approach I described earlier on
this mail would do).


Also remember that related objects have a queryset attached to them, e.g. my_article.reporters. The assumption so far has been that this will be cached by default, i.e. you would use .non_cached() on them to get a non-caching version. This does make it harder to tell whether caching is on or not (but it also kind of makes sense, as there are very few situations where you want this uncached).

The only thing that was assumed to be uncached was .objects, as it lives in a class which sticks around.

So without externalising, the choices are:

a) inconsistency, with managers and related objects having different default caching behaviour.
b) the need for a lot of .cached() calls on related objects, when caching is likely to be what people want in 95% of situations.

A global cache that automagically works out the most recent version of a
query has all sorts of potential for expectation to differ from
implementation, (i.e., developer writes their app, doesn't get the result
they expect because the caching implementation cleared a cache when they
were not expecting it to be, and complains "why doesn't it work").

Externalising caching means that consistent *and* non-ugly behaviour can be offered for all "query set entry points" - managers and related objects.

The choice of consistent behaviour is between:
a) All caches last for the HTTP request. In a non-HTTP context, use boundary functions (or a with statement in Python 2.5).
b) Caches last for the length of a db transaction. In an HTTP context, the db transaction will last the length of an HTTP request unless explicitly changed using view decorators.

In both:
Use .non_cached() to get a non caching query set. Use .reset_cache() to reset the cache in a query set.

To boot,
it will be a little hairy to implement, there will be cries of 'coupling',
etc... it seems like asking for a lot of headaches without a whole lot of
benefit at the end of the day.

I think it is clearly a better solution, but I don't have the time to do it myself. The first solution is better than the current situation though.
