Russell Keith-Magee wrote:
However, rather than having cached() as a factory/proxy method on the
manager, wouldn't a better approach be to make cached() a 'cache enabling'
clone method on QuerySet?
Er, yeah. That was kind of assumed. I'm still working from the
assumption that we want to make .objects a queryset.
i.e., cached() has the same operation as filter() -
it returns a clone, but with caching enabled. Example usage:
- Article.objects is the only way to get a clean query set, caching disabled
by default
- Article.objects.cached() takes the base query set, and returns a clone
with caching enabled
- Article.objects.filter(...) returns a filtered, uncached query,
- Article.objects.filter(...).cached() is the same query, but with a cache.
Yes, this is exactly what I intended to convey... sorry if I didn't make
myself clear...
- q = Article.objects.filter(...).cached(); q.cached() returns a second
version of the filter query, with a clean cache
Hm, not really sure if that is obvious. I was assuming cached would be
something like:
    def cached(self):
        if not self._cached:
            other = self.clone()
            other._cached = True
            return other
        else:
            return self
I.e., it just returns itself if it will "do". Query sets can be passed
around, and the caller is intending to say "make sure this is cached".
For your intention, I imagine
    p = q.clone()
    p.reset_cache()
conveys the meaning a bit better. Does that make sense?
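To make the difference concrete, here is a minimal sketch of the two behaviours. It is a toy stand-in, not Django's actual QuerySet: the constructor, the source callable, _filters and _result_cache are all made up for illustration, and it assumes that clone() carries the current cache along.

```python
class QuerySet:
    """Toy stand-in for the proposed query set; not Django's real class."""

    def __init__(self, source, filters=(), cached=False):
        self._source = source            # callable returning rows; stands in for the database
        self._filters = tuple(filters)
        self._cached = cached
        self._result_cache = None

    def clone(self):
        # Assumption: a clone carries the current cache along, which is why
        # the reply pairs clone() with reset_cache().
        other = QuerySet(self._source, self._filters, self._cached)
        other._result_cache = self._result_cache
        return other

    def filter(self, predicate):
        other = self.clone()
        other._filters += (predicate,)
        other.reset_cache()              # cached rows no longer match the narrower query
        return other

    def cached(self):
        # Return self if caching is already on; otherwise a caching clone.
        if self._cached:
            return self
        other = self.clone()
        other._cached = True
        return other

    def reset_cache(self):
        # Harmless on an uncached query set, which never stores rows.
        self._result_cache = None

    def __iter__(self):
        if self._cached and self._result_cache is not None:
            return iter(self._result_cache)
        rows = [r for r in self._source() if all(f(r) for f in self._filters)]
        if self._cached:
            self._result_cache = rows
        return iter(rows)


rows = [1, 2, 3]
q = QuerySet(lambda: list(rows)).filter(lambda r: r > 1).cached()
first = list(q)      # runs the query and stores the rows
rows.append(4)
stale = list(q)      # still served from q's cache
p = q.clone()
p.reset_cache()
fresh = list(p)      # p starts clean and sees the new row
```

With this version q.cached() hands back q itself, so a caller who only wants to "make sure this is cached" never throws away rows that were already fetched.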
This also removes the need to proxy the QuerySet methods through the
manager. This really appeals to me, because the Article.objects.filter()
notation in the current implementation bugs me - to me, it should be
Article.objects.all().filter().
And then the .all() really starts to look meaningless.
It might also be a good idea to add a 'reset_cache()' method to allow
developers to reset the cache on a QuerySet (no-op if the query is not
cached), and maybe even a 'non_cached()' method to return a cloned QuerySet
with the cache disabled;
Yes, I think these are necessary.
Externalized caching is an interesting idea, but worries me slightly.
Caching is one of those areas where the exact behaviour that you want can be
highly application and situation dependent. I am more comfortable with
making the developer take responsibility for what they want to
cache, when they want to cache it, and the lifespan of that cache.
Particularly if we can make the default behaviour 'don't cache', with
caching being an opt-in behaviour (which the approach I described earlier on
this mail would do).
Also remember that related objects have a queryset attached to them, e.g.
my_article.reporters. The assumption so far has been that this will be
cached by default, i.e. you would use .non_cached() on them to get a
non-caching version. This does make it harder to understand whether
caching is on or not (but it also kind of makes sense, as there are very
few situations where you want this uncached).
The only thing that was assumed to be uncached was the .objects as it is
in a class which sticks around.
So without externalising, the choices are:
a) inconsistency with managers and related objects having different
default caching behaviour.
b) the need to have a lot of .cached() being used on related objects,
when that is likely to be what people want in 95% of situations.
A global cache that automagically works out the most recent version of a
query has all sorts of potential for expectation to differ from
implementation (i.e., developer writes their app, doesn't get the result
they expect because the caching implementation cleared a cache when they
were not expecting it to, and complains "why doesn't it work").
Externalising caching means that consistent *and* non-ugly behaviour can
be offered for all "query set entry points" - managers and related objects.
The choice of consistent behaviour is between:
a) All caches last the length of the HTTP request. In a non-HTTP context,
use these boundary functions (or a with statement in Python 2.5).
b) Caches last the length of a db transaction. In an http context, the
db transaction will last the length of an http request unless explicitly
changed using these view decorators.
In both:
Use .non_cached() to get a non caching query set. Use .reset_cache()
to reset the cache in a query set.
To boot,
it will be a little hairy to implement, there will be cries of 'coupling',
etc... it seems like asking for a lot of headaches without a whole lot of
benefit at the end of the day.
I think it is clearly a better solution, but I don't have the time to do
it myself. The first solution is better than the current situation though.