Re: Bulk Delete - Take 3, descriptor style

Russell Keith-Magee Mon, 13 Feb 2006 05:33:31 -0800

On 2/13/06, Robert Wittams <[EMAIL PROTECTED]> wrote:

> Yes, this is exactly what I intended to convey... sorry if I didn't make
> myself clear...


It might take me a few attempts to get the point, but as long as we all
end up on the same page... :-)

> > - q = Article.objects.filter(...).cached(); q.cached() returns a second
> > version of the filter query, with a clean cache
>
> Hm, not really sure if that is obvious. I was assuming cached would be
> something like:
...
> Ie, it just returns itself if it will "do". Query sets can be passed
> around, and the caller is intending to say "make sure this is cached".

I was assuming that cached() would _always_ clone. If the source
QuerySet is non-cached, you get a clone with caching enabled; if the
source is cached, you get a new QuerySet with a clean cache. The use
case I can see is:

p = Article.objects.filter(...) # Original, uncached query
q = p.cached() # Copied, cache version 1
for obj in q: ... # evaluate cache 1
# Add an object that would match q
r = q.cached() # Copy of cached query, cache version 2
for obj in r: ... # evaluate cache 2; new object is in list
for obj in q: ... # Iterate over cache 1; new object not in list

This way, the cache becomes a store of what was in the database when
the query was executed, rather than there being a unique cache for any
given query. More on this later...
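
To make the always-clone behaviour concrete, here is a minimal sketch of
the use case above. The QuerySet class below is a hypothetical stand-in
written for illustration, not the real Django class: a plain list plays
the role of the database, and a predicate plays the role of the filter.

```python
# Hypothetical stand-in for the proposed API; not Django's real QuerySet.
class QuerySet:
    def __init__(self, db, predicate, use_cache=False):
        self._db = db                # shared list standing in for the table
        self._predicate = predicate  # stands in for the filter arguments
        self._use_cache = use_cache
        self._cache = None           # filled on first iteration when caching

    def cached(self):
        # Always clone: the copy has caching enabled and a clean cache,
        # whether or not the source was already cached.
        return QuerySet(self._db, self._predicate, use_cache=True)

    def _run_query(self):
        return [row for row in self._db if self._predicate(row)]

    def __iter__(self):
        if not self._use_cache:
            return iter(self._run_query())
        if self._cache is None:
            self._cache = self._run_query()
        return iter(self._cache)


db = ["xyz-1", "abc-1"]
p = QuerySet(db, lambda h: h.startswith("xyz"))  # original, uncached query
q = p.cached()                                   # copy, cache version 1
first = list(q)                                  # evaluate cache 1
db.append("xyz-2")                               # add an object that matches q
r = q.cached()                                   # copy with a clean cache
second = list(r)                                 # cache 2 sees the new object
again = list(q)                                  # cache 1 does not
```

Under these assumptions, q keeps returning what it saw when it was
first evaluated, while r picks up the new row.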

> And then the .all() really starts to look meaningless.

Yup. Hey, this understanding each other thing is fun! :-)

> So without externalising, the choices are:
>
> a) inconsistency with managers and related objects having different
> default caching behaviour.
> b) the need to have a lot of .cached() being used on related objects,
> when that is likely to be what people want in 95% of situations.

I would tend to go with (b), with a chorus of 'fix this in the
documentation'. i.e., make the documentation very clear about the fact
that a cache is available, and might be a good way to optimize
performance in some cases.

Also - 95%? Really? The performance hit of non-caching only matters if
you iterate over any given QuerySet more than once per http request.
Maybe I'm being unimaginative, but thinking over my common use cases,
multiple iterations over a QuerySet per http request would be the
exception, rather than the rule (or at the very least, nowhere near
95% of all use cases).
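
As a rough illustration of when the cost shows up, the sketch below
counts simulated database hits. CountingQuerySet and its queries
attribute are invented for this example and are not part of any
existing API.

```python
# Hypothetical query set that counts how often it hits the "database".
class CountingQuerySet:
    def __init__(self, rows, use_cache):
        self._rows = rows
        self._use_cache = use_cache
        self._cache = None
        self.queries = 0  # number of simulated database hits

    def _fetch(self):
        self.queries += 1
        return list(self._rows)

    def __iter__(self):
        if not self._use_cache:
            return iter(self._fetch())
        if self._cache is None:
            self._cache = self._fetch()
        return iter(self._cache)


rows = ["a", "b"]
uncached = CountingQuerySet(rows, use_cache=False)
cached = CountingQuerySet(rows, use_cache=True)

for qs in (uncached, cached):
    list(qs)  # first pass over the results
    list(qs)  # second pass within the same request

print(uncached.queries, cached.queries)  # 2 1
```

With a single pass per request, both variants would issue exactly one
query, so the cache only pays off on the second and later iterations.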

There is also an option (c): make the manager a cached query, and
document the need to use non_cached(). If multiple iterations over a
query set really is the 95% use case, it would make sense to me to
cache across the board.

> Externalising caching means that consistent *and* non-ugly behaviour can
> be offered for all "query set entry points" - managers and related objects.

Well - for your definition of consistent and non-ugly, anyway :-)

I think we may have different ideas on the cardinality of the
query-cache relationship. Consider:

p = Article.objects.filter(headline='xyz').cached()
q = Article.objects.filter(headline='xyz').cached()

If I am understanding your caching model correctly, you would ideally
like to see p and q using the same cache. This model is almost
impossible to achieve without externalization; the mapping of this
caching model onto the existing framework (where p and q have
different caches unless q is a clone of p) is very much ugly. Ergo,
pro externalization.

My problem with this model is that p.reset_cache() would clear q's
cache, too. On top of that, there is the question of when the system
automagically flushes the cache (per http request and per transaction
being two reasonable suggestions).
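
A small sketch of the coupling I am worried about, assuming an
externalized cache keyed by the query itself. The names
SharedCacheQuerySet and _shared_cache are hypothetical and exist only
to illustrate the shared-cache model.

```python
# Hypothetical externalized cache: one entry per distinct query, shared by
# every query set that describes the same query.
_shared_cache = {}


class SharedCacheQuerySet:
    def __init__(self, db, headline):
        self._db = db              # shared list standing in for the table
        self._headline = headline

    def _key(self):
        return ("headline", self._headline)

    def __iter__(self):
        key = self._key()
        if key not in _shared_cache:
            _shared_cache[key] = [h for h in self._db if h == self._headline]
        return iter(_shared_cache[key])

    def reset_cache(self):
        # Clears the entry for every query set with the same key.
        _shared_cache.pop(self._key(), None)


db = ["xyz"]
p = SharedCacheQuerySet(db, "xyz")
q = SharedCacheQuerySet(db, "xyz")
q_before = list(q)   # populates the shared entry
db.append("xyz")     # the database changes
p.reset_cache()      # only p was reset...
q_after = list(q)    # ...but q's results change as well
```

Here q was never reset, yet its second iteration returns a different
result because p dropped the entry they both rely on.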

I would argue that p and q should have independent caches. This way,
once you have iterated over p, you know exactly what is there; if you
iterate over p again, you will always get exactly the same result,
regardless of what you have done to q. This makes a populated cache a
snapshot of the state of the database at a given time. No need to
worry if someone/something else has changed the database - my snapshot
is always the same.
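
The snapshot behaviour can be sketched as follows, again with a
hypothetical class (SnapshotQuerySet is not an existing Django name)
and a plain list standing in for the database. Each instance owns its
cache, so nothing done to q can disturb p.

```python
# Hypothetical query set in which every instance owns a private cache.
class SnapshotQuerySet:
    def __init__(self, db, headline):
        self._db = db              # shared list standing in for the table
        self._headline = headline
        self._cache = None         # private to this instance

    def __iter__(self):
        if self._cache is None:
            self._cache = [h for h in self._db if h == self._headline]
        return iter(self._cache)

    def reset_cache(self):
        self._cache = None


db = ["xyz"]
p = SnapshotQuerySet(db, "xyz")
q = SnapshotQuerySet(db, "xyz")
p_first = list(p)    # p's snapshot is taken here
db.append("xyz")     # someone else changes the database
q.reset_cache()      # has no effect on p
q_result = list(q)   # q sees the current state
p_again = list(p)    # p still returns its snapshot
```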

It's a slightly different model of caching, but it is a consistent and
easily explainable model, IMHO non-ugly, and doesn't require
externalization to achieve - it is entirely covered by the existing
framework, plus the modifications we have been discussing.

Russ Magee %-)
