Re: Aggregates and derived tables

Nicolas Lara Tue, 27 May 2008 09:46:45 -0700

Hi Karen,

On Mon, May 26, 2008 at 7:24 PM, Karen Tracey <[EMAIL PROTECTED]> wrote:
> (The other aggregate thread has gotten a bit long and hard to find things in
> so I've started a new thread for this question.)
>


no prob, =)

> Moving on to another of my crossword DB stats pages, what I display is a
> histogram of entries by frequency.  So I can see 45% of the entries in the
> database appear only once, 14% twice, 7% three times, etc. with a huge long
> thin tail of tiny numbers of entries that appear a great number of times.
> (I can also filter by entry length so as to see how the distribution changes
> for entries of different lengths, but I'll leave that for later.)  The SQL I
> use to generate this data uses a derived table (not sure if this qualifies
> as a subquery or just a derived table?).  It is:
>
> select Appearances, Count(*) from (select Count(*) as Appearances from Clues
> group by `Entry ID`) as T1 Group by Appearances with rollup;
>

I would call it a subquery, but don't know what database-theory says
about it  =p

> I'm at a loss for how to to translate this to aggregate/annotate.  It's easy
> enough to do the inner query, something like:
>
> Clues.objects.values('EntryID__EntryID').annotate(Appearances=Count('Entry
> ID'))
>
> but then I want to take the result of that and group by Appearances and
> annotate again with another count, and I want the rollup value so I can
> easily calculate percentages from the count.  I don't know how to do either
> of those things (though the rollup would be easy enough and probably not too
> expensive to do in Python after the fact).
>

There have been some discussion about this on "the other thread". you
can check my last message [1] and Russell's reply to it. (and the
reply i'll make later)

> Is this beyond the scope of what aggregate/annotate can do or am I just
> missing the right way to do it?

By now it is neither in or out of the scope. There just hasn't been
enough discussion to come up with a syntax that would fit the orm well
and provide clean solutions to the problem.
I believe there are three possibilities, one is to always go explicit
and require the user to specify the subquery when she intends to make
it, the other is to allow the subqueries to occur only in some special
cases without the user knowing there is a subquery involved, and the
third is having a bit of both: allowing some syntax to specify
subqueries but also automatically making the subqueries when the use
case requires it.

In aggregates this use case would be something like:
Model.object.annotate(s=Something(...)) .aggregate(SomethingElse(...)
) or, for that matter,  Model.object.annotate(s=Something(...))
.aggregate(SomethingElse('s') )

>
> (Are you beginning to be sorry I started playing with this?)
>

Not at all! your comments are very useful and well founded! (and with
good explanations). But be paitient, maybe in some months I will be =)

Best regards,


Nicolas

[1] http://groups.google.com/group/django-developers/msg/81ae8754c26a0cc1

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~----------~----~----~----~------~----~------~--~---

Re: Aggregates and derived tables

Reply via email to