I'm trying to inject a table into my query, but need to do an 
INNER JOIN with a custom ON clause.  This is due to the volumes 
of data in the joined table.  Without being able to do something like


        ... INNER JOIN tblBig b ON
        (b.other_id = other.id AND
        b.foo = 42 AND
        b.bar = 'zip'
        )

the cartesian product makes the join huge.  I've performance 
tested the resulting queries at the psql prompt, and the above 
version runs in under 2 seconds.  Doing the logical equiv of


        ..., tblBig b
        WHERE
        b.other_id = other.id AND
        b.foo = 42 AND
        b.bar = 'zip'

takes upwards of 10-15 minutes.  My figuring is that it's doing a 
full cartesian product of the two tables (producing billions of 
rows) and then filtering that result.  By filtering in the ON 
clause, I can reduce the impact of having 600,000+ rows in tblBig 
because it only does the product of the outside stuff with the 
germane data.

I'm currently mucking under the hood with a custom object that 
toys with the query, but am fighting with it to a degree that 
someone else may have already done this or would have hints on 
best practices.

Thanks,

-tkc




--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-users?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to