I'm trying to inject a table into my query, but need to do an
INNER JOIN with a custom ON clause. This is due to the volumes
of data in the joined table. Without being able to do something like
... INNER JOIN tblBig b ON
(b.other_id = other.id AND
b.foo = 42 AND
b.bar = 'zip'
)
the cartesian product makes the join huge. I've performance
tested the resulting queries at the psql prompt, and the above
version runs in under 2 seconds. Doing the logical equiv of
..., tblBig b
WHERE
b.other_id = other.id AND
b.foo = 42 AND
b.bar = 'zip'
takes upwards of 10-15 minutes. My figuring is that it's doing a
full cartesian product of the two tables (producing billions of
rows) and then filtering that result. By filtering in the ON
clause, I can reduce the impact of having 600,000+ rows in tblBig
because it only does the product of the outside stuff with the
germane data.
I'm currently mucking under the hood with a custom object that
toys with the query, but am fighting with it to a degree that
someone else may have already done this or would have hints on
best practices.
Thanks,
-tkc
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups
"Django users" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at
http://groups.google.com/group/django-users?hl=en
-~----------~----~----~----~------~----~------~--~---