#36552: Repeating QuerySet.filter() generates unwanted LEFT OUTER JOINs
-------------------------------------+-------------------------------------
     Reporter:  Márton Salomváry     |                    Owner:  (none)
         Type:                       |                   Status:  closed
  Cleanup/optimization               |
    Component:  Database layer       |                  Version:  5.0
  (models, ORM)                      |
     Severity:  Normal               |               Resolution:  invalid
     Keywords:  chained-filters      |             Triage Stage:
                                     |  Unreviewed
    Has patch:  0                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
Comment (by Márton Salomváry):

 Replying to [comment:5 Natalia Bidart]:
 > The thing is that the commit you reference fixes a bug.

 I understand that. What I can't decide is whether what I am seeing is
 another bug introduced by that bugfix, or I have been successfully relying
 on the existence of bug for quite long :)

 >
 > This is documented in this
 [https://docs.djangoproject.com/en/5.2/topics/db/queries/#spanning-multi-
 valued-relationships section].
 >
 > > I would appreciate feedback on how the original query does **not**
 align with Django's ORM patterns. The documentation does not discourage
 repeat-calls to `filter()` in order to narrow down results.
 >
 > Chaining `.filter()` is a normal pattern, but when multi-valued
 relationships are involved the docs note that it can change semantics and
 have query/performance implications. If `JOIN` reuse is necessary due to
 performance requirements, combining conditions in a single call is the ORM
 pattern to follow.

 Oh thanks, now I remember having been bitten by this in the past, but not
 often and painfully enough to remember the peculiarities of joins with
 multi-valued relationships!

 >
 > > In any case, I will try to provide more input on how this breaks the
 **results** of our queries.
 >
 > That would certainly help, thank you!

 In this example, count is wrongly `2` in Django 5.0 and correctly `1` in
 Django 4.x:

 https://github.com/salomvary/django-5-filter-
 regression/commit/f3cabd4e8987210e9aa17c99bfa622cfc7c580d8

 {{{
 alpha = Alpha.objects.create(id=7777)
 bravo = alpha.bravos.create()
 charlie1 = Charlie.objects.create(id=9999)
 charlie2 = Charlie.objects.create(id=5555)
 CharlieBravo.objects.create(bravo=bravo, charlie=charlie1)
 CharlieBravo.objects.create(bravo=bravo, charlie=charlie2)
 bravo.deltas.create()

 queryset = (
     Alpha.objects.filter(
         bravos__charlie_bravos__charlie=9999,
     )
     .filter(
         Exists(
             Delta.objects.filter(
                 bravos__id=OuterRef("bravos__charlie_bravos__bravo_id"),
             )
         )
     )
     .annotate(count=Count(
         "bravos__charlie_bravos__bravo"
     ))
     .values_list("pk", "count")
 )

 self.assertEqual(
     [(7777, 1)],
     list(queryset)
 )
 }}}

 Now I wonder how to spot these in an existing code base, and prevent
 shooting myself in the foot in the future. I guess I can monkey-patch
 QuerySets to log a warning if repeated `filter()` is used with multi-
 valued relationships and force using a single `filter()` and combining
 conditions with `&` or `|`.
-- 
Ticket URL: <https://code.djangoproject.com/ticket/36552#comment:6>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion visit 
https://groups.google.com/d/msgid/django-updates/01070198be4fe44c-f819a80b-c9e5-4e1e-95ce-ef4d581f0d7b-000000%40eu-central-1.amazonses.com.

Reply via email to