#34728: OR operator on queryset does not work as expected
-------------------------------------+-------------------------------------
     Reporter:  Kbleser              |                    Owner:  nobody
         Type:  Bug                  |                   Status:  closed
    Component:  Database layer       |                  Version:  4.2
  (models, ORM)                      |
     Severity:  Normal               |               Resolution:  needsinfo
     Keywords:  queryset OR          |             Triage Stage:
                                     |  Unreviewed
    Has patch:  0                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

 TL;DR not an issue with queryset combination but another ticket that
 relates to how the ORM deals with aggregation when more than one multi-
 valued relationship is involved. Either ''invalid'' or ''duplicate'' to
 me.

 ---

 > As you can see on the main branch of the repository linked above by the
 reporter, the query works as expected if it is rewritten to use Q objects

 This happens to work because you perform the aggregation ''before''
 filtering the multi-valued relationship which prevents JOIN reuse
 [https://docs.djangoproject.com/en/4.2/topics/db/aggregation/#order-of-
 annotate-and-filter-clauses as documented] and results in you joining
 twice against against the same multi-valued relationship

 {{{#!sql
 SELECT "recipes_recipe"."id",
        "recipes_recipe"."title",
        "recipes_recipe"."date_posted",
        "recipes_recipe"."created_by_id",
        "recipes_recipe"."image",
        "recipes_recipe"."preparation_time_in_minutes",
        "recipes_recipe"."instructions",
        "recipes_recipe"."level",
        "recipes_recipe"."serving_size",
        COUNT("recipes_ingredienttorecipe"."ingredient_id_id") AS
 "shared_ingredients"
 FROM "recipes_recipe"
 LEFT OUTER JOIN "recipes_ingredienttorecipe" ON ("recipes_recipe"."id" =
 "recipes_ingredienttorecipe"."recipe_id_id")
 INNER JOIN "recipes_ingredienttorecipe" T4 ON ("recipes_recipe"."id" =
 T4."recipe_id_id")
 LEFT OUTER JOIN "recipes_tagtorecipe" ON ("recipes_recipe"."id" =
 "recipes_tagtorecipe"."recipe_id_id")
 WHERE (T4."ingredient_id_id" IN
          (SELECT U0."id"
           FROM "recipes_ingredient" U0
           INNER JOIN "recipes_ingredienttorecipe" U1 ON (U0."id" =
 U1."ingredient_id_id")
           WHERE U1."recipe_id_id" = 1)
        AND NOT ("recipes_recipe"."id" = 1))
 GROUP BY "recipes_recipe"."id",
          "recipes_recipe"."title",
          "recipes_recipe"."date_posted",
          "recipes_recipe"."created_by_id",
          "recipes_recipe"."image",
          "recipes_recipe"."preparation_time_in_minutes",
          "recipes_recipe"."instructions",
          "recipes_recipe"."level",
          "recipes_recipe"."serving_size",
          "recipes_tagtorecipe"."tag_id_id"
 HAVING (("recipes_tagtorecipe"."tag_id_id" IN
            (SELECT U0."id"
             FROM "recipes_tag" U0
             INNER JOIN "recipes_tagtorecipe" U1 ON (U0."id" =
 U1."tag_id_id")
             WHERE U1."recipe_id_id" = 1)
          AND COUNT("recipes_ingredienttorecipe"."ingredient_id_id") >= 2)
         OR COUNT("recipes_ingredienttorecipe"."ingredient_id_id") >= 4);
 }}}

 Notice how `recipes_ingredienttorecipe` is joined twice which means
 `COUNT("recipes_ingredienttorecipe"."ingredient_id_id") AS
 "shared_ingredients"` will actually be the `COUNT` of the
 [http://charettes.name/djangoconus2022/slides.html#21 product of rows]
 `2X2=4` (see #10060). In other words, aggregate annotations when many
 multi-valued relationships are involved are broken and have been for while
 and in your case it's the
 `COUNT("recipes_ingredienttorecipe"."ingredient_id_id") >= 4)` that is
 allowing your test to pass on `main`. The fact this problems manifests
 itself in queryset combination when filtering against broken aggregate
 annotation is just a symptom of a much larger problem.

 If you define one of your aggregate annotations as a subquery to avoid
 joining more than one multi-valued relationship at once (#28296) things
 properly work

 {{{#!python
 def find_similar_recipes(self):
     return (
         Recipe.objects.annotate(
             shared_ingredients_cnt=Count(
                 "ingredients",
                 filter=Q(ingredients__in=self.ingredients.all()),
             ),
             # Using a subquery to target the _other_ multi-valued
 relationship avoids X product of rows
             # with ingredients
             shares_tag=Exists(
                 TagToRecipe.objects.filter(
                     tag_id__in=self.tags.all(),
                     recipe_id=OuterRef("pk"),
                 )
             ),
         )
         .filter(
             Q(shared_ingredients_cnt__gte=4)
             | Q(shares_tag=True, shared_ingredients_cnt__gte=2)
         )
         .exclude(id=self.id)
     )
 }}}

 Which can also use queryset combination without issues

 {{{#!python
 def find_similar_recipes(self):
     base = Recipe.objects.annotate(
         shared_ingredients_cnt=Count(
             "ingredients",
             filter=Q(ingredients__in=self.ingredients.all()),
         )
     ).exclude(id=self.id)
     same_tag_and_at_least_two_shared_ingredients = base.annotate(
         shares_tag=Exists(
             TagToRecipe.objects.filter(
                 tag_id__in=self.tags.all(),
                 recipe_id=OuterRef("pk"),
             )
         )
     ).filter(Q(shares_tag=True, shared_ingredients_cnt__gte=2))
     at_least_four_shared_ingredients =
 base.filter(shared_ingredients_cnt__gte=4)
     return (
         same_tag_and_at_least_two_shared_ingredients |
 at_least_four_shared_ingredients
     )
 }}}

-- 
Ticket URL: <https://code.djangoproject.com/ticket/34728#comment:14>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/010701899a2003f3-54853793-f8ac-4829-8740-a5bba867f89d-000000%40eu-central-1.amazonses.com.

Reply via email to