#34597: Django ORM query SQL generation changed (and is apparently a lot slower)
-------------------------------------+-------------------------------------
               Reporter:  Lorand     |          Owner:  nobody
  Varga                              |
                   Type:  Bug        |         Status:  new
              Component:  Database   |        Version:  4.2
  layer (models, ORM)                |
               Severity:  Release    |       Keywords:
  blocker                            |
           Triage Stage:             |      Has patch:  0
  Unreviewed                         |
    Needs documentation:  0          |    Needs tests:  0
Patch needs improvement:  0          |  Easy pickings:  0
                  UI/UX:  0          |
-------------------------------------+-------------------------------------
 My app has 2 models (making things simple so debugging is easier):
 {{{
 class Blog(models.Model):
   title = models.CharField(max_length=60, blank=True)
   is_published = models.BooleanField(default=generate_random_bool,
 db_index=True)

 class Translation(models.Model):
   blog = models.ForeignKey('Blog', on_delete=models.CASCADE,
 related_name='translation')
 }}}

 There is a reverse foreign key relation between translation and Blog. Also
 a blog can have multiple translations (or none).

 I have this code block:
 {{{
 from django.db.models import Q
 qs = Blog.objects.filter(Q(is_published=True) & ~Q(translation=None))
 print(qs.query)
 }}}

 django 2.2 ORM generates this output:
 {{{
 SELECT
   "blog"."id",
   "blog"."title",
   "blog"."is_published"
 FROM
   "blog"
 WHERE
   (
     "blog"."is_published" = True
     AND NOT (
       "blog"."id" IN (
         SELECT
           U0."id"
         FROM
           "blog" U0
           LEFT OUTER JOIN "translation" U1 ON (U0."id" = U1."blog_id")
         WHERE
           (
             U1."id" IS NULL
             AND U0."id" = ("blog"."id")
           )
       )
     )
   )
 }}}

 For the same code block the django 3.2+ ORM (tried also the latest version
 of django 4.2.1) it generates:
 {{{
 SELECT
   "blog"."id",
   "blog"."title",
   "blog"."is_published"
 FROM
   "blog"
 WHERE
   (
     "blog"."is_published"
     AND NOT (
       EXISTS(
         SELECT
           (1) AS "a"
         FROM
           "blog" U0
           LEFT OUTER JOIN "translation" U1 ON (U0."id" = U1."blog_id")
         WHERE
           (
             U1."id" IS NULL
             AND U0."id" = "blog"."id"
             AND "blog"."id" = "blog"."id"
           )
         LIMIT
           1
       )
     )
   )
 }}}
 which is a whole lot slower (at least on the same postgresql 13.7
 instance).
 The django 2.2 version executes in a couple of seconds while the newer
 version executes in half an hour.

 I think (but am not completely sure, might be terribly wrong) that the
 issue was introduced here:
 https://github.com/django/django/pull/13300/files

 Environment:
 asgiref==3.7.1
 backports.zoneinfo==0.2.1
 Django==3.2.19
 model-bakery==1.11.0
 psycopg2-binary==2.8.6
 pytz==2023.3
 sqlparse==0.4.4
 typing_extensions==4.6.1

 Db environment:
 PostgreSQL 13.7 on aarch64-unknown-linux-gnu, compiled by gcc (GCC) 7.3.1
 20180712 (Red Hat 7.3.1-6), 64-bit

 Bug does not manifest in django 2.2.12.
 Bug manifests in django 3.2 and django 4.2.

-- 
Ticket URL: <https://code.djangoproject.com/ticket/34597>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/01070188537179c3-15612b98-395c-4d46-814a-db7b517e619c-000000%40eu-central-1.amazonses.com.

Reply via email to