#33973: Performance regression when moving from 3.1 to 3.2
-------------------------------------+-------------------------------------
     Reporter:  Marc Parizeau        |                    Owner:  nobody
         Type:  Uncategorized        |                   Status:  new
    Component:  Uncategorized        |                  Version:  3.2
     Severity:  Normal               |               Resolution:
     Keywords:  performance          |             Triage Stage:
  regression                         |  Unreviewed
    Has patch:  0                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------

Comment (by Marc Parizeau):

 After more digging, it appears that the two worst culprits in my query are
 the 1st and 3rd `Exists`. The 1st is about 10x slower on 3.2.15 than on
 3.1.14. The 3rd is about 6x slower on 3.2.15 than on 3.1.14. The execution
 times for the 2nd and 4th `Exists` are about the same for 3.2 and 3.1.
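
 For anyone wanting to reproduce per-`Exists` timings like these, here is a
 minimal sketch of a timing helper. It assumes the durations are wall-clock
 differences from `time.perf_counter()`; the helper name `timed` and the
 stand-in workload are hypothetical, not taken from the project.

```python
import time


def timed(run_query):
    """Run a zero-argument callable and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = run_query()
    elapsed = time.perf_counter() - start
    return result, elapsed


# Stand-in workload. With Django one would pass something like
# `lambda: list(queryset)` so that the lazy queryset is actually evaluated.
rows, seconds = timed(lambda: list(range(1000)))
print(f"time {seconds}")
```

 Running the same callable under each Django version, with one `Exists`
 enabled at a time, gives figures that can be compared directly.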

 If I keep only the 1st `Exists` in my query, commenting out the 3 others
 and adding a `False` to satisfy the 2-argument minimum of `Greatest`, I
 obtain the following query plans (I also removed the `order_by` and
 `select_related` clauses to simplify the query as much as possible):

 On Django 3.1.14:

 {{{
 time 0.03233041700000072
 Sort  (cost=25581.33..25581.33 rows=1 width=1207)
   Sort Key: content_course.uid, content_version.start DESC
   ->  Nested Loop Left Join  (cost=0.14..25581.32 rows=1 width=1207)
         Join Filter: (content_content.version_id = content_version.id)
         ->  Nested Loop Left Join  (cost=0.14..13437.94 rows=1 width=1202)
               Join Filter: (content_content.course_id = content_course.id)
               ->  Nested Loop  (cost=0.14..13436.58 rows=1 width=1124)
                     ->  Seq Scan on content_teacher  (cost=0.00..7.65 rows=1 width=8)
                           Filter: (user_id = 59)
                     ->  Index Scan using content_semester_pkey on content_content  (cost=0.14..12150.00 rows=1 width=1124)
                           Index Cond: (id = content_teacher.content_id)
                           Filter: ((id <> 58) AND GREATEST((hashed SubPlan 8), false))
                           SubPlan 8
                             ->  Seq Scan on forum_thread v0_1  (cost=12114.13..13913.92 rows=14672 width=8)
                                   Filter: (NOT (hashed SubPlan 7))
                                   SubPlan 7
                                     ->  Bitmap Heap Scan on forum_threadentry u1_1  (cost=75.76..12097.65 rows=6592 width=8)
                                           Recheck Cond: (user_id = 59)
                                           Filter: (thread_id IS NOT NULL)
                                           ->  Bitmap Index Scan on forum_threadentry_user_id_585e5649  (cost=0.00..74.11 rows=6625 width=0)
                                                 Index Cond: (user_id = 59)
               ->  Seq Scan on content_course  (cost=0.00..1.16 rows=16 width=86)
         ->  Seq Scan on content_version  (cost=0.00..1.24 rows=24 width=12)
         SubPlan 2
           ->  Bitmap Heap Scan on forum_thread v0  (cost=12139.37..13583.67 rows=587 width=0)
                 Recheck Cond: (content_id = content_content.id)
                 Filter: (NOT (hashed SubPlan 1))
                 ->  Bitmap Index Scan on forum_question_semester_id_e4f29334  (cost=0.00..25.09 rows=1174 width=0)
                       Index Cond: (content_id = content_content.id)
                 SubPlan 1
                   ->  Bitmap Heap Scan on forum_threadentry u1  (cost=75.76..12097.65 rows=6592 width=8)
                         Recheck Cond: (user_id = 59)
                         Filter: (thread_id IS NOT NULL)
                         ->  Bitmap Index Scan on forum_threadentry_user_id_585e5649  (cost=0.00..74.11 rows=6625 width=0)
                               Index Cond: (user_id = 59)
 }}}

 On Django 3.2.15:

 {{{
 time 0.30839854199999195
 Sort  (cost=62.30..62.31 rows=1 width=1207)
   Sort Key: content_course.uid, content_version.start DESC
   ->  Nested Loop Left Join  (cost=0.41..62.29 rows=1 width=1207)
         ->  Nested Loop Left Join  (cost=0.28..40.58 rows=1 width=1202)
               ->  Nested Loop  (cost=0.14..39.68 rows=1 width=1124)
                     ->  Seq Scan on content_teacher  (cost=0.00..7.65 rows=1 width=8)
                           Filter: (user_id = 59)
                     ->  Index Scan using content_semester_pkey on content_content  (cost=0.14..28.99 rows=1 width=1124)
                           Index Cond: (id = content_teacher.content_id)
                           Filter: ((id <> 58) AND GREATEST((SubPlan 3), false))
                           SubPlan 3
                             ->  Nested Loop Anti Join  (cost=0.43..18547.19 rows=909 width=0)
                                   ->  Seq Scan on forum_thread v0_1  (cost=0.00..1799.79 rows=1174 width=8)
                                         Filter: (content_id = content_content.id)
                                   ->  Index Only Scan using forum_threadentry_thread_id_user_id_cd21c4a5_uniq on forum_threadentry u1_1  (cost=0.43..55.66 rows=33 width=8)
                                         Index Cond: ((thread_id = v0_1.id) AND (user_id = 59))
               ->  Index Scan using content_course_pkey on content_course  (cost=0.14..0.82 rows=1 width=86)
                     Index Cond: (id = content_content.course_id)
         ->  Index Scan using content_version_pkey on content_version  (cost=0.14..0.82 rows=1 width=12)
               Index Cond: (id = content_content.version_id)
         SubPlan 1
           ->  Nested Loop Anti Join  (cost=0.43..18547.19 rows=909 width=0)
                 ->  Seq Scan on forum_thread v0  (cost=0.00..1799.79 rows=1174 width=8)
                       Filter: (content_id = content_content.id)
                 ->  Index Only Scan using forum_threadentry_thread_id_user_id_cd21c4a5_uniq on forum_threadentry u1  (cost=0.43..55.66 rows=33 width=8)
                       Index Cond: ((thread_id = v0.id) AND (user_id = 59))
 }}}

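 So the two Django versions do lead to different strategies: the 3.1 plan
 filters `forum_thread` with `NOT (hashed SubPlan)`, while the 3.2 plan
 uses a `Nested Loop Anti Join` against `forum_threadentry`. It therefore
 appears that the regression stems from how the
 `exclude(threadentry__user=user)` clause is implemented.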
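
 To make the difference concrete, here is a small sketch of the two SQL
 shapes these plans seem to correspond to: a `NOT IN` subquery (which
 PostgreSQL can hash once) and a correlated `NOT EXISTS` (which it plans as
 an anti join). This is my reconstruction from the plans, not the SQL that
 Django actually emitted. It uses an in-memory SQLite database with made-up
 rows, keeps only the columns visible in the plans, and the helper name
 `threads_without_entry_from` is hypothetical. It only shows that the two
 forms select the same rows; it says nothing about PostgreSQL timings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE forum_thread (id INTEGER PRIMARY KEY, content_id INTEGER);
    CREATE TABLE forum_threadentry (
        id INTEGER PRIMARY KEY,
        thread_id INTEGER,
        user_id INTEGER
    );
    -- Made-up sample rows: user 59 has an entry in thread 1 only.
    INSERT INTO forum_thread (id, content_id) VALUES (1, 10), (2, 10), (3, 11);
    INSERT INTO forum_threadentry (thread_id, user_id)
        VALUES (1, 59), (2, 7), (NULL, 59);
""")

# Shape suggested by the 3.1 plan: uncorrelated subquery, NOT IN.
NOT_IN_SQL = """
    SELECT v0.id FROM forum_thread v0
    WHERE v0.content_id = ?
      AND v0.id NOT IN (
          SELECT u1.thread_id FROM forum_threadentry u1
          WHERE u1.user_id = ? AND u1.thread_id IS NOT NULL
      )
    ORDER BY v0.id
"""

# Shape suggested by the 3.2 plan: correlated subquery, NOT EXISTS.
NOT_EXISTS_SQL = """
    SELECT v0.id FROM forum_thread v0
    WHERE v0.content_id = ?
      AND NOT EXISTS (
          SELECT 1 FROM forum_threadentry u1
          WHERE u1.user_id = ? AND u1.thread_id = v0.id
      )
    ORDER BY v0.id
"""


def threads_without_entry_from(sql, content_id, user_id):
    """Return ids of threads in a content that have no entry by the user."""
    return [row[0] for row in conn.execute(sql, (content_id, user_id))]


not_in_ids = threads_without_entry_from(NOT_IN_SQL, 10, 59)
not_exists_ids = threads_without_entry_from(NOT_EXISTS_SQL, 10, 59)
print(not_in_ids, not_exists_ids)
```

 Running both statements under `EXPLAIN ANALYZE` on the real PostgreSQL
 database would show whether the anti join is what costs the extra time.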

 I hope this information helps a guru isolate the source of the problem
 (SQL is out of my league).

-- 
Ticket URL: <https://code.djangoproject.com/ticket/33973#comment:1>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
