#33973: Performance regression when moving from 3.1 to 3.2
-------------------------------------+-------------------------------------
Reporter: Marc Parizeau | Owner: nobody
Type: Uncategorized | Status: new
Component: Uncategorized | Version: 3.2
Severity: Normal | Resolution:
Keywords: performance regression | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Marc Parizeau):
After more digging, it appears that the two worst culprits in my query are
the 1st and 3rd `Exists`. The 1st is about 10x slower on 3.2.15 than on
3.1.14. The 3rd is about 6x slower on 3.2.15 than on 3.1.14. The execution
times for the 2nd and 4th `Exists` are about the same for 3.2 and 3.1.
If I keep only the 1st `Exists` in my query, commenting out the 3 others
and adding a `False` to satisfy the `Greatest` requirement of at least 2
arguments, I obtain the following explanations (I also removed the
`order_by` and `select_related` clauses to simplify the query as much as
possible):
On Django 3.1.14:
{{{
time 0.03233041700000072
Sort (cost=25581.33..25581.33 rows=1 width=1207)
  Sort Key: content_course.uid, content_version.start DESC
  ->  Nested Loop Left Join (cost=0.14..25581.32 rows=1 width=1207)
        Join Filter: (content_content.version_id = content_version.id)
        ->  Nested Loop Left Join (cost=0.14..13437.94 rows=1 width=1202)
              Join Filter: (content_content.course_id = content_course.id)
              ->  Nested Loop (cost=0.14..13436.58 rows=1 width=1124)
                    ->  Seq Scan on content_teacher (cost=0.00..7.65 rows=1 width=8)
                          Filter: (user_id = 59)
                    ->  Index Scan using content_semester_pkey on content_content (cost=0.14..12150.00 rows=1 width=1124)
                          Index Cond: (id = content_teacher.content_id)
                          Filter: ((id <> 58) AND GREATEST((hashed SubPlan 8), false))
                          SubPlan 8
                            ->  Seq Scan on forum_thread v0_1 (cost=12114.13..13913.92 rows=14672 width=8)
                                  Filter: (NOT (hashed SubPlan 7))
                                  SubPlan 7
                                    ->  Bitmap Heap Scan on forum_threadentry u1_1 (cost=75.76..12097.65 rows=6592 width=8)
                                          Recheck Cond: (user_id = 59)
                                          Filter: (thread_id IS NOT NULL)
                                          ->  Bitmap Index Scan on forum_threadentry_user_id_585e5649 (cost=0.00..74.11 rows=6625 width=0)
                                                Index Cond: (user_id = 59)
              ->  Seq Scan on content_course (cost=0.00..1.16 rows=16 width=86)
        ->  Seq Scan on content_version (cost=0.00..1.24 rows=24 width=12)
        SubPlan 2
          ->  Bitmap Heap Scan on forum_thread v0 (cost=12139.37..13583.67 rows=587 width=0)
                Recheck Cond: (content_id = content_content.id)
                Filter: (NOT (hashed SubPlan 1))
                ->  Bitmap Index Scan on forum_question_semester_id_e4f29334 (cost=0.00..25.09 rows=1174 width=0)
                      Index Cond: (content_id = content_content.id)
                SubPlan 1
                  ->  Bitmap Heap Scan on forum_threadentry u1 (cost=75.76..12097.65 rows=6592 width=8)
                        Recheck Cond: (user_id = 59)
                        Filter: (thread_id IS NOT NULL)
                        ->  Bitmap Index Scan on forum_threadentry_user_id_585e5649 (cost=0.00..74.11 rows=6625 width=0)
                              Index Cond: (user_id = 59)
}}}
On Django 3.2.15:
{{{
time 0.30839854199999195
Sort (cost=62.30..62.31 rows=1 width=1207)
  Sort Key: content_course.uid, content_version.start DESC
  ->  Nested Loop Left Join (cost=0.41..62.29 rows=1 width=1207)
        ->  Nested Loop Left Join (cost=0.28..40.58 rows=1 width=1202)
              ->  Nested Loop (cost=0.14..39.68 rows=1 width=1124)
                    ->  Seq Scan on content_teacher (cost=0.00..7.65 rows=1 width=8)
                          Filter: (user_id = 59)
                    ->  Index Scan using content_semester_pkey on content_content (cost=0.14..28.99 rows=1 width=1124)
                          Index Cond: (id = content_teacher.content_id)
                          Filter: ((id <> 58) AND GREATEST((SubPlan 3), false))
                          SubPlan 3
                            ->  Nested Loop Anti Join (cost=0.43..18547.19 rows=909 width=0)
                                  ->  Seq Scan on forum_thread v0_1 (cost=0.00..1799.79 rows=1174 width=8)
                                        Filter: (content_id = content_content.id)
                                  ->  Index Only Scan using forum_threadentry_thread_id_user_id_cd21c4a5_uniq on forum_threadentry u1_1 (cost=0.43..55.66 rows=33 width=8)
                                        Index Cond: ((thread_id = v0_1.id) AND (user_id = 59))
              ->  Index Scan using content_course_pkey on content_course (cost=0.14..0.82 rows=1 width=86)
                    Index Cond: (id = content_content.course_id)
        ->  Index Scan using content_version_pkey on content_version (cost=0.14..0.82 rows=1 width=12)
              Index Cond: (id = content_content.version_id)
        SubPlan 1
          ->  Nested Loop Anti Join (cost=0.43..18547.19 rows=909 width=0)
                ->  Seq Scan on forum_thread v0 (cost=0.00..1799.79 rows=1174 width=8)
                      Filter: (content_id = content_content.id)
                ->  Index Only Scan using forum_threadentry_thread_id_user_id_cd21c4a5_uniq on forum_threadentry u1 (cost=0.43..55.66 rows=33 width=8)
                      Index Cond: ((thread_id = v0.id) AND (user_id = 59))
}}}
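For reference, here is a minimal sketch of how such numbers could be collected and compared. The `timed` helper is hypothetical and not necessarily how the figures above were produced. The two constants are the wall-clock times copied from the listings above.

```python
import time


def timed(fn):
    """Run fn() once and return (result, elapsed_seconds).

    Hypothetical helper: with Django one might call
    timed(lambda: list(queryset)) and then print(queryset.explain()).
    """
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


# Wall-clock times copied from the two listings above, in seconds.
time_3_1 = 0.03233041700000072  # Django 3.1.14
time_3_2 = 0.30839854199999195  # Django 3.2.15

# Ratio of the two measurements (3.2.15 relative to 3.1.14).
slowdown = time_3_2 / time_3_1
print(f"3.2.15 / 3.1.14 = {slowdown:.1f}x")
```

The ratio is consistent with the "about 10x" figure quoted for the 1st `Exists`.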
So it appears that the two Django versions do use different strategies,
and that the regression stems from the implementation of the
`exclude(threadentry__user=user)` clause.
I hope this information helps a guru isolate the source of the problem
(SQL is out of my league).
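To make the difference concrete, here is a minimal sketch of the two subquery shapes that the plans seem to correspond to. On 3.1 the `NOT (hashed SubPlan ...)` filter with `thread_id IS NOT NULL` looks like a `NOT IN` subquery, while on 3.2 the `Nested Loop Anti Join` looks like a correlated `NOT EXISTS`. The table and column names come from the plans. The SQL text, the sample rows, the `thread_ids` helper and the use of SQLite instead of PostgreSQL are all assumptions made for illustration, not the SQL Django actually emits.

```python
import sqlite3

# In-memory stand-in for the two tables named in the plans (assumed schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE forum_thread (
        id INTEGER PRIMARY KEY,
        content_id INTEGER NOT NULL
    );
    CREATE TABLE forum_threadentry (
        id INTEGER PRIMARY KEY,
        thread_id INTEGER REFERENCES forum_thread (id),
        user_id INTEGER NOT NULL,
        UNIQUE (thread_id, user_id)
    );
    INSERT INTO forum_thread (id, content_id) VALUES (1, 10), (2, 10), (3, 20);
    INSERT INTO forum_threadentry (thread_id, user_id)
    VALUES (1, 59), (2, 7), (NULL, 59);
""")

# Shape suggested by the 3.1 plan: uncorrelated NOT IN subquery.
NOT_IN_SQL = """
    SELECT v0.id FROM forum_thread v0
    WHERE v0.content_id = ?
      AND NOT (v0.id IN (
          SELECT u1.thread_id FROM forum_threadentry u1
          WHERE u1.user_id = ? AND u1.thread_id IS NOT NULL))
    ORDER BY v0.id
"""

# Shape suggested by the 3.2 plan: correlated NOT EXISTS (anti join).
NOT_EXISTS_SQL = """
    SELECT v0.id FROM forum_thread v0
    WHERE v0.content_id = ?
      AND NOT EXISTS (
          SELECT 1 FROM forum_threadentry u1
          WHERE u1.thread_id = v0.id AND u1.user_id = ?)
    ORDER BY v0.id
"""


def thread_ids(sql, content_id, user_id):
    """Return the thread ids selected by one of the two query shapes."""
    return [row[0] for row in conn.execute(sql, (content_id, user_id))]


not_in_ids = thread_ids(NOT_IN_SQL, 10, 59)
not_exists_ids = thread_ids(NOT_EXISTS_SQL, 10, 59)
print(not_in_ids, not_exists_ids)
```

Both shapes return the same rows on this sample data, so the two versions would differ only in how the database executes them, which is what the two plans show.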
--
Ticket URL: <https://code.djangoproject.com/ticket/33973#comment:1>