Re: Poor Performance running Django unit tests after upgrading from 10.6
On 10/15/20 1:21 AM, Roger Hunwicks wrote:
> I think we have narrowed down the problem to a single, very complex,
> materialized view using CTEs; the unit tests create the test data and
> then refresh the materialized view before executing the actual test
> code.

Have you checked to see if the CTE query is affected by the change to how CTEs are run in release 12? The release notes say:

    Allow common table expressions (CTEs) to be inlined into the outer query
    (Andreas Karlsson, Andrew Gierth, David Fetter, Tom Lane)

    Specifically, CTEs are automatically inlined if they have no side-effects,
    are not recursive, and are referenced only once in the query. Inlining can
    be prevented by specifying MATERIALIZED, or forced for multiply-referenced
    CTEs by specifying NOT MATERIALIZED. Previously, CTEs were never inlined
    and were always evaluated before the rest of the query.

So if you haven't already, start by putting MATERIALIZED before each CTE clause:

    with foo as MATERIALIZED (select ...),
         bar as MATERIALIZED (select ...),
         ...

and see if that changes anything.

cheers

andrew

--
Andrew Dunstan
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
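For concreteness, the MATERIALIZED keyword slots into a materialized view definition roughly as follows (the view, table, and column names here are invented for illustration, not taken from the thread):

    CREATE MATERIALIZED VIEW report_summary AS
    WITH totals AS MATERIALIZED (   -- pre-12 behavior: evaluate the CTE once, as its own step
        SELECT account_id, sum(amount) AS total
        FROM ledger
        GROUP BY account_id
    )
    SELECT a.name, t.total
    FROM accounts a
    JOIN totals t ON t.account_id = a.id;

On 12 and later, dropping MATERIALIZED here lets the planner inline a CTE like "totals" into the outer query, which is exactly the behavior change described in the release notes above.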
Re: Slow Query
Hi all,

Thanks, Michael (and Martin, other thread)! We added those indexes you suggested, and went ahead and added indexes for all our foreign keys. We also added one combination index on notification (user, time). It led to a small constant-factor speedup (2x), but the query is still taking 13+ seconds. :( Still seems aggressively bad.

I've attached the updated, cleaned-up query and EXPLAIN ANALYZE result (the extra chaos was due to the fact that we're using https://hackage.haskell.org/package/esqueleto-3.2.3/docs/Database-Esqueleto.html to generate the SQL). Maybe we're missing some multi-column indexes?

Best,
Parth

On Wed, Oct 14, 2020 at 3:18 PM Michael Lewis wrote:
> Is there no index on thread.spool? What about notification.user? How about
> message.time (without thread as a leading column)? Those would all seem
> very significant. Your row counts are very low to have a query perform so
> badly. work_mem could probably be increased above 4MB, but it isn't hurting
> this query in particular.
>
> My primary concern is that the query is rather chaotic at a glance. It
> would be great to rewrite it and remove the unneeded keywords, double quotes,
> totally worthless parentheses, etc. Something like the below may help you
> see the crux of the query, what could be done, and how many rows might be
> coming out of those subqueries. I re-ordered some joins and there might be
> syntax errors, but give it a shot once you've added the indexes suggested
> above.
>
> SELECT
>     spool.id,
>     handle.handle,
>     spool.name,
>     thread.id,
>     case.closed,
>     notification.read,
>     notification2.time,
>     message.message,
>     message.time,
>     message.author,
>     thread.name,
>     location.geo
> FROM
>     spool
>     JOIN handle ON handle.id = spool.id
>     JOIN thread ON thread.spool = spool.id
>     JOIN message ON message.thread = thread.id
>     LEFT JOIN location ON location.id = spool.location
>     LEFT JOIN case ON case.id = spool.id
>     LEFT JOIN notification ON notification.user = 'b16690e4-a3c5-4868-945e-c2458c27a525'
>         AND notification.id = (
>             SELECT notification3.id
>             FROM notification AS notification3
>             JOIN notification_thread ON notification_thread.id = notification3.id
>             JOIN thread AS thread2 ON thread2.id = notification_thread.thread
>             WHERE thread2.spool = spool.id
>               AND notification3.user = 'b16690e4-a3c5-4868-945e-c2458c27a525'
>               AND notification3.time <= '2020-09-30 16:32:38.054558'
>             ORDER BY notification3.time DESC
>             LIMIT 1
>         )
>     LEFT JOIN notification AS notification2 ON notification2.user = 'b16690e4-a3c5-4868-945e-c2458c27a525'
>         AND notification2.id = (
>             SELECT notification3.id
>             FROM notification AS notification3
>             JOIN notification_thread ON notification_thread.id = notification3.id
>             JOIN thread AS thread2 ON thread2.id = notification_thread.thread
>             WHERE thread2.spool = spool.id
>               AND notification3.user = 'b16690e4-a3c5-4868-945e-c2458c27a525'
>               AND notification3.time > '2020-09-30 16:32:38.054558'
>             ORDER BY notification3.time DESC
>             LIMIT 1
>         )
> WHERE
>     message.time = (
>         SELECT MAX(message2.time)
>         FROM message AS message2
>         JOIN thread AS thread2 ON thread2.id = message2.thread
>         JOIN participant ON participant.thread = thread2.id
>         JOIN identity ON identity.id = participant.identity
>         LEFT JOIN relation ON relation.to = identity.id
>             AND relation.from = 'b16690e4-a3c5-4868-945e-c2458c27a525'
>             AND relation.manages = TRUE
>         WHERE NOT message2.draft
>           AND (identity.id = 'b16690e4-a3c5-4868-945e-c2458c27a525' OR NOT relation.to IS NULL)
>           AND thread2.spool = spool.id
>         LIMIT 1
>     )
>     AND notification.id IS NOT NULL
> ORDER BY
>     message.time DESC
> LIMIT 31;

inbox.sql  Description: Binary data
explain.json  Description: application/json
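For reference, the indexes discussed above would be created along these lines (index names are illustrative; the quoted identifiers assume the column names shown in the query, since "user" and "time" are reserved words):

    CREATE INDEX thread_spool_idx ON thread (spool);
    CREATE INDEX message_time_idx ON message ("time");
    CREATE INDEX notification_user_time_idx ON notification ("user", "time");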
Re: Poor Performance running Django unit tests after upgrading from 10.6
Roger Hunwicks writes:
> ...
> However, even though we have similar performance for 12.4 for most
> test runs, it remains very variable.
> ...
> I think we have narrowed down the problem to a single, very complex,
> materialized view using CTEs; the unit tests create the test data and
> then refresh the materialized view before executing the actual test
> code.

In addition to others' nearby comments, I'd suggest that running all this under auto_explain would be informative. You evidently are not getting a stable plan for your troublesome query, so you need to see what the range of plans is, not just probe it once with a manual EXPLAIN.

			regards, tom lane
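A sketch of enabling auto_explain for a test session (the threshold is illustrative; the module can equally be configured server-wide in postgresql.conf via session_preload_libraries):

    LOAD 'auto_explain';
    SET auto_explain.log_min_duration = '250ms';   -- log plans for anything slower than this
    SET auto_explain.log_analyze = on;             -- include actual row counts and timings
    SET auto_explain.log_nested_statements = on;   -- also log statements that run nested inside other commands

With that in place, each slow execution logs the plan it actually used, so plan instability shows up directly in the server log across test runs.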
CPU Consuming query. Sequential scan despite indexing.
Hi,
The query below always shows up at the top of our CPU usage reports. Also, despite having
indexes, it does sequential scans (probably because the WHERE condition matches
almost all of the data in the table). This query runs on the default landing
page in the application and needs to fetch records in less than 100 ms without
consuming too much CPU.
Any opinions? The table is very large, and due to referential integrity and
business requirements we could not implement partitioning either.
There is an index on (countrycode, facilitycode, jobstartdatetime).
explain (analyze, buffers)
with JobCount as (
    select jobstatuscode, count(1) stat_count
    from job j
    where 1=1
      and j.countrycode = 'TH'
      and j.facilitycode in ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
      and ((j.jobstartdatetime between '2020-08-01 00:00:00' and '2020-09-30 00:00:00') or j.jobstartdatetime IS NULL)
    group by j.jobstatuscode
)
select js.jobstatuscode, COALESCE(stat_count, 0) stat_count
from JobCount jc
right outer join jobstatus js on jc.jobstatuscode = js.jobstatuscode;
                                    QUERY PLAN
--------------------------------------------------------------------------------
 Hash Right Join  (cost=98845.93..98846.10 rows=10 width=12) (actual time=1314.809..1314.849 rows=10 loops=1)
   Hash Cond: ((jc.jobstatuscode)::text = (js.jobstatuscode)::text)
   Buffers: shared hit=21314 read=3231
   I/O Timings: read=19.867
   CTE jobcount
     ->  Finalize GroupAggregate  (cost=98842.93..98844.71 rows=7 width=12) (actual time=1314.780..1314.802 rows=6 loops=1)
           Group Key: j.jobstatuscode
           Buffers: shared hit=21313 read=3231
           I/O Timings: read=19.867
           ->  Gather Merge  (cost=98842.93..98844.57 rows=14 width=12) (actual time=1314.766..1314.857 rows=18 loops=1)
                 Workers Planned: 2
                 Workers Launched: 2
                 Buffers: shared hit=60102 read=11834
                 I/O Timings: read=59.194
                 ->  Sort  (cost=97842.91..97842.93 rows=7 width=12) (actual time=1305.044..1305.047 rows=6 loops=3)
                       Sort Key: j.jobstatuscode
                       Sort Method: quicksort  Memory: 25kB
                       Worker 0:  Sort Method: quicksort  Memory: 25kB
                       Worker 1:  Sort Method: quicksort  Memory: 25kB
                       Buffers: shared hit=60102 read=11834
                       I/O Timings: read=59.194
                       ->  Partial HashAggregate  (cost=97842.74..97842.81 rows=7 width=12) (actual time=1305.010..1305.013 rows=6 loops=3)
                             Group Key: j.jobstatuscode
                             Buffers: shared hit=60086 read=11834
                             I/O Timings: read=59.194
                             ->  Parallel Seq Scan on job j  (cost=0.00..96837.93 rows=200963 width=4) (actual time=13.010..1144.434 rows=163200 loops=3)
                                   Filter: (((countrycode)::text = 'TH'::text) AND (((jobstartdatetime >= '2020-08-01 00:00:00'::timestamp without time zone) AND (jobstartdatetime <= '2020-09-30 00:00:00'::timestamp without time zone)) OR (jobstartdatetime IS NULL)) AND ((facilitycode)::text = ANY ('{THNPM1,THPRK1,THCNT1,THSPN1,THKRI1,THPKN1,THSBI1,THUTG1,THLRI1,THSRI1,THSUR1,THSKM1}'::text[])))
                                   Rows Removed by Filter: 449035
                                   Buffers: shared hit=60086 read=11834
                                   I/O Timings: read=59.194
   ->  CTE Scan on jobcount jc  (cost=0.00..0.14 rows=7 width=24) (actual time=1314.784..1314.811 rows=6 loops=1)
         Buffers: shared hit=21313 read=3231
         I/O Timings: read=19.867
   ->  Hash  (cost=1.10..1.10 rows=10 width=4) (actual time=0.014..0.015 rows=10 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 9kB
         Buffers: shared hit=1
         ->  Seq Scan on jobstatus js  (cost=0.00..1.10 rows=10 width=4) (actual time=0.005..0.008 rows=10 loops=1)
               Buffers: shared hit=1
 Planning Time: 0.949 ms
 Execution Time: 1314.993 ms
(40 rows)
Regards,
Aditya.
Re: CPU Consuming query. Sequential scan despite indexing.
What version, by the way?

Do you get a faster execution if you disable sequential scan? Or set parallel workers per gather to 0? Your estimates look decent, as do cache hits, so other than caching data or upgrading hardware, not sure what else there is to be done.

Although... you are hitting 70k blocks to read only 612k rows? Are these job records very wide perhaps, or do you need to do some vacuuming? Perhaps autovacuum is not keeping up and you could use some repacking or vacuum full if/when you can afford downtime. If you create a temp table copy of the job table, how does the size compare to the live table?
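A sketch of the experiments suggested above (all session-local and reversible; the table name is the one from the thread):

    SET enable_seqscan = off;                  -- does the planner choose the index at all?
    SET max_parallel_workers_per_gather = 0;   -- how does the non-parallel plan compare?
    -- ... re-run the EXPLAIN (ANALYZE, BUFFERS) from above under each setting ...
    RESET enable_seqscan;
    RESET max_parallel_workers_per_gather;

    -- Rough bloat check: compare the live table with a freshly written copy.
    CREATE TEMP TABLE job_copy AS SELECT * FROM job;
    SELECT pg_size_pretty(pg_relation_size('job'))      AS live_size,
           pg_size_pretty(pg_relation_size('job_copy')) AS copy_size;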
