Query Performance
New to Postgres, coming from an Oracle background. With Oracle the amount of
work a query does is tracked via logical reads, and Oracle counts logical and
physical reads differently than Postgres: a physical read is always also
counted as a logical read. So if a query reads 5 blocks and all 5 come from
disk, the query does 5 logical reads and 5 physical reads. It appears that
with Postgres, "Buffers: shared hit" counts reads from memory and "shared
read" counts reads from disk, so to get total reads one would need to add up
shared hits + shared reads.
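For readers following along, a minimal sketch of how those counters appear in
EXPLAIN output (the count(*) query is hypothetical; the table and buffer
numbers are taken from this thread, and the annotations are not part of
Postgres output):

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM history_event
WHERE prism_guid = 'i0acc051c0133f57f4be2bbbdc5ef';
-- Buffers: shared hit=171 read=880
--   shared hit  : pages found already in shared_buffers (memory)
--   shared read : pages fetched from outside shared_buffers (OS cache or disk)
--   total pages touched = 171 + 880 = 1051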
I have a sample query that is doing more work if some of the reads are physical
reads and I'm trying to understand why. If you look at attached
QueryWithPhyReads.txt it shows the query did Buffers: shared hit=171 read=880.
So it did 171 + 880 = 1051 total block reads (some logical, some physical).
QueryWithNoPhyReads.txt shows execution statistics for the exact same query
with the same data point. The only difference is that the first execution
loaded blocks into memory, so this execution had all shared hits. In this case
the query did this much work: Buffers: shared hit=581.
With Oracle that would not happen. If the 2nd execution of the query did all
reads from memory the shared hits would be 1051, not 581.
So it appears to me that with Postgres, when a query does physical reads it
not only pays the cost of those disk reads but also does extra work that
increases the query's overall block reads. But I don't understand why that
would be the case. Could someone explain why this is happening?
Thanks
explain (analyze, verbose, costs, settings, buffers, timing, summary, hashes)
(/*+ IndexScan(historyeve0_ history_event_idx02) */ select distinct
historyeve0_.EVENT_TYPE as junk123
from HISTORY_EVENT historyeve0_
where historyeve0_.PRISM_GUID='i0acc051c0133f57f4be2bbbdc5ef'
and historyeve0_.IS_DELETED=0
and (historyeve0_.EVENT_TYPE not in
('versionsSearchWithinQueryEvent','notesOfDecisionsSearchWithinQueryEvent','citingReferencesSearchWithinQueryEvent','tocSearchWithinQueryEvent','searchWithinDocumentEvent'
))
and (historyeve0_.PRODUCT_VIEW in
('DefaultProductView','TNPPlus','PLCUS','Indigo','INDIGOCA' )
or historyeve0_.PRODUCT_VIEW is null)
and historyeve0_.PRODUCT_SID='WestlawNext'
order by historyeve0_.EVENT_TYPE asc);
Sort  (cost=12350.30..12350.80 rows=200 width=13) (actual time=11.686..11.693 rows=6 loops=1)
  Output: historyeve0_.event_type
  Sort Key: historyeve0_.event_type
  Sort Method: quicksort  Memory: 25kB
  Buffers: shared hit=581
  ->  HashAggregate  (cost=12340.65..12342.65 rows=200 width=13) (actual time=11.676..11.683 rows=6 loops=1)
        Output: historyeve0_.event_type
        Group Key: historyeve0_.event_type
        Buffers: shared hit=581
        ->  Append  (cost=0.55..12334.42 rows=2492 width=13) (actual time=0.035..10.759 rows=5566 loops=1)
              Buffers: shared hit=581
              ->  Index Scan using history_event_display_timesta_prism_guid_display_timestamp__idx on hist2.history_event_display_timestamp_20200601 historyeve0_  (cost=0.55..1758.76 rows=362 width=13) (actual time=0.035..1.857 rows=1020 loops=1)
                    Output: historyeve0_.event_type
                    Index Cond: (((historyeve0_.prism_guid)::text = 'i0acc051c0133f57f4be2bbbdc5ef'::text) AND ((historyeve0_.product_sid)::text = 'WestlawNext'::text) AND (historyeve0_.is_deleted = '0'::numeric))
                    Filter: (((historyeve0_.event_type)::text <> ALL ('{versionsSearchWithinQueryEvent,notesOfDecisionsSearchWithinQueryEvent,citingReferencesSearchWithinQueryEvent,tocSearchWithinQueryEvent,searchWithinDocumentEvent}'::text[])) AND (((historyeve0_.product_view)::text = ANY ('{DefaultProductView,TNPPlus,PLCUS,Indigo,INDIGOCA}'::text[])) OR (historyeve0_.product_view IS NULL)))
                    Rows Removed by Filter: 5
                    Buffers: shared hit=85
              ->  Index Scan using history_event_display_timesta_prism_guid_display_timestamp_idx1 on hist2.history_event_display_timestamp_20200701 historyeve0__1  (cost=0.55..1351.59 rows=276 width=13) (actual time=0.029..1.435 rows=800 loops=1)
                    Output: historyeve0__1.event_type
                    Index Cond: (((historyeve0__1.prism_guid)::text = 'i0acc051c0133f57f4be2bbbdc5ef'::text) AND ((historyeve0__1.product_sid)::text = 'WestlawNext'::text) AND (historyeve0__1.is_deleted = '0'::numeric))
                    Filter: (((historyeve0__1.event_type)::text <> ALL ('{versionsSearchWithinQuery
Re: Query Performance
"Dirschel, Steve" writes: > I have a sample query that is doing more work if some of the reads are > physical reads and I'm trying to understand why. If you look at attached > QueryWithPhyReads.txt it shows the query did Buffers: shared hit=171 > read=880. So it did 171 + 880 = 1051 total block reads (some logical, some > physical). QueryWithNoPhyReads.txt shows execution statistics of the > execution of the exact same query with same data point. The only difference > is the first execution loaded blocks into memory so this execution had all > shared hits. In this case the query did this much work: Buffers: shared > hit=581. You haven't provided a lot of context for this observation, but I can think of at least one explanation for the discrepancy. If the first query was the first access to these tables after a bunch of updates, it would have been visiting a lot of now-dead row versions. It would then have marked the corresponding index entries dead, resulting in the second execution not having to visit as many heap pages. regards, tom lane
Big performance slowdown from 11.2 to 13.3
Hello,
We have a data warehouse running on Postgres V11.2. We have a pretty beefy
query that has been taking under 5 minutes to run consistently every day for
about 2 years as part of a data warehouse ETL process. It's a pivot over 55
values on a table with some 15M rows. The total table size is over 2GB
(table + indices + other).
CREATE TABLE assessmenticcqa_raw
(
    iccqar_iccassmt_fk integer NOT NULL,                                             -- foreign key to assessment
    iccqar_ques_code   character varying(255) COLLATE pg_catalog."default" NOT NULL, -- question code
    iccqar_ans_val     character varying(255) COLLATE pg_catalog."default" NOT NULL, -- answer value
    "lastUpdated"      timestamp with time zone NOT NULL DEFAULT now(),
    CONSTRAINT fk_assessmenticcqa_raw_assessment FOREIGN KEY (iccqar_iccassmt_fk)
        REFERENCES assessmenticc_fact (iccassmt_pk) MATCH SIMPLE
        ON UPDATE CASCADE
        ON DELETE RESTRICT
)
TABLESPACE pg_default;

CREATE UNIQUE INDEX assessmenticcqa_raw_idx_iccqar_assmt_ques
    ON assessmenticcqa_raw USING btree
    (iccqar_iccassmt_fk ASC NULLS LAST,
     iccqar_ques_code COLLATE pg_catalog."default" ASC NULLS LAST)
    TABLESPACE pg_default;

CREATE INDEX assessmenticcqa_raw_idx_iccqar_lastupdated
    ON assessmenticcqa_raw USING btree
    ("lastUpdated" ASC NULLS LAST)
    TABLESPACE pg_default;
The query that does the pivot is:
WITH t AS (
    SELECT assessmenticcqa_raw.iccqar_iccassmt_fk AS iccqa_iccassmt_fk,
           assessmenticcqa_raw.iccqar_ques_code,
           max(assessmenticcqa_raw.iccqar_ans_val::text) AS iccqar_ans_val
    FROM assessmenticcqa_raw
    WHERE assessmenticcqa_raw.iccqar_ques_code::text = ANY
          (ARRAY['DEBRIDEMENT DATE'::character varying::text,
                 'DEBRIDEMENT THIS VISIT'::character varying::text,
                 'DEBRIDEMENT TYPE'::character varying::text,
                 'DEPTH (CM)'::character varying::text,
                 'DEPTH DESCRIPTION'::character varying::text,
                 ... 55 total columns to pivot
                ])
    GROUP BY assessmenticcqa_raw.iccqar_iccassmt_fk,
             assessmenticcqa_raw.iccqar_ques_code
)
SELECT t.iccqa_iccassmt_fk,
       max(t.iccqar_ans_val) AS iccqar_ans_val,
       tilda.todate(max(t.iccqar_ans_val) FILTER (WHERE t.iccqar_ques_code::text = 'DEBRIDEMENT DATE'::text)::character varying, NULL::date) AS "iccqa_DEBRIDEMENT_DATE",
       max(t.iccqar_ans_val) FILTER (WHERE t.iccqar_ques_code::text = 'DEBRIDEMENT THIS VISIT'::text) AS "iccqa_DEBRIDEMENT_THIS_VISIT",
       max(t.iccqar_ans_val) FILTER (WHERE t.iccqar_ques_code::text = 'DEBRIDEMENT TYPE'::text) AS "iccqa_DEBRIDEMENT_TYPE",
       tilda.tofloat(max(t.iccqar_ans_val) FILTER (WHERE t.iccqar_ques_code::text = 'DEPTH (CM)'::text)::character varying, NULL::real) AS "iccqa_DEPTH_CM",
       max(t.iccqar_ans_val) FILTER (WHERE t.iccqar_ques_code::text = 'DEPTH DESCRIPTION'::text) AS "iccqa_DEPTH_DESCRIPTION",
       ... 55 total columns being pivoted
FROM t
GROUP BY t.iccqa_iccassmt_fk;
This query has been working flawlessly, without so much as a glitch, every day
for the last 2 years or so, with of course an increasing amount of data every
day (the data grows at about 15-20 thousand records per day). I know the query
is not incremental, but at under 5 minutes it's simple, works well, and can
handle inconsistent updates on the data source we use, which is pretty dirty.
The problem I am facing is that we are trying to move to Postgres V13.3, and
this query (and several others like it) is now taking 10x longer (3,000
seconds vs 300 seconds), which makes it completely unacceptable. I created a
V13 instance following standard practices with pg_upgrade. I have V11 and V13
running side by side on the exact same hardware: the VM is an 8-core (16
threads), 64GB, Windows Server 2012 R2 machine with SSD storage. I have run
VACUUM (FULL, FREEZE, ANALYZE) on both the V11 and V13 databases. The V13
database is an exact copy of the V11 database content-wise. The
postgresql.conf is the same too and hasn't been touched in years:
"effective_cache_size": "52GB",
"from_collapse_limit": "24",
"jit": "off",
"jit_above_cost": "2e+08",
"jit_inline_above_cost": "5e+08",
"jit_optimize_above_cost": "5e+08",
"join_collapse_limit": "24",
"max_parallel_workers": "20",
"max_parallel_workers_per_gather": "8",
"random_page_cost": "1.1",
"temp_buffers": "4GB",
"work_mem": "384MB"
I have done all my testing with one database up while the other was shut down,
to make sure there wasn't any weird interaction. I have read some articles
about major changes between
Re: Big performance slowdown from 11.2 to 13.3
On Wed, Jul 21, 2021 at 06:50:58PM +, [email protected] wrote:
> The plans are pretty much identical too. I checked line by line and couldn't
> see anything much different (note that I have a view over this query). Here
> is the V13 version of the plan:
>
> I am out of my wits as to what is causing such a massive slowdown and how I
> could fix it.
>
> Any idea out there?

Could you send the "explain (analyze,buffers,settings)" for the query on the
v11 and v13 instances? Or a link to the execution plan pasted into
explain.depesz.com.

https://wiki.postgresql.org/wiki/Slow_Query_Questions#EXPLAIN_.28ANALYZE.2C_BUFFERS.29.2C_not_just_EXPLAIN

It might be good to check using a copy of your data that there's no regression
between 11.2 and 11.12.

-- Justin
Re: Big performance slowdown from 11.2 to 13.3
I'm not seeing the value of the CTE. Why not access assessmenticcqa_raw
directly in the main query and only do GROUP BY once? Do you have many values
in iccqar_ques_code which are not used in this query?
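For reference, a sketch of the folded form being suggested, with only two of
the 55 pivot columns shown (this is not the poster's actual rewrite; the
remaining FILTERed columns and question codes would be added the same way):

SELECT r.iccqar_iccassmt_fk,
       max(r.iccqar_ans_val::text) AS iccqar_ans_val,
       max(r.iccqar_ans_val::text) FILTER (WHERE r.iccqar_ques_code::text = 'DEBRIDEMENT THIS VISIT') AS "iccqa_DEBRIDEMENT_THIS_VISIT",
       max(r.iccqar_ans_val::text) FILTER (WHERE r.iccqar_ques_code::text = 'DEBRIDEMENT TYPE') AS "iccqa_DEBRIDEMENT_TYPE"
       -- ... one FILTERed max per remaining question code
FROM assessmenticcqa_raw r
WHERE r.iccqar_ques_code::text = ANY (ARRAY['DEBRIDEMENT THIS VISIT', 'DEBRIDEMENT TYPE'])  -- full 55-value list in practice
GROUP BY r.iccqar_iccassmt_fk;

Because the maximum of per-question maximums equals the maximum taken
directly, this produces the same result as the two-level aggregate while
grouping only once.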
RE: Big performance slowdown from 11.2 to 13.3
-----Original Message-----
From: Justin Pryzby
Sent: Wednesday, July 21, 2021 15:15
To: [email protected]
Cc: [email protected]
Subject: Re: Big performance slowdown from 11.2 to 13.3

> Could you send the "explain (analyze,buffers,settings)" for the query on the
> v11 and v13 instances? Or a link to the execution plan pasted into
> explain.depesz.com.

My apologies... I thought this is what I had attached in my original email
from PGADMIN. In any case, I reran from the command line and here are the two
plans.

V11.2 explain (analyze,buffers,COSTS,TIMING)

HashAggregate  (cost=1758361.62..1758372.62 rows=200 width=1260) (actual time=80545.907..161176.867 rows=720950 loops=1)
  Group Key: t.iccqa_iccassmt_fk
  Buffers: shared hit=8 read=170093 written=23, temp written=82961
  CTE t
    ->  HashAggregate  (cost=1338668.50..1352428.93 rows=1376043 width=56) (actual time=23669.075..32038.977 rows=13821646 loops=1)
          Group Key: assessmenticcqa_raw.iccqar_iccassmt_fk, assessmenticcqa_raw.iccqar_ques_code
          Buffers: shared read=170084 written=23
          ->  Seq Scan on assessmenticcqa_raw  (cost=0.00..1236517.01 rows=13620198 width=38) (actual time=0.081..10525.487 rows=13821646 loops=1)
                Filter: ((iccqar_ques_code)::text = ANY ('{"DEBRIDEMENT DATE","DEBRIDEMENT THIS VISIT","DEBRIDEMENT TYPE","DEPTH (CM)","DEPTH DESCRIPTION","DOES PATIENT HAVE PAIN ASSOCIATED WITH THIS WOUND?","DRAIN PRESENT","DRAIN TYPE","EDGE / SURROUNDING TISSUE - MACERATION",EDGES,EPITHELIALIZATION,"EXUDATE AMOUNT","EXUDATE TYPE","GRANULATION TISSUE","INDICATE OTHER TYPE OF WOUND CLOSURE","INDICATE TYPE","INDICATE WOUND CLOSURE","IS THIS A CLOSED SURGICAL WOUND OR SUSPECTED DEEP TISSUE INJURY?","LENGTH (CM)","MEASUREMENTS TAKEN","NECROTIC TISSUE AMOUNT","NECROTIC TISSUE TYPE",ODOR,"OTHER COMMENTS REGARDING DEBRIDEMENT TYPE","OTHER COMMENTS REGARDING DRAIN TYPE","OTHER COMMENTS REGARDING PAIN INTERVENTIONS","OTHER COMMENTS REGARDING PAIN QUALITY","OTHER COMMENTS REGARDING REASON MEASUREMENTS NOT TAKEN","PAIN FREQUENCY","PAIN INTERVENTIONS","PAIN QUALITY","PERIPHERAL TISSUE EDEMA","PERIPHERAL TISSUE INDURATION","REASON MEASUREMENTS NOT TAKEN","RESPONSE TO PAIN INTERVENTIONS",SHAPE,"SIGNS AND SYMPTOMS OF INFECTION","SKIN COLOR SURROUNDING WOUND",STATE,"SURFACE AREA (SQ CM)","TOTAL NECROTIC TISSUE ESCHAR","TOTAL NECROTIC TISSUE SLOUGH",TUNNELING,"TUNNELING SIZE(CM)/LOCATION - 12 - 3 O''CLOCK","TUNNELING SIZE(CM)/LOCATION - 3 - 6 O''CLOCK","TUNNELING SIZE(CM)/LOCATION - 6 - 9 O''CLOCK","TUNNELING SIZE(CM)/LOCATION - 9 - 12 O''CLOCK",UNDERMINING,"UNDERMINING SIZE(CM)/LOCATION - 12 - 3 O''CLOCK","UNDERMINING SIZE(CM)/LOCATION - 3 - 6 O''CLOCK","UNDERMINING SIZE(CM)/LOCATION - 6 - 9 O''CLOCK","UNDERMINING SIZE(CM)/LOCATION - 9 - 12 O''CLOCK","WIDTH (CM)","WOUND PAIN LEVEL, WHERE 0 = \"NO PAIN\" AND 10 = \"WORST POSSIBLE PAIN\""}'::text[]))
                Rows Removed by Filter: 169940
                Buffers: shared read=170084 written=23
  ->  CTE Scan on t  (cost=0.00..27520.86 rows=1376043 width=552) (actual time=23669.081..39393.726 rows=13821646 loops=1)
        Buffers: shared read=170084 written=23, temp written=82961
Planning Time: 6.160 ms
Execution Time: 161942.304 ms

V13.3 explain (analyze,buffers,COSTS,TIMING,SETTINGS)

HashAggregate  (cost=1774568.21..1774579.21 rows=200 width=1260) (actual time=81053.437..1699800.741 rows=722853 loops=1)
  Group Key: assessmenticcqa_raw.iccqar_iccassmt_fk
  Batches: 5  Memory Usage: 284737kB  Disk Usage: 60kB
  Buffers: shared hit=20 read=169851, temp read=185258 written=305014
  ->  HashAggregate  (cost=1360804.75..1374830.63 rows=1402588 width=56) (actual time=24967.655..47587.401 rows=13852618 loops=1)
        Group Key: assessmenticcqa_raw.iccqar_iccassmt_fk, assessmenticcqa_raw.iccqar_ques_code
        Batches: 21  Memory Usage: 393273kB  Disk Usage: 683448kB
        Buffers: shared read=169851, temp read=110477 written=174216
        ->  Seq Scan on assessmenticcqa_raw  (cost=0.00..1256856.62 rows=13859750 width=38) (actual time=0.104..12406.726 rows=13852618 loops=1)
              Filter: ((iccqar_ques_code)::text = ANY ('{"DEBRIDEMENT DATE","DEBRIDEMENT THIS VISIT","DEBRIDEMENT TYPE","DEPTH (CM)","DEPTH DES
RE: Big performance slowdown from 11.2 to 13.3
From: Michael Lewis
Sent: Wednesday, July 21, 2021 18:12
To: [email protected]
Cc: [email protected]
Subject: Re: Big performance slowdown from 11.2 to 13.3

> I'm not seeing the value of the CTE. Why not access assessmenticcqa_raw
> directly in the main query and only do GROUP BY once? Do you have many
> values in iccqar_ques_code which are not used in this query?

Yes, there are close to 600 different values, and we are picking up only a
small number of them. And by the way, this is a classic case where the query
could be folded as a sub-select in a join clause, and I tried this as well,
with the same results.

Thank you, Laurent.
Re: Big performance slowdown from 11.2 to 13.3
On Wed, Jul 21, 2021 at 4:19 PM [email protected] wrote:
> As you can see, the V13.3 execution is about 10x slower.
>
> It may be hard for me to create a whole copy of the database on 11.12 and
> check that environment by itself. I'd want to do it on the same machine to
> control variables, and I don't have much extra disk space at the moment.

I imagine that this has something to do with the fact that the hash aggregate
spills to disk in Postgres 13. You might try increasing hash_mem_multiplier
from its default of 1.0, to 2.0 or even 4.0. That way you'd be able to use 2x
or 4x more memory for executor nodes that are based on hashing (hash join and
hash aggregate), without also affecting other kinds of nodes, which are
typically much less sensitive to memory availability. This is very similar to
increasing work_mem, except that it is better targeted.

It might even make sense to *decrease* work_mem and increase
hash_mem_multiplier even further than 4.0. That approach is more aggressive,
though, so I wouldn't use it until it actually proved necessary.

-- Peter Geoghegan
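A sketch of trying that suggestion at the session level before touching
postgresql.conf (the specific values are illustrative, not a recommendation):

-- Give hash-based nodes (hash join, hash aggregate) a larger budget than
-- plain work_mem, without raising work_mem itself.
SET work_mem = '384MB';
SET hash_mem_multiplier = 4.0;   -- default is 1.0 in v13 (raised to 2.0 in v15)
-- Re-run the pivot query under EXPLAIN (ANALYZE, BUFFERS) and compare.

-- Once a good value is found it can be made permanent:
-- ALTER SYSTEM SET hash_mem_multiplier = 4.0;
-- SELECT pg_reload_conf();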
Re: Big performance slowdown from 11.2 to 13.3
"[email protected]" writes: > My apologies... I thought this is what I had attached in my original email > from PGADMIN. In any case, I reran from the command line and here are the two > plans. So the pain seems to be coming in with the upper hash aggregation, which is spilling to disk because work_mem of '384MB' is nowhere near enough. The v11 explain doesn't show any batching there, which makes me suspect that it was using a larger value of work_mem. (There could also be some edge effect that is making v13 use a bit more memory for the same number of tuples, which could lead it to spill when v11 had managed to scrape by without doing so.) So the first thing I'd try is seeing if setting work_mem to 1GB or so improves matters. The other thing that's notable is that v13 has collapsed out the CTE that used to sit between the two levels of hashagg. Now I don't know of any reason that that wouldn't be a strict improvement, but if the work_mem theory doesn't pan out then that's something that'd deserve a closer look. Does marking the WITH as WITH MATERIALIZED change anything about v13's performance? regards, tom lane
RE: Big performance slowdown from 11.2 to 13.3
-----Original Message-----
From: Peter Geoghegan
Sent: Wednesday, July 21, 2021 19:34
To: [email protected]
Cc: Justin Pryzby; [email protected]
Subject: Re: Big performance slowdown from 11.2 to 13.3

> I imagine that this has something to do with the fact that the hash
> aggregate spills to disk in Postgres 13. You might try increasing
> hash_mem_multiplier from its default of 1.0, to 2.0 or even 4.0.

So how is this happening? I mean, it's the exact same query, looks like the
same plan to me, it's the same data on the exact same VM, etc... Why is that
behavior so different?

As soon as I can, I'll check if perhaps the hash_mem_multiplier is somehow set
differently between the two setups. That would be my first guess, but absent
that, it looks like very different behavior across those 2 versions.

Thank you, Laurent.
Re: Big performance slowdown from 11.2 to 13.3
"[email protected]" writes: > From: Peter Geoghegan >> I imagine that this has something to do with the fact that the hash >> aggregate spills to disk in Postgres 13. > So how is this happening? I mean, it's the exact same query, looks like the > same plan to me, it's the same data on the exact same VM etc... Why is that > behavior so different? What Peter's pointing out is that v11 never spilled hashagg hash tables to disk period, no matter how big they got (possibly leading to out-of-memory situations or swapping, but evidently you have enough RAM to have avoided that sort of trouble). I'd momentarily forgotten that, but I think he's dead on about that explaining the difference. As he says, messing with hash_mem_multiplier would be a more targeted fix than increasing work_mem across the board. regards, tom lane
RE: Big performance slowdown from 11.2 to 13.3
-----Original Message-----
From: Tom Lane
Sent: Wednesday, July 21, 2021 19:36
To: [email protected]
Cc: Justin Pryzby; [email protected]
Subject: Re: Big performance slowdown from 11.2 to 13.3

> So the pain seems to be coming in with the upper hash aggregation, which is
> spilling to disk because work_mem of '384MB' is nowhere near enough. [...]
> Does marking the WITH as WITH MATERIALIZED change anything about v13's
> performance?

Hello Tom (and Peter)!

Thanks for all this info. I created 3 versions of this query: CTE
MATERIALIZED, CTE NOT MATERIALIZED, and no CTE (select directly in a sub
join). Only a very minor change in the final execution time (seconds).

I'll try the following later this evening:
- set work_mem to 1GB
- play with hash_mem_multiplier as per Peter's suggestions, although he did
  suggest trying to be more aggressive with it and lowering work_mem... so
  I'll play with those 2 variables.

Thank you, Laurent.
RE: Big performance slowdown from 11.2 to 13.3
-----Original Message-----
From: Tom Lane
Sent: Wednesday, July 21, 2021 19:43
To: [email protected]
Cc: Peter Geoghegan; Justin Pryzby; [email protected]
Subject: Re: Big performance slowdown from 11.2 to 13.3

> What Peter's pointing out is that v11 never spilled hashagg hash tables to
> disk, period, no matter how big they got. [...] As he says, messing with
> hash_mem_multiplier would be a more targeted fix than increasing work_mem
> across the board.

OK, got it! That sounds and smells good. Will try later tonight or tomorrow
and report back.

Thank you! Laurent.
RE: Big performance slowdown from 11.2 to 13.3
-----Original Message-----
From: [email protected]
Sent: Wednesday, July 21, 2021 19:46
To: Tom Lane
Cc: Peter Geoghegan; Justin Pryzby; [email protected]
Subject: RE: Big performance slowdown from 11.2 to 13.3

> OK, got it! That sounds and smells good. Will try later tonight or tomorrow
> and report back.

Hello all,

Seems like no cigar ☹ See the plan pasted below. I changed the conf as
follows:
- hash_mem_multiplier = '2'
- work_mem = '1GB'

I tried a few other configurations, i.e., 512MB/4, 256MB/8, with similar
results.

Also, you mentioned previously that the hash was spilling to disk? How are you
seeing this in the plans? What should I be looking for on my end when playing
around with parameters to see the intended effect?

Thank you, Laurent.
HashAggregate  (cost=1774568.21..1774579.21 rows=200 width=1260) (actual time=70844.078..1554843.323 rows=722853 loops=1)
  Group Key: assessmenticcqa_raw.iccqar_iccassmt_fk
  Batches: 1  Memory Usage: 1277985kB
  Buffers: shared hit=14 read=169854, temp read=15777 written=27588
  ->  HashAggregate  (cost=1360804.75..1374830.63 rows=1402588 width=56) (actual time=23370.026..33839.347 rows=13852618 loops=1)
        Group Key: assessmenticcqa_raw.iccqar_iccassmt_fk, assessmenticcqa_raw.iccqar_ques_code
        Batches: 5  Memory Usage: 2400305kB  Disk Usage: 126560kB
        Buffers: shared read=169851, temp read=15777 written=27588
        ->  Seq Scan on assessmenticcqa_raw  (cost=0.00..1256856.62 rows=13859750 width=38) (actual time=0.072..10906.894 rows=13852618 loops=1)
              Filter: ((iccqar_ques_code)::text = ANY ('{"DEBRIDEMENT DATE","DEBRIDEMENT THIS VISIT","DEBRIDEMENT TYPE","DEPTH (CM)","DEPTH DESCRIPTION","DOES PATIENT HAVE PAIN ASSOCIATED WITH THIS WOUND?","DRAIN PRESENT","DRAIN TYPE","EDGE / SURROUNDING TISSUE - MACERATION",EDGES,EPITHELIALIZATION,"EXUDATE AMOUNT","EXUDATE TYPE","GRANULATION TISSUE","INDICATE OTHER TYPE OF WOUND CLOSURE","INDICATE TYPE","INDICATE WOUND CLOSURE","IS THIS A CLOSED SURGICAL WOUND OR SUSPECTED DEEP TISSUE INJURY?","LENGTH (CM)","MEASUREMENTS TAKEN","NECROTIC TISSUE AMOUNT","NECROTIC TISSUE TYPE",ODOR,"OTHER COMMENTS REGARDING DEBRIDEMENT TYPE","OTHER COMMENTS REGARDING DRAIN TYPE","OTHER COMMENTS REGARDING PAIN INTERVENTIONS","OTHER COMMENTS REGARDING PAIN QUALITY","OTHER COMMENTS REGARDING REASON MEASUREMENTS NOT TAKEN","PAIN FREQUENCY","PAIN INTERVENTIONS","PAIN QUALITY","PERIPHERAL TISSUE EDEMA","PERIPHERAL TISSUE INDURATION","REASON MEASUREMENTS NOT TAKEN","RESPONSE TO PAIN INTERVENTIONS",SHAPE,"SIGNS AND SYMPTOMS OF INFECTION","SKIN COLOR SURROUNDING WOUND",STATE,"SURFACE AREA (SQ CM)","TOTAL NECROTIC TISSUE ESCHAR","TOTAL NECROTIC TISSUE SLOUGH",TUNNELING,"TUNNELING SIZE(CM)/LOCATION - 12 - 3 O''CLOCK","TUNNELING SIZE(CM)/LOCATION - 3 - 6 O''CLOCK","TUNNELING SIZE(CM)/LOCATION - 6 - 9 O''CLOCK","TUNNELING SIZE(CM)/LOCATION - 9 - 12 O''CLOCK",UNDERMINING,"UNDERMINING SIZE(CM)/LOCATION - 12 - 3 O''CLOCK","UNDERMINING SIZE(CM)/LOCATION - 3 - 6 O''CLOCK","UNDERMINING SIZE(CM)/LOCATION - 6 - 9 O''CLOCK","UNDERMINING SIZE(CM)/LOCATION - 9 - 12 O''CLOCK","WIDTH (CM)","WOUND PAIN LEVEL, WHERE 0 = \"NO PAIN\" AND 10 = \"WORST POSSIBLE PAIN\""}'::text[]))
              Rows Removed by Filter: 171680
              Buffers: shared read=169851
Settings: effective_cache_size = '52GB', from_collapse_limit = '24', hash_mem_multiplier = '2', jit = 'off', jit_above_cost = '2e+08', jit_inline_above_cost = '5e+08', jit_optimize_above_cost = '5e+08', join_collapse_limit = '24', max_parallel_workers = '20', max_parallel_workers_per_gather = '8', random_page_cost = '1.1', temp_buffers = '4GB', work_mem = '1GB'
Planning:
  Buffers: shared hit=186 read=37
Planning Time: 3.667 ms
Execution Time: 1555300.746 ms
Re: Big performance slowdown from 11.2 to 13.3
On Thu, 22 Jul 2021 at 16:37, [email protected] wrote:
> Seems like no cigar ☹ See the plan pasted below. I changed the conf as
> follows:
> - hash_mem_multiplier = '2'
> - work_mem = '1GB'

> Batches: 5  Memory Usage: 2400305kB  Disk Usage: 126560kB

You might want to keep going higher with hash_mem_multiplier until you see no
"Disk Usage" there. As mentioned, v11 didn't spill to disk and just used all
the memory it pleased. That was a bit dangerous as it could result in OOM, so
it was fixed.

David
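To see the behavior David describes outside the warehouse, here is a small
self-contained experiment with synthetic data (not the thread's table; the
commented expectations are typical rather than guaranteed, since the planner
may choose a sorted aggregate instead):

CREATE TEMP TABLE spill_demo AS
SELECT (i % 200000) AS grp, md5(i::text) AS val
FROM generate_series(1, 1000000) AS i;
ANALYZE spill_demo;

SET work_mem = '4MB';
SET hash_mem_multiplier = 1.0;
EXPLAIN (ANALYZE)
SELECT grp, max(val) FROM spill_demo GROUP BY grp;
-- If a HashAggregate is chosen, expect "Batches: N ... Disk Usage: ...kB".

SET hash_mem_multiplier = 16.0;   -- same work_mem, much larger hash budget
EXPLAIN (ANALYZE)
SELECT grp, max(val) FROM spill_demo GROUP BY grp;
-- The spill should shrink or disappear ("Batches: 1", no Disk Usage line).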