RE: Query became very slow after 9.6 -> 10 upgrade
Hello!

What about:

    select name, setting from pg_settings where name like '%_cost';

--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

From: Dmitry Shalashov [mailto:[email protected]]
Sent: Wednesday, November 22, 2017 5:14 PM
To: [email protected]
Subject: Query became very slow after 9.6 -> 10 upgrade

Hi!

I've seen a few letters like this on the mailing list and for some reason thought it probably wouldn't happen to us, but here I am. It's a "nestloop hits again" situation. I'll try to provide the plan from 9.6 later, but right now I have only the plan from 10.1.

Query: https://pastebin.com/9b953tT7

It was running in under 3 seconds (that's our default timeout) and now it runs for 12 minutes.

\d adroom: https://pastebin.com/vBrPGtxT (3800 rows)
\d adroom_stat: https://pastebin.com/CkBArCC9 (47 mln rows, 1.5 mln satisfy the condition on the day column)
\d domains: https://pastebin.com/65hk7YCm (73000 rows)

All three tables are analyzed.

EXPLAIN ANALYZE: https://pastebin.com/PenHEgf0
EXPLAIN ANALYZE with nestloop off: https://pastebin.com/zX35CPCV (0.8s)

Regarding server parameters - it's a mighty beast with 2x E5-2630 v3, 192 GB of RAM and two very, very fast NVMe server-class SSDs in RAID1.

What can I do with it?
Also, maybe this will be useful.

1st query, runs in under 1 ms:

    select title, id, groups->0->>'provider' provider, domain_ids
    from adroom
    where groups->0->>'provider' ~ '^target_mail_ru'
      and not is_paused
      and current_timestamp between start_ts and stop_ts

2nd query, which uses the 1st one, runs in under 3 ms:

    select distinct unnest(domain_ids)
    FROM (select title, id, groups->0->>'provider' provider, domain_ids
          from adroom
          where groups->0->>'provider' ~ '^target_mail_ru'
            and not is_paused
            and current_timestamp between start_ts and stop_ts) t1

3rd query, which returns 1.5 mln rows, runs in about 0.6 s:

    SELECT adroom_id, domain_id, shows, clicks
    FROM adroom_stat
    WHERE day between date_trunc('day', current_timestamp - interval '1 week')
                  and date_trunc('day', current_timestamp)

BUT if I add one additional condition to the 3rd query, which is basically the 2nd query, it runs for the same 12 minutes:

    SELECT adroom_id, domain_id, shows, clicks
    FROM adroom_stat
    WHERE day between date_trunc('day', current_timestamp - interval '1 week')
                  and date_trunc('day', current_timestamp)
      AND domain_id IN (select distinct unnest(domain_ids)
                        FROM (select title, id, groups->0->>'provider' provider, domain_ids
                              from adroom
                              where groups->0->>'provider' ~ '^target_mail_ru'
                                and not is_paused
                                and current_timestamp between start_ts and stop_ts) t1)

Plan of the last query:

    Nested Loop  (cost=88.63..25617.31 rows=491 width=16) (actual time=3.512..733248.271 rows=1442797 loops=1)
      ->  HashAggregate  (cost=88.06..88.07 rows=1 width=4) (actual time=3.380..13.561 rows=3043 loops=1)
            Group Key: (unnest(adroom.domain_ids))
            ->  HashAggregate  (cost=88.03..88.04 rows=1 width=4) (actual time=2.199..2.607 rows=3043 loops=1)
                  Group Key: unnest(adroom.domain_ids)
                  ->  ProjectSet  (cost=0.28..87.78 rows=100 width=4) (actual time=0.701..1.339 rows=3173 loops=1)
                        ->  Index Scan using adroom_active_idx on adroom  (cost=0.28..87.27 rows=1 width=167) (actual time=0.688..1.040 rows=4 loops=1)
                              Index Cond: ((CURRENT_TIMESTAMP >= start_ts) AND (CURRENT_TIMESTAMP <= stop_ts))
                              Filter: (((groups -> 0) ->> 'provider'::text) ~ '^target_mail_ru'::text)
                              Rows Removed by Filter: 41
      ->  Index Scan using adroom_stat_day_adroom_id_domain_id_url_id_is_wlabp_idx on adroom_stat  (cost=0.58..25524.33 rows=491 width=16) (actual time=104.847..240.846 rows=474 loops=3043)
            Index Cond: ((day >= date_trunc('day'::text, (CURRENT_TIMESTAMP - '7 days'::interval))) AND (day <= date_trunc('day'::text, CURRENT_TIMESTAMP)) AND (domain_id = (unnest(adroom.domain_ids))))
    Planning time: 1.580 ms
    Execution time: 71.740 ms

Dmitry Shalashov,
relap.io & surfingbird.ru
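[Editor's note] The plan above shows the planner estimating rows=1 for the unnested domain_ids where 3043 actually come out, which forces 3043 iterations of the inner index scan. Two session-level workarounds, as hedged sketches only: the first is the nestloop-off run the thread itself already timed at 0.8s; the second (materializing the ids into an analyzed temp table so the planner gets a real row count instead of the set-returning-function default estimate) is an assumption, not something the thread tested. Table and column names are taken from the queries above.

```sql
-- Workaround A: the thread's own nestloop-off experiment, scoped to one transaction.
BEGIN;
SET LOCAL enable_nestloop = off;
-- ... run the slow query here ...
COMMIT;

-- Workaround B (assumption): give the planner real statistics for the id set.
CREATE TEMP TABLE active_domain_ids AS
  SELECT DISTINCT unnest(domain_ids) AS domain_id
  FROM adroom
  WHERE groups->0->>'provider' ~ '^target_mail_ru'
    AND NOT is_paused
    AND current_timestamp BETWEEN start_ts AND stop_ts;
ANALYZE active_domain_ids;

SELECT s.adroom_id, s.domain_id, s.shows, s.clicks
FROM adroom_stat s
JOIN active_domain_ids d USING (domain_id)
WHERE s.day BETWEEN date_trunc('day', current_timestamp - interval '1 week')
                AND date_trunc('day', current_timestamp);
```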
RE: Query became very slow after 9.6 -> 10 upgrade
Here is my select right after initdb:

    postgres=# select name,setting from pg_settings where name like '%_cost';
             name         | setting
    ----------------------+---------
     cpu_index_tuple_cost | 0.005
     cpu_operator_cost    | 0.0025
     cpu_tuple_cost       | 0.01
     parallel_setup_cost  | 1000
     parallel_tuple_cost  | 0.1
     random_page_cost     | 4
     seq_page_cost        | 1

Can you generate a plan with random_page_cost = 4?

--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

From: Dmitry Shalashov [mailto:[email protected]]
Sent: Wednesday, November 22, 2017 5:29 PM
To: Alex Ignatov
Cc: [email protected]
Subject: Re: Query became very slow after 9.6 -> 10 upgrade

Sure, here it goes:

             name         | setting
    ----------------------+---------
     cpu_index_tuple_cost | 0.005
     cpu_operator_cost    | 0.0025
     cpu_tuple_cost       | 0.01
     parallel_setup_cost  | 1000
     parallel_tuple_cost  | 0.1
     random_page_cost     | 1
     seq_page_cost        | 1

Dmitry Shalashov,
relap.io & surfingbird.ru
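[Editor's note] Alex's request can be tried without touching postgresql.conf, since random_page_cost is settable per session. A minimal sketch; the `SELECT 1` is a placeholder for the slow query from this thread:

```sql
-- Session-local experiment; nothing here persists past the session.
SET random_page_cost = 4;      -- back to the default the thread asks about
EXPLAIN (ANALYZE, BUFFERS)
SELECT 1;                      -- placeholder: substitute the slow query here
RESET random_page_cost;
```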
RE: Bad estimates
It doesn't help in this case.

--
Alex Ignatov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

From: Don Seiler [mailto:[email protected]]
Sent: Wednesday, November 22, 2017 5:49 PM
To: Artur Zając
Cc: [email protected]
Subject: Re: Bad estimates

I'm assuming you never analyzed the table after creation & data load? What does this show you:

    select * from pg_stat_all_tables where relname='xyz';

Don.

--
Don Seiler
www.seiler.us
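[Editor's note] A narrower version of Don's check, plus the obvious follow-up; 'xyz' is the placeholder table name from his message. The column choice is an assumption about what matters here, not part of the thread:

```sql
-- Was the table ever (auto)analyzed, and how many live rows does the
-- statistics collector see?
SELECT relname, last_analyze, last_autoanalyze, n_live_tup
FROM pg_stat_all_tables
WHERE relname = 'xyz';

-- If both timestamps are NULL, collect statistics by hand:
ANALYZE xyz;
```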
RE: pg_xlog unbounded growth
Have you tried archive_command='/bin/true' as Andreas wrote?

-----Original Message-----
From: Stefan Petrea [mailto:[email protected]]
Sent: Wednesday, January 24, 2018 2:48 PM
To: [email protected]
Subject: pg_xlog unbounded growth

Hello,

This email is structured in sections as follows:

1 - Estimating the size of pg_xlog depending on postgresql.conf parameters
2 - Cleaning up pg_xlog using a watchdog script
3 - Mailing list survey of related bugs
4 - Thoughts

We're using PostgreSQL 9.6.6 on Ubuntu 16.04.3 LTS. During some database imports (using pg_restore), we're noticing fast and unbounded growth of pg_xlog, up to the point where the partition that stores it (280G in size for us) fills up and PostgreSQL shuts down. The error seen in the logs:

    2018-01-17 01:46:23.035 CST [41671] LOG:  database system was shut down at 2018-01-16 15:49:26 CST
    2018-01-17 01:46:23.038 CST [41671] FATAL:  could not write to file "pg_xlog/xlogtemp.41671": No space left on device
    2018-01-17 01:46:23.039 CST [41662] LOG:  startup process (PID 41671) exited with exit code 1
    2018-01-17 01:46:23.039 CST [41662] LOG:  aborting startup due to startup process failure
    2018-01-17 01:46:23.078 CST [41662] LOG:  database system is shut down

The config settings I thought were relevant are these (but I'm also attaching the entire postgresql.conf in case there are others that I missed):

    wal_level = replica
    archive_command = 'exit 0;'
    min_wal_size = 2GB
    max_wal_size = 500MB
    checkpoint_completion_target = 0.7
    wal_keep_segments = 8

So currently pg_xlog is growing a lot, and there doesn't seem to be any way to stop it.

There are some formulas I came across that allow one to compute the maximum number of WAL files allowed in pg_xlog as a function of the PostgreSQL config parameters.
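[Editor's note] The suggestion in the reply above, written out as a postgresql.conf fragment. Assumption: WAL archiving genuinely isn't needed during the bulk import; '/bin/true' reports success without copying anything, so completed segments become eligible for recycling or removal:

```
# postgresql.conf -- temporary, for the duration of the bulk import only
archive_command = '/bin/true'   # claim every segment was archived successfully
# apply without a restart:  SELECT pg_reload_conf();
```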
1.1) Method from 2012, found in [2]

The formula for the upper bound on WAL files in pg_xlog is

    (2 + checkpoint_completion_target) * checkpoint_segments + 1

which is ((2 + 0.7) * (2048/16 * 1/3)) + 1 ~ 116 WAL files.

I used the 1/3 because of [6], the shift from checkpoint_segments to max_wal_size in 9.5; the relevant quote from the release notes being:

    If you previously adjusted checkpoint_segments, the following formula
    will give you an approximately equivalent setting:
    max_wal_size = (3 * checkpoint_segments) * 16MB

Another way of computing it, also according to [2], is the following:

    2 * checkpoint_segments + wal_keep_segments + 1

which is (2048/16) + 8 + 1 = 137 WAL files.

So far we have two answers; in practice neither of them checks out, since pg_xlog grows indefinitely.

1.2) Method from the PostgreSQL internals book

The book [4] says it could temporarily become up to "3 * checkpoint_segments + 1". Let's compute this too: it's 3 * (128/3) + 1 = 129 WAL files. This doesn't check out either.

1.3) On the mailing list [3], I found similar formulas that were seen previously.

1.4) The post at [5] says max_wal_size is a soft limit, and also sets wal_keep_segments = 0 in order to keep as little WAL around as possible. Would this work? Does wal_keep_segments = 0 turn off WAL recycling? Frankly, I would rather have WAL not be recycled/reused at all, and just deleted, to keep pg_xlog below the expected size.

Another question is, does wal_level = replica affect the size of pg_xlog in any way? We have an archive_command that just exits with exit code 0, so I don't see any reason for the pg_xlog files not to be cleaned up.

2) Cleaning up pg_xlog using a watchdog script

To get the import done I wrote a script, inspired by a blog post that addresses the pg_xlog out-of-disk-space problem [1]. It periodically reads the last checkpoint's REDO WAL file, and deletes all WAL in pg_xlog before that one.
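[Editor's note] The "last checkpoint's REDO WAL file" the script keys on can also be read from SQL on 9.6, which may be safer than parsing pg_controldata output. Segments lexicographically before this file on the current timeline are no longer needed for crash recovery, though deleting them by hand remains as risky as the author acknowledges:

```sql
-- PostgreSQL 9.6+: which WAL segment would crash recovery start from?
SELECT redo_wal_file, redo_lsn, checkpoint_time
FROM pg_control_checkpoint();
```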
The intended usage is for this script to run alongside the imports, so that pg_xlog is cleaned up gradually and the disk is prevented from filling up. Unlike the blog post (and probably slightly wrong), I used lexicographic ordering and not ordering by date. I guess it worked because the checks were frequent enough that no WAL ever got recycled; in retrospect I should have used the date ordering.

Does this script have the same effect as checkpoint_completion_target = 0?

At the end of the day, this script seems to have allowed the import we needed to get done, but I acknowledge it was a stop-gap measure and not a long-term solution, hence me posting on the mailing list to find a better one.

3) Mailing list survey of related bugs

On the mailing lists there have been past bug reports about pg_xlog growing out of control:

BUG 7902 [7] - Discusses a situation where WAL is produced faster than checkpoints can be completed (written to disk), and therefore the WAL files in pg_xlog cannot be recycled/deleted. The status of this bug report is unclear; I have a feeling it's still open. Is that the case?

BUG 14340 [9] - A user (Sonu Gu
