RE: Query became very slow after 9.6 -> 10 upgrade

2017-11-22 Thread Alex Ignatov
Hello!

What about :

select name,setting from pg_settings where name like '%_cost';

 

--

Alex Ignatov 
Postgres Professional: http://www.postgrespro.com 
The Russian Postgres Company

 

 

From: Dmitry Shalashov [mailto:[email protected]] 
Sent: Wednesday, November 22, 2017 5:14 PM
To: [email protected]
Subject: Query became very slow after 9.6 -> 10 upgrade

 

Hi!

 

I've seen a few emails like this on the mailing list and for some reason thought 
it probably wouldn't happen to us, but here I am, lol.

 

It's "nestloop hits again" situation.

 

I'll try to provide the plan from 9.6 later, but right now I only have the plan 
from 10.1.

 

Query: https://pastebin.com/9b953tT7

It used to run in under 3 seconds (our default timeout), and now it runs for 
12 minutes.

 

\d adroom: https://pastebin.com/vBrPGtxT (3800 rows)

\d adroom_stat: https://pastebin.com/CkBArCC9 (47 million rows, 1.5 million of 
which satisfy the condition on the day column)

\d domains: https://pastebin.com/65hk7YCm (73000 rows)

 

All three tables are analyzed.

 

EXPLAIN ANALYZE: https://pastebin.com/PenHEgf0

EXPLAIN ANALYZE with nestloop off: https://pastebin.com/zX35CPCV (0.8s)

 

Regarding server parameters: it's a mighty beast with 2x E5-2630 v3, 192 GB of 
RAM and two very fast NVMe server-class SSDs in RAID 1.

 

What can I do about it?

 

 

Also maybe this will be useful:

 

1st query, runs in under 1 ms:

select title, id, groups->0->>'provider' provider, domain_ids from adroom where 
groups->0->>'provider' ~ '^target_mail_ru' and not is_paused and 
current_timestamp between start_ts and stop_ts

 

2nd query, which builds on the 1st one, runs in under 3 ms:

select distinct unnest(domain_ids) FROM (select title, id, 
groups->0->>'provider' provider, domain_ids from adroom where 
groups->0->>'provider' ~ '^target_mail_ru' and not is_paused and 
current_timestamp between start_ts and stop_ts) t1

 

3rd query, which returns 1.5 million rows, runs in about 0.6 s:

SELECT adroom_id, domain_id, shows, clicks FROM adroom_stat WHERE day between 
date_trunc('day', current_timestamp - interval '1 week') and date_trunc('day', 
current_timestamp)

 

BUT if I add one more condition to the 3rd query, which is basically the 2nd 
query, it again runs for the same 12 minutes:

SELECT adroom_id, domain_id, shows, clicks FROM adroom_stat WHERE day between 
date_trunc('day', current_timestamp - interval '1 week') and date_trunc('day', 
current_timestamp) AND domain_id IN (select distinct unnest(domain_ids) FROM 
(select title, id, groups->0->>'provider' provider, domain_ids from adroom 
where groups->0->>'provider' ~ '^target_mail_ru' and not is_paused and 
current_timestamp between start_ts and stop_ts) t1)

 

Plan of last query:

 Nested Loop  (cost=88.63..25617.31 rows=491 width=16) (actual time=3.512..733248.271 rows=1442797 loops=1)
   ->  HashAggregate  (cost=88.06..88.07 rows=1 width=4) (actual time=3.380..13.561 rows=3043 loops=1)
         Group Key: (unnest(adroom.domain_ids))
         ->  HashAggregate  (cost=88.03..88.04 rows=1 width=4) (actual time=2.199..2.607 rows=3043 loops=1)
               Group Key: unnest(adroom.domain_ids)
               ->  ProjectSet  (cost=0.28..87.78 rows=100 width=4) (actual time=0.701..1.339 rows=3173 loops=1)
                     ->  Index Scan using adroom_active_idx on adroom  (cost=0.28..87.27 rows=1 width=167) (actual time=0.688..1.040 rows=4 loops=1)
                           Index Cond: ((CURRENT_TIMESTAMP >= start_ts) AND (CURRENT_TIMESTAMP <= stop_ts))
                           Filter: (((groups -> 0) ->> 'provider'::text) ~ '^target_mail_ru'::text)
                           Rows Removed by Filter: 41
   ->  Index Scan using adroom_stat_day_adroom_id_domain_id_url_id_is_wlabp_idx on adroom_stat  (cost=0.58..25524.33 rows=491 width=16) (actual time=104.847..240.846 rows=474 loops=3043)
         Index Cond: ((day >= date_trunc('day'::text, (CURRENT_TIMESTAMP - '7 days'::interval))) AND (day <= date_trunc('day'::text, CURRENT_TIMESTAMP)) AND (domain_id = (unnest(adroom.domain_ids
 Planning time: 1.580 ms
 Execution time: 71.740 ms
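
(For anyone reproducing the comparison: the "nestloop off" plan linked above can 
be obtained by disabling nested loops for the session only. A minimal sketch 
using the simplified query from this mail, assuming the pastebin plan was 
produced the same way for the original query:)

-- Disable nested loops for this session only, so the planner falls back to a
-- hash or merge join; RESET restores the default afterwards.
SET enable_nestloop = off;

EXPLAIN ANALYZE
SELECT adroom_id, domain_id, shows, clicks
FROM adroom_stat
WHERE day BETWEEN date_trunc('day', current_timestamp - interval '1 week')
              AND date_trunc('day', current_timestamp)
  AND domain_id IN (
        SELECT DISTINCT unnest(domain_ids)
        FROM (SELECT title, id, groups->0->>'provider' AS provider, domain_ids
              FROM adroom
              WHERE groups->0->>'provider' ~ '^target_mail_ru'
                AND NOT is_paused
                AND current_timestamp BETWEEN start_ts AND stop_ts) t1);

RESET enable_nestloop;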

 

Dmitry Shalashov, relap.io & surfingbird.ru



RE: Query became very slow after 9.6 -> 10 upgrade

2017-11-22 Thread Alex Ignatov
Here is my select right after initdb:

 

postgres=# select name,setting from pg_settings where name like '%_cost';

         name         | setting
----------------------+---------
 cpu_index_tuple_cost | 0.005
 cpu_operator_cost    | 0.0025
 cpu_tuple_cost       | 0.01
 parallel_setup_cost  | 1000
 parallel_tuple_cost  | 0.1
 random_page_cost     | 4
 seq_page_cost        | 1

 

 

Can you generate the plan with random_page_cost = 4?
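
(A session-level override is enough for this test, for example:)

SET random_page_cost = 4;   -- session-local, no change to postgresql.conf needed
-- ... re-run the EXPLAIN ANALYZE of the slow query here, in the same session ...
RESET random_page_cost;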

 

 

--

Alex Ignatov 
Postgres Professional: http://www.postgrespro.com 
The Russian Postgres Company

 

From: Dmitry Shalashov [mailto:[email protected]] 
Sent: Wednesday, November 22, 2017 5:29 PM
To: Alex Ignatov 
Cc: [email protected]
Subject: Re: Query became very slow after 9.6 -> 10 upgrade

 

Sure, here it goes:

 

         name         | setting
----------------------+---------
 cpu_index_tuple_cost | 0.005
 cpu_operator_cost    | 0.0025
 cpu_tuple_cost       | 0.01
 parallel_setup_cost  | 1000
 parallel_tuple_cost  | 0.1
 random_page_cost     | 1
 seq_page_cost        | 1




 

Dmitry Shalashov, relap.io & surfingbird.ru

 


RE: Bad estimates

2017-11-22 Thread Alex Ignatov
It doesn’t help in this case.

 

--

Alex Ignatov 
Postgres Professional:  <http://www.postgrespro.com> http://www.postgrespro.com 
The Russian Postgres Company

 

From: Don Seiler [mailto:[email protected]] 
Sent: Wednesday, November 22, 2017 5:49 PM
To: Artur Zając 
Cc: [email protected]
Subject: Re: Bad estimates

 

I'm assuming you never analyzed the table after creation & data load? What does 
this show you:

 

select * from pg_stat_all_tables where relname='xyz';
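
(If you prefer just the relevant columns, something like the following; 'xyz' is 
of course a stand-in for the real table name, and the ANALYZE is only needed if 
the timestamps come back NULL:)

-- When were statistics last collected, and what is the current live-row estimate?
SELECT relname, n_live_tup, last_analyze, last_autoanalyze
FROM pg_stat_all_tables
WHERE relname = 'xyz';

-- If both *_analyze timestamps are NULL, collect statistics manually:
ANALYZE xyz;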

 

Don.

 

-- 

Don Seiler
www.seiler.us



RE: pg_xlog unbounded growth

2018-02-07 Thread Alex Ignatov
Have you tried 
archive_command='/bin/true' 
as Andreas wrote?
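
(If not, one way to apply it without editing postgresql.conf by hand, assuming 
superuser access:)

ALTER SYSTEM SET archive_command = '/bin/true';
SELECT pg_reload_conf();   -- archive_command is reloadable, no restart needed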

-Original Message-
From: Stefan Petrea [mailto:[email protected]] 
Sent: Wednesday, January 24, 2018 2:48 PM
To: [email protected]
Subject: pg_xlog unbounded growth

Hello,

This email is structured in sections as follows:

1 - Estimating the size of pg_xlog depending on postgresql.conf parameters
2 - Cleaning up pg_xlog using a watchdog script
3 - Mailing list survey of related bugs
4 - Thoughts

We're using PostgreSQL 9.6.6 on Ubuntu 16.04.3 LTS.
During some database imports (using pg_restore), we're noticing fast and
unbounded growth of pg_xlog, up to the point where the partition that stores it
(280 GB in our case) fills up and PostgreSQL shuts down. The error seen in
the logs:

2018-01-17 01:46:23.035 CST [41671] LOG:  database system was shut down
at 2018-01-16 15:49:26 CST
2018-01-17 01:46:23.038 CST [41671] FATAL:  could not write to file
"pg_xlog/xlogtemp.41671": No space left on device
2018-01-17 01:46:23.039 CST [41662] LOG:  startup process (PID 41671)
exited with exit code 1
2018-01-17 01:46:23.039 CST [41662] LOG:  aborting startup due to
startup process failure
2018-01-17 01:46:23.078 CST [41662] LOG:  database system is shut down

The config settings I thought were relevant are these (I'm also attaching the
entire postgresql.conf in case there are others I missed):

wal_level=replica
archive_command='exit 0;'
min_wal_size=2GB
max_wal_size=500MB
checkpoint_completion_target = 0.7
wal_keep_segments = 8
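
(For completeness, the effective values can be confirmed at runtime with:)

SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('wal_level', 'archive_mode', 'archive_command',
               'min_wal_size', 'max_wal_size',
               'wal_keep_segments', 'checkpoint_completion_target');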

So currently the pg_xlog is growing a lot, and there doesn't seem to be any
way to stop it.

There are some formulas I came across that allow one to compute the maximum
number of WAL files allowed in pg_xlog as a function of the PostgreSQL config
parameters.

1.1) Method from 2012 found in [2]

The formula for the upper bound for WAL files in pg_xlog is 

(2 + checkpoint_completion_target) * checkpoint_segments + 1, which is
((2 + 0.7) * (2048/16 * 1/3)) + 1 ~ 116 WAL files

I used the 1/3 because of [6] the shift from checkpoint_segments to
max_wal_size in 9.5 , the relevant quote from the release notes being:

If you previously adjusted checkpoint_segments, the following formula
will give you an approximately equivalent setting:
max_wal_size = (3 * checkpoint_segments) * 16MB

Another way of computing it, also according to [2], is
2 * checkpoint_segments + wal_keep_segments + 1, which is (2048/16) + 8 + 1 =
137 WAL files.
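
(The same arithmetic as a quick query, keeping the substitutions above:)

SELECT ((2 + 0.7) * (2048/16.0 * 1/3)) + 1 AS upper_bound_v1,  -- ~ 116 WAL files
       (2048/16.0) + 8 + 1                 AS upper_bound_v2;  -- = 137 WAL files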

So far we have two answers; in practice, neither of them checks out, since
pg_xlog grows indefinitely.

1.2) Method from the PostgreSQL internals book 

The book [4] says the following:

it could temporarily become up to "3 * checkpoint_segments + 1"

OK, let's compute this too: 3 * (128/3) + 1 = 129 WAL files.

This doesn't check out either.

1.3) On the mailing list [3], I found formulas similar to the ones above.

1.4) The post at [5] says max_wal_size is a soft limit, and also sets
wal_keep_segments = 0 in order to keep as little WAL around as possible.
Would this work?

Does wal_keep_segments = 0 turn off WAL recycling? Frankly, I would rather
have WAL not be recycled/reused, but simply deleted, to keep pg_xlog below the
expected size.

Another question is, does wal_level = replica affect the size of pg_xlog in
any way?  We have an archive_command that just exits with exit code 0, so I
don't see any reason for the pg_xlog files to not be cleaned up.
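
(One thing worth double-checking is whether the archiver really reports those
no-op archivals as successful:)

-- archived_count should keep increasing and failed_count should stay flat if
-- the 'exit 0' archive_command really lets segments be marked as archived
SELECT archived_count, last_archived_wal, last_archived_time,
       failed_count, last_failed_wal, last_failed_time
FROM pg_stat_archiver;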

2) Cleaning up pg_xlog using a watchdog script

To get the import done I wrote a script inspired by a blog post that addresses
the pg_xlog out-of-disk-space problem [1]. It periodically reads the last
checkpoint's REDO WAL file, and deletes all WAL in pg_xlog before that one.
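
(The cut-off the script relies on can also be read from SQL; a minimal sketch,
assuming 9.6's pg_control_checkpoint():)

-- WAL segment containing the latest checkpoint's REDO record; files that sort
-- before this name are the ones the watchdog script deletes
SELECT redo_wal_file FROM pg_control_checkpoint();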

The intended usage is for this script to run alongside the imports in order
for pg_xlog to be cleaned up gradually and prevent the disk from filling up.

Unlike the blog post, and probably slightly wrong, I used lexicographic
ordering rather than ordering by date. But I guess it worked, because the
checks were frequent enough that no WAL ever got recycled. In retrospect I
should have used date ordering.

Does this script have the same effect as checkpoint_completion_target=0 ?

At the end of the day, this script seems to have allowed the import we
needed to get done, but I acknowledge it was a stop-gap measure and not a
long-term solution, hence me posting on the mailing list to find a better
solution.

3) Mailing list survey of related bugs

On the mailing lists, in the past, there have been bugs around pg_xlog
growing out of control:

BUG 7902 [7] - Discusses a situation where WAL is produced faster than
checkpoints can be completed (written to disk), and therefore the WAL files in
pg_xlog cannot be recycled/deleted. The status of this bug report is
unclear. I have a feeling it's still open. Is that the case?

BUG 14340 [9] - A user(Sonu Gu