Slow performance after restoring a dump

2018-03-19 Thread David Osborne
Hi, I was wondering if someone could help us work out why this query is so
slow.

We've just dumped a database (PostgreSQL 9.1) and restored it to a new
instance (AWS RDS 9.6) via pg_dump, restoring with psql.

We immediately see that the following fairly straightforward query is now
extremely slow, with a huge number of shared buffer hits.
On the new instance it takes 25 seconds. On the original system, 0.05
seconds.

The database schemas are the same (with the same columns indexed), so
I'm guessing it must be a configuration issue that makes the planner go down a
different route.
I have run an analyse on the whole database since restoring.

Also, if I disable index scans on the new instance, the query takes
0.047 seconds.

Can someone point us in the right direction on what's going on here?


*Query:*

explain (buffers,analyse)  select
trans_date
from stock_trans s
join account_trans a using(account_trans_id)
where product_id=100
and credit_stock_account_id=3
order by trans_date desc
limit 1;


*Bad Performance on New Instance:*

https://explain.depesz.com/s/0HXq



                                   QUERY PLAN
--------------------------------------------------------------------------------
 Limit  (cost=0.87..9086.72 rows=1 width=4) (actual time=25829.287..25829.287 rows=0 loops=1)
   Buffers: shared hit=43944066
   ->  Nested Loop  (cost=0.87..6196547.28 rows=682 width=4) (actual time=25829.286..25829.286 rows=0 loops=1)
         Buffers: shared hit=43944066
         ->  Index Scan Backward using account_trans_date_idx on account_trans a  (cost=0.43..392996.60 rows=11455133 width=8) (actual time=0.007..3401.027 rows=11455133 loops=1)
               Buffers: shared hit=251082
         ->  Index Scan using stock_trans_account_trans_idx on stock_trans s  (cost=0.43..0.50 rows=1 width=4) (actual time=0.001..0.001 rows=0 loops=11455133)
               Index Cond: (account_trans_id = a.account_trans_id)
               Filter: ((product_id = 2466420) AND (credit_stock_account_id = 3))
               Rows Removed by Filter: 1
               Buffers: shared hit=43692984
 Planning time: 0.271 ms
 Execution time: 25829.316 ms
(13 rows)


*Disabled indexscan:*

=> set enable_indexscan=off;

https://explain.depesz.com/s/zTVn

                                   QUERY PLAN
--------------------------------------------------------------------------------
 Limit  (cost=11049.80..11049.81 rows=1 width=4) (actual time=0.018..0.018 rows=0 loops=1)
   Buffers: shared hit=3
   ->  Sort  (cost=11049.80..11051.51 rows=682 width=4) (actual time=0.017..0.017 rows=0 loops=1)
         Sort Key: a.trans_date DESC
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=3
         ->  Nested Loop  (cost=35.99..11046.39 rows=682 width=4) (actual time=0.011..0.011 rows=0 loops=1)
               Buffers: shared hit=3
               ->  Bitmap Heap Scan on stock_trans s  (cost=31.59..5301.09 rows=682 width=4) (actual time=0.011..0.011 rows=0 loops=1)
                     Recheck Cond: (product_id = 2466420)
                     Filter: (credit_stock_account_id = 3)
                     Buffers: shared hit=3
                     ->  Bitmap Index Scan on stock_trans_product_idx  (cost=0.00..31.42 rows=1465 width=0) (actual time=0.009..0.009 rows=0 loops=1)
                           Index Cond: (product_id = 2466420)
                           Buffers: shared hit=3
               ->  Bitmap Heap Scan on account_trans a  (cost=4.40..8.41 rows=1 width=8) (never executed)
                     Recheck Cond: (account_trans_id = s.account_trans_id)
                     ->  Bitmap Index Scan on account_trans_pkey  (cost=0.00..4.40 rows=1 width=0) (never executed)
                           Index Cond: (account_trans_id = s.account_trans_id)
 Planning time: 0.272 ms
 Execution time: 0.047 ms
(21 rows)


*Explain from the same query on the original database:*

https://explain.depesz.com/s/WHKJ


                                   QUERY PLAN
--------------------------------------------------------------------------------
 Limit  (cost=9811.51..9811.52 rows=1 width=4) (actual time=0.020..0.020 rows=0 loops=1)
   Buffers: shared hit=3
   ->  Sort  (cost=9811.51..9813.23 rows=685 width=4) (actual time=0.019..0.019 rows=0 loops=1)
         Sort Key: a.trans_date
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=3
         ->  Nested Loop  (cost=0.00..9808.09 rows=685 width=4) (actual time=0.014..0.014 rows=0 loops=1)
               Buffers: shared hit=3
               ->  Index Scan using stock_trans_product_idx on stock_trans s  (cost=0.00..3300.01 rows=685 width=4) (actual time=0.014..0.014 rows=0 loops=1)

Re: Slow performance after restoring a dump

2018-03-19 Thread Tom Lane
David Osborne  writes:
> Hi, I was wondering if someone could help us work out why this query is so
> slow.
> We've just dumped a database (Postgresql 9.1) and restored it to a new
> instance (AWS RDS 9.6) (via pg_dump, restored to psql)

The first question people will ask is did you re-ANALYZE the new
database?  pg_dump doesn't take care of that for you, and auto-analyze
might not think it needs to process the smaller tables.
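
For example, from psql on the new instance (a plain ANALYZE with no argument covers the whole database; the per-table form is only useful for watching what gets sampled):

=> analyze;
=> analyze verbose stock_trans;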

regards, tom lane



Re: Slow performance after restoring a dump

2018-03-19 Thread David Osborne
Hi, yes I've run "analyse" against the newly restored database. Should that
be enough?

On 19 March 2018 at 15:35, Tom Lane  wrote:

> David Osborne  writes:
>
> The first question people will ask is did you re-ANALYZE the new
> database?  pg_dump doesn't take care of that for you, and auto-analyze
> might not think it needs to process the smaller tables.
>
>


Re: Slow performance after restoring a dump

2018-03-19 Thread Tom Lane
David Osborne  writes:
> Hi, yes I've run "analyse" against the newly restored database. Should that
> be enough?

My apologies, you did say that further down in the original message.
It looks like the core of the problem is the poor rowcount estimation
here:

 ->  Bitmap Index Scan on stock_trans_product_idx  (cost=0.00..31.42 rows=1465 width=0) (actual time=0.009..0.009 rows=0 loops=1)
       Index Cond: (product_id = 2466420)
       Buffers: shared hit=3

You might be able to improve that by raising the statistics target
for stock_trans.product_id.  I'm not sure why you weren't getting
bitten by the same issue in 9.1; but the cost estimates aren't
that far apart for the two plans, so maybe you were just lucky ...
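
For example (1000 is only an illustrative target, and the new setting only takes effect at the next ANALYZE of the table):

=> alter table stock_trans alter column product_id set statistics 1000;
=> analyze stock_trans;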

regards, tom lane



Re: Slow performance after restoring a dump

2018-03-19 Thread David Osborne
That did the trick... thanks!
Yes, perhaps a minor planner difference just tipped us over the edge
previously.

=> alter table stock_trans alter column product_id set STATISTICS 1000;
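
For reference, the new per-column target and the refreshed statistics can be checked with something along these lines:

=> select attstattarget from pg_attribute where attrelid = 'stock_trans'::regclass and attname = 'product_id';
=> select n_distinct, null_frac from pg_stats where tablename = 'stock_trans' and attname = 'product_id';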


                                   QUERY PLAN
--------------------------------------------------------------------------------
 Limit  (cost=3461.10..3461.10 rows=1 width=4) (actual time=0.014..0.014 rows=0 loops=1)
   Buffers: shared hit=3
   ->  Sort  (cost=3461.10..3461.75 rows=260 width=4) (actual time=0.013..0.013 rows=0 loops=1)
         Sort Key: a.trans_date DESC
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=3
         ->  Nested Loop  (cost=0.87..3459.80 rows=260 width=4) (actual time=0.008..0.008 rows=0 loops=1)
               Buffers: shared hit=3
               ->  Index Scan using stock_trans_product_idx on stock_trans s  (cost=0.43..1263.55 rows=260 width=4) (actual time=0.007..0.007 rows=0 loops=1)
                     Index Cond: (product_id = 2466420)
                     Filter: (credit_stock_account_id = 3)
                     Buffers: shared hit=3
               ->  Index Scan using account_trans_pkey on account_trans a  (cost=0.43..8.44 rows=1 width=8) (never executed)
                     Index Cond: (account_trans_id = s.account_trans_id)
 Planning time: 0.255 ms
 Execution time: 0.039 ms
(16 rows)




On 19 March 2018 at 16:22, Tom Lane  wrote:

> David Osborne  writes:
> > Hi, yes I've run "analyse" against the newly restored database. Should that
> > be enough?
>
> My apologies, you did say that further down in the original message.
> It looks like the core of the problem is the poor rowcount estimation
> here:
>
>  ->  Bitmap Index Scan on stock_trans_product_idx  (cost=0.00..31.42 rows=1465 width=0) (actual time=0.009..0.009 rows=0 loops=1)
>        Index Cond: (product_id = 2466420)
>        Buffers: shared hit=3
>
> You might be able to improve that by raising the statistics target
> for stock_trans.product_id.  I'm not sure why you weren't getting
> bitten by the same issue in 9.1; but the cost estimates aren't
> that far apart for the two plans, so maybe you were just lucky ...
>
> regards, tom lane
>



-- 
David Osborne
Qcode Software Limited
http://www.qcode.co.uk
T: +44 (0)1463 896484


RE: PG 9.6 Slow inserts with long-lasting LWLocks

2018-03-19 Thread Pavel Suderevsky
Michael, thanks for your answer.
It doesn't look like my case, because the issue is also reproducible on a table with
a single writer backend (100% of the inserts come from one backend). Also, as mentioned, the whole system gets hung.

Regards,
Pavel Suderevsky

From: MichaelDBA
Sent: Friday, March 16, 2018 16:42
To: Pavel Suderevsky
Cc: [email protected]
Subject: Re: PG 9.6 Slow inserts with long-lasting LWLocks

Sporadic insert slowness could be due to lock delays (locktype=extend) caused by
many concurrent connections trying to insert into the same table at the same
time. Each insert request may result in an extend lock (an 8k extension of the
relation), which blocks other writers. Normally these extend locks happen so
fast that you hardly ever see them in the pg_locks table, except when many
concurrent connections are trying to insert into the same table.
The following query will show whether this is the case if you execute it while the
problem is occurring.
select * from pg_locks where granted = false and locktype = 'extend';
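
If it turns up rows, a variant along these lines (joining pg_stat_activity; purely illustrative) also shows who is waiting and on which relation:

select l.pid, l.relation::regclass as relation, a.wait_event_type, a.wait_event, a.query
from pg_locks l
join pg_stat_activity a using (pid)
where not l.granted and l.locktype = 'extend';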

I don't know if this is your particular problem, but perhaps it is.

Regards,
Michael Vitale

Pavel Suderevsky
Thursday, March 15, 2018 6:29 AM
Hi, 
 
Well, unfortunately I still need community help.
 
-- Environment
OS: CentOS Linux release 7.2.1511
Kernel:  3.10.0-327.36.3.el7.x86_64
PostgreSQL: 9.6.3
-- Hardware
Server: Dell PowerEdge R430
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Raid controller: PERC H730 Mini (1GB cache)
Disks: 8 x 10K RPM SAS 12Gb/s 2.5" (ST1200MM0088) in RAID 6
RAM: 192GB (M393A2G40DB0-CPB x 16)
For more detailed hardware info please see attached configuration.txt
-- postgresql.conf
max_connections = 2048
shared_buffers = 48GB
temp_buffers = 128MB
work_mem = 256MB
maintenance_work_mem = 512MB
dynamic_shared_memory_type = posix
wal_level = hot_standby
min_wal_size = 4GB
max_wal_size = 32GB
huge_pages = on
+
numactl interleave=all
-- sysctl.conf 
kernel.shmmax=64424509440
kernel.shmall=4294967296
kernel.sem = 1024 32767 128 16384
fs.aio-max-nr=3145728
fs.file-max = 6815744
net.core.rmem_default=262144
net.core.rmem_max=4194304
net.core.wmem_default=262144
net.core.wmem_max=1048586
vm.nr_hugepages=33000
vm.dirty_background_bytes=67108864
vm.dirty_bytes=536870912
vm.min_free_kbytes=1048576
zone_reclaim_mode=0
 
Again: the problem is occasional long inserts that can happen 1-5 times per day
on an OLTP system.
No autovacuum was running during the long inserts. The WAL rate is 1-2Gb per hour, and no
correlation with the issue has been spotted there.
The wait event "buffer_mapping" shows up for the affected transactions, but not every
time (or maybe it is just not caught every time).
I have two suspects for this behaviour: the I/O system and high concurrency.
There is a problem with one application that frequently recreates up to 90
sessions, but investigation shows no direct correlation between such sessions and the
long transactions; at least it is not the root cause of the issue (of course that app
behaviour will be fixed).
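
For what it's worth, a query along these lines against pg_stat_activity (9.6+) can be used to catch those wait events while an insert is stuck (the column list is just an example):

select pid, wait_event_type, wait_event, state, now() - xact_start as xact_age, left(query, 60) as query
from pg_stat_activity
where wait_event is not null;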
 
The investigation, and tracing with strace in particular, showed that:
1. The only long event straced from the postgres backends was <... semop resumed>.
2. The whole host seems to get hung during such events.
 
Example:
A Java application located on a separate host reports several long transactions:
123336.943 - [1239588mks]: event.insert-table
123336.943 - [1240827mks]: event.insert-table
123337.019 - [1292534mks]: event.insert-table
143353.542 - [5467657mks]: event.insert-table
143353.543 - [5468884mks]: event.insert-table
152338.763 - [1264588mks]: event.insert-table
152338.765 - [2054887mks]: event.insert-table
 
Strace output for the event at 14:33, for a particular known pid:
119971 14:33:48.075375 epoll_wait(3,
119971 14:33:48.075696 <... epoll_wait resumed> {{EPOLLIN, {u32=27532016, u64=27532016}}}, 1, -1) = 1 <0.000313>
119971 14:33:48.075792 recvfrom(9,
119971 14:33:48.075866 <... recvfrom resumed> "B\0\0\3\27\0S_21\0\0*\0\1\0\1\0\1\0\0\0\0\0\1\0\1\0\0\0\0\0"..., 8192, 0, NULL, NULL) = 807 <0.66>
119971 14:33:48.076243 semop(26706044, {{8, -1, 0}}, 1
120019 14:33:48.119971 recvfrom(9,
119971 14:33:53.491029 <... semop resumed> ) = 0 <5.414772>
119971 14:33:53.500356 lseek(18, 0, SEEK_END
119971 14:33:53.500436 <... lseek resumed> ) = 107790336 <0.72>
119971 14:33:53.500514 lseek(20, 0, SEEK_END
 
Checking strace for long semop calls over the whole day:
root@host [20180314 17:47:36]:/home/user$ egrep " <[1-9]." /tmp/strace | grep semop
119991 12:33:36 <... semop resumed> )   = 0 <1.419394>
119942 12:33:36 <... semop resumed> )   = 0 <1.422554>
119930 12:33:36 <... semop resumed> )   = 0 <1.414916>
119988 12:33:36 <... semop resumed> )   = 0 <1.213309>
119966 12:33:36 <... semop resumed> )   = 0 <1.237492>
119958 14:33:53.489398 <... semop resumed> ) = 0 <5.455830>
120019 14:33:53.490613 <... semop resumed> ) = 0 <5.284505>
119997 14:33:53.490638 <... semop resumed> ) = 0 <5.111661>
12 14:33:53.490649 <... semop resumed> ) = 0 <3.521992>
119991