Slow performance after restoring a dump
Hi, I was wondering if someone could help us work out why this query is so slow.

We've just dumped a database (PostgreSQL 9.1) and restored it to a new instance (AWS RDS 9.6) via pg_dump, restoring with psql. We immediately see that the following fairly straightforward query is now extremely slow, with a huge number of shared buffers hit. On the new instance it takes 25 seconds; on the original system, 0.05 seconds.

The database schemas are both the same (with the same columns indexed), so I'm guessing it must be a configuration issue that makes the planner go down a different route. I have run an analyse on the whole database since restoring. Also, if I disable index scans on the new instance, the query takes 0.047 seconds.

Can someone point us in the right direction on what's going on here?

*Query:*

explain (buffers, analyse)
  select trans_date
  from stock_trans s
  join account_trans a using (account_trans_id)
  where product_id=100
    and credit_stock_account_id=3
  order by trans_date desc
  limit 1;

*Bad performance on the new instance:*

https://explain.depesz.com/s/0HXq

                                   QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=0.87..9086.72 rows=1 width=4) (actual time=25829.287..25829.287 rows=0 loops=1)
   Buffers: shared hit=43944066
   ->  Nested Loop  (cost=0.87..6196547.28 rows=682 width=4) (actual time=25829.286..25829.286 rows=0 loops=1)
         Buffers: shared hit=43944066
         ->  Index Scan Backward using account_trans_date_idx on account_trans a  (cost=0.43..392996.60 rows=11455133 width=8) (actual time=0.007..3401.027 rows=11455133 loops=1)
               Buffers: shared hit=251082
         ->  Index Scan using stock_trans_account_trans_idx on stock_trans s  (cost=0.43..0.50 rows=1 width=4) (actual time=0.001..0.001 rows=0 loops=11455133)
               Index Cond: (account_trans_id = a.account_trans_id)
               Filter: ((product_id = 2466420) AND (credit_stock_account_id = 3))
               Rows Removed by Filter: 1
               Buffers: shared hit=43692984
 Planning time: 0.271 ms
 Execution time: 25829.316 ms
(13 rows)

*Disabled indexscan:*

=> set enable_indexscan=off;

https://explain.depesz.com/s/zTVn

                                   QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=11049.80..11049.81 rows=1 width=4) (actual time=0.018..0.018 rows=0 loops=1)
   Buffers: shared hit=3
   ->  Sort  (cost=11049.80..11051.51 rows=682 width=4) (actual time=0.017..0.017 rows=0 loops=1)
         Sort Key: a.trans_date DESC
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=3
         ->  Nested Loop  (cost=35.99..11046.39 rows=682 width=4) (actual time=0.011..0.011 rows=0 loops=1)
               Buffers: shared hit=3
               ->  Bitmap Heap Scan on stock_trans s  (cost=31.59..5301.09 rows=682 width=4) (actual time=0.011..0.011 rows=0 loops=1)
                     Recheck Cond: (product_id = 2466420)
                     Filter: (credit_stock_account_id = 3)
                     Buffers: shared hit=3
                     ->  Bitmap Index Scan on stock_trans_product_idx  (cost=0.00..31.42 rows=1465 width=0) (actual time=0.009..0.009 rows=0 loops=1)
                           Index Cond: (product_id = 2466420)
                           Buffers: shared hit=3
               ->  Bitmap Heap Scan on account_trans a  (cost=4.40..8.41 rows=1 width=8) (never executed)
                     Recheck Cond: (account_trans_id = s.account_trans_id)
                     ->  Bitmap Index Scan on account_trans_pkey  (cost=0.00..4.40 rows=1 width=0) (never executed)
                           Index Cond: (account_trans_id = s.account_trans_id)
 Planning time: 0.272 ms
 Execution time: 0.047 ms
(21 rows)

*Explain from the same query on the original database:*

https://explain.depesz.com/s/WHKJ

                                   QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=9811.51..9811.52 rows=1 width=4) (actual time=0.020..0.020 rows=0 loops=1)
   Buffers: shared hit=3
   ->  Sort  (cost=9811.51..9813.23 rows=685 width=4) (actual time=0.019..0.019 rows=0 loops=1)
         Sort Key: a.trans_date
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=3
         ->  Nested Loop  (cost=0.00..9808.09 rows=685 width=4) (actual time=0.014..0.014 rows=0 loops=1)
               Buffers: shared hit=3
               ->  Index Scan using stock_trans_product_idx on stock_trans s  (cost=0.00..3300.01 rows=685 width=4) (actual time=0.014..0.014 rows=0 loops=1)
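A side note on reproducing the comparison above: rather than flipping enable_indexscan for the whole session, the override can be scoped to a single statement. This is only a sketch, reusing the query from the post above:

begin;
set local enable_indexscan = off;
explain (analyse, buffers)
  select trans_date
  from stock_trans s
  join account_trans a using (account_trans_id)
  where product_id=100
    and credit_stock_account_id=3
  order by trans_date desc
  limit 1;
rollback;

SET LOCAL reverts at the end of the transaction, so the planner override cannot leak into other queries on the same connection.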
Re: Slow performance after restoring a dump
David Osborne writes:
> Hi, I was wondering if someone could help us work out why this query is so
> slow.
> We've just dumped a database (Postgresql 9.1) and restored it to a new
> instance (AWS RDS 9.6) (via pg_dump, restored to psql)

The first question people will ask is did you re-ANALYZE the new
database?  pg_dump doesn't take care of that for you, and auto-analyze
might not think it needs to process the smaller tables.

			regards, tom lane
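As a quick sanity check after a restore, something along these lines (a sketch; the table names are taken from the thread above) shows whether the planner has any fresh statistics to work with:

analyse;

select relname, last_analyze, last_autoanalyze, n_live_tup
from pg_stat_user_tables
where relname in ('stock_trans', 'account_trans');

If last_analyze and last_autoanalyze are both NULL for a table, it has not been analysed since the restore and the planner is working from default assumptions.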
Re: Slow performance after restoring a dump
Hi, yes I've run "analyse" against the newly restored database. Should that
be enough?

On 19 March 2018 at 15:35, Tom Lane wrote:
> David Osborne writes:
>
> The first question people will ask is did you re-ANALYZE the new
> database?  pg_dump doesn't take care of that for you, and auto-analyze
> might not think it needs to process the smaller tables.
Re: Slow performance after restoring a dump
David Osborne writes:
> Hi, yes I've run "analyse" against the newly restored database. Should that
> be enough?

My apologies, you did say that further down in the original message.

It looks like the core of the problem is the poor rowcount estimation here:

   ->  Bitmap Index Scan on stock_trans_product_idx  (cost=0.00..31.42 rows=1465 width=0) (actual time=0.009..0.009 rows=0 loops=1)
         Index Cond: (product_id = 2466420)
         Buffers: shared hit=3

You might be able to improve that by raising the statistics target
for stock_trans.product_id.  I'm not sure why you weren't getting
bitten by the same issue in 9.1; but the cost estimates aren't
that far apart for the two plans, so maybe you were just lucky ...

			regards, tom lane
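One way to see what the planner currently knows about the column (a sketch, using the table and column names from the plans above):

select attname, null_frac, n_distinct,
       array_length(most_common_vals::text::text[], 1) as n_mcv
from pg_stats
where tablename = 'stock_trans' and attname = 'product_id';

If n_distinct is large compared with the number of entries the default statistics target of 100 can track, values that are rare (or absent) in the sample tend to be over-estimated, which is roughly what pushes the planner towards the backward scan on trans_date here.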
Re: Slow performance after restoring a dump
That did the trick... thanks!
Yes, perhaps a minor planner difference just tipped us over the edge previously.

=> alter table stock_trans alter column product_id set STATISTICS 1000;

                                   QUERY PLAN
------------------------------------------------------------------------------
 Limit  (cost=3461.10..3461.10 rows=1 width=4) (actual time=0.014..0.014 rows=0 loops=1)
   Buffers: shared hit=3
   ->  Sort  (cost=3461.10..3461.75 rows=260 width=4) (actual time=0.013..0.013 rows=0 loops=1)
         Sort Key: a.trans_date DESC
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=3
         ->  Nested Loop  (cost=0.87..3459.80 rows=260 width=4) (actual time=0.008..0.008 rows=0 loops=1)
               Buffers: shared hit=3
               ->  Index Scan using stock_trans_product_idx on stock_trans s  (cost=0.43..1263.55 rows=260 width=4) (actual time=0.007..0.007 rows=0 loops=1)
                     Index Cond: (product_id = 2466420)
                     Filter: (credit_stock_account_id = 3)
                     Buffers: shared hit=3
               ->  Index Scan using account_trans_pkey on account_trans a  (cost=0.43..8.44 rows=1 width=8) (never executed)
                     Index Cond: (account_trans_id = s.account_trans_id)
 Planning time: 0.255 ms
 Execution time: 0.039 ms
(16 rows)

On 19 March 2018 at 16:22, Tom Lane wrote:
> David Osborne writes:
> > Hi, yes I've run "analyse" against the newly restored database. Should that
> > be enough?
>
> My apologies, you did say that further down in the original message.
> It looks like the core of the problem is the poor rowcount estimation
> here:
>
>    ->  Bitmap Index Scan on stock_trans_product_idx  (cost=0.00..31.42 rows=1465 width=0) (actual time=0.009..0.009 rows=0 loops=1)
>          Index Cond: (product_id = 2466420)
>          Buffers: shared hit=3
>
> You might be able to improve that by raising the statistics target
> for stock_trans.product_id.  I'm not sure why you weren't getting
> bitten by the same issue in 9.1; but the cost estimates aren't
> that far apart for the two plans, so maybe you were just lucky ...
>
> 			regards, tom lane

--
David Osborne
Qcode Software Limited
http://www.qcode.co.uk
T: +44 (0)1463 896484
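For completeness: the per-column setting only takes effect once new statistics are gathered, so the working sequence was presumably along these lines (and set statistics -1 would return the column to default_statistics_target):

alter table stock_trans alter column product_id set statistics 1000;
analyse stock_trans;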
RE: PG 9.6 Slow inserts with long-lasting LWLocks
Michael, thanks for your answer. It looks like that is not my case, because the issue is also reproducible for a table with a 100% single writer backend. Also, as mentioned, the whole system gets hung.

Regards,
Pavel Suderevsky

From: MichaelDBA
Sent: Friday, March 16, 2018 16:42
To: Pavel Suderevsky
Cc: [email protected]
Subject: Re: PG 9.6 Slow inserts with long-lasting LWLocks

Sporadic insert slowness could be due to lock delays (locktype=extend) caused by many concurrent connections trying to insert into the same table at the same time. Each insert request may result in an extend lock (8k extension), which blocks other writers. What normally happens is that these extend locks happen so fast that you hardly ever see them in the pg_locks table, except in the case where many concurrent connections are trying to do inserts into the same table. The following query will show if this is the case if you execute it during the time the problem is occurring.

select * from pg_locks where granted = false and locktype = 'extend';

I don't know if this is your particular problem, but perhaps it is.

Regards,
Michael Vitale

Pavel Suderevsky
Thursday, March 15, 2018 6:29 AM

Hi,

Well, unfortunately I still need community help.

-- Environment
OS: CentOS Linux release 7.2.1511
Kernel: 3.10.0-327.36.3.el7.x86_64
PostgreSQL: 9.6.3

-- Hardware
Server: Dell PowerEdge R430
CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Raid controller: PERC H730 Mini (1GB cache)
Disks: 8 x 10K RPM SAS 12Gb/s 2.5" (ST1200MM0088) in RAID 6
RAM: 192GB (M393A2G40DB0-CPB x 16)
For more detailed hardware info please see attached configuration.txt

-- postgresql.conf
max_connections = 2048
shared_buffers = 48GB
temp_buffers = 128MB
work_mem = 256MB
maintenance_work_mem = 512MB
dynamic_shared_memory_type = posix
wal_level = hot_standby
min_wal_size = 4GB
max_wal_size = 32GB
huge_pages = on
+ numactl interleave=all

-- sysctl.conf
kernel.shmmax=64424509440
kernel.shmall=4294967296
kernel.sem = 1024 32767 128 16384
fs.aio-max-nr=3145728
fs.file-max = 6815744
net.core.rmem_default=262144
net.core.rmem_max=4194304
net.core.wmem_default=262144
net.core.wmem_max=1048586
vm.nr_hugepages=33000
vm.dirty_background_bytes=67108864
vm.dirty_bytes=536870912
vm.min_free_kbytes=1048576
zone_reclaim_mode=0

Again: the problem is occasional long inserts that can happen 1-5 times per day on an OLTP system. No autovacuum is performed during the long inserts. The WAL rate is 1-2Gb per hour, and no correlation with this issue has been spotted. The wait event "buffer_mapping" is seen for the affected transactions, but not every time (or maybe it is just not caught every time). I have two suspects for such behaviour: the I/O system and high concurrency. There is a problem with one application that frequently recreates up to 90 sessions, but investigation shows there is no direct correlation between such sessions and the long transactions; at least it is not the root cause of the issue (of course such app behaviour will be fixed).

The investigation, and tracing with strace in particular, showed that:
1. The only long event straced from postgres backends was <... semop resumed>.
2. The whole host seems to get hung during such events.
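Alongside the pg_locks query above, the wait-event columns that 9.6 added to pg_stat_activity are a cheap way to see what the slow inserts are actually waiting on while a stall is happening; a rough sketch, not specific to this setup:

select pid, state, wait_event_type, wait_event,
       now() - xact_start as xact_age,
       left(query, 60) as query
from pg_stat_activity
where wait_event is not null
order by xact_age desc;

The buffer_mapping waits mentioned above should show up there as LWLock waits, and relation-extension waits as ordinary lock waits.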
Example: a Java application located on a separate host reports several long transactions:

123336.943 - [1239588mks]: event.insert-table
123336.943 - [1240827mks]: event.insert-table
123337.019 - [1292534mks]: event.insert-table
143353.542 - [5467657mks]: event.insert-table
143353.543 - [5468884mks]: event.insert-table
152338.763 - [1264588mks]: event.insert-table
152338.765 - [2054887mks]: event.insert-table

Strace output for the event that happened at 14:33, for a particular known pid:

119971 14:33:48.075375 epoll_wait(3,
119971 14:33:48.075696 <... epoll_wait resumed> {{EPOLLIN, {u32=27532016, u64=27532016}}}, 1, -1) = 1 <0.000313>
119971 14:33:48.075792 recvfrom(9,
119971 14:33:48.075866 <... recvfrom resumed> "B\0\0\3\27\0S_21\0\0*\0\1\0\1\0\1\0\0\0\0\0\1\0\1\0\0\0\0\0"..., 8192, 0, NULL, NULL) = 807 <0.66>
119971 14:33:48.076243 semop(26706044, {{8, -1, 0}}, 1
120019 14:33:48.119971 recvfrom(9,
119971 14:33:53.491029 <... semop resumed> ) = 0 <5.414772>
119971 14:33:53.500356 lseek(18, 0, SEEK_END
119971 14:33:53.500436 <... lseek resumed> ) = 107790336 <0.72>
119971 14:33:53.500514 lseek(20, 0, SEEK_END

Checking strace for long semop calls across the whole day:

root@host [20180314 17:47:36]:/home/user$ egrep " <[1-9]." /tmp/strace | grep semop
119991 12:33:36 <... semop resumed> ) = 0 <1.419394>
119942 12:33:36 <... semop resumed> ) = 0 <1.422554>
119930 12:33:36 <... semop resumed> ) = 0 <1.414916>
119988 12:33:36 <... semop resumed> ) = 0 <1.213309>
119966 12:33:36 <... semop resumed> ) = 0 <1.237492>
119958 14:33:53.489398 <... semop resumed> ) = 0 <5.455830>
120019 14:33:53.490613 <... semop resumed> ) = 0 <5.284505>
119997 14:33:53.490638 <... semop resumed> ) = 0 <5.111661>
12 14:33:53.490649 <... semop resumed> ) = 0 <3.521992>
119991
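Since the stalls are sporadic, it can also help to correlate the strace timestamps with the database's own view by sampling pg_stat_activity into a table during the problem window. A rough sketch (the wait_samples table name is made up here):

create table if not exists wait_samples as
  select now() as sample_time, * from pg_stat_activity where false;

-- collect a sample every second or so (e.g. with \watch 1 in psql) while reproducing
insert into wait_samples
  select now(), * from pg_stat_activity where wait_event is not null;

Joining the sample times against the 14:33:48-14:33:53 window from the strace output would show whether the stuck backends were all waiting on the same LWLock or lock type.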
