much slower query in production

2020-02-26 Thread Guillaume Cottenceau
Dear all,

I am facing a much, much slower query in production than on my
development computer using a restored production backup, and I
don't understand why, nor do I see what I could do to speed up
the query on production :/

Short data model explanation: one table stores tickets, one table
stores multicards; tickets can belong to a multicard through a
many-to-one relationship, or be independent from multicards. The
query I am interested in will be used to populate a new column in
the multicards table, containing the count of related tickets.

On production:

# EXPLAIN ANALYZE SELECT transaction_uid, (SELECT COUNT(*) FROM tickets WHERE multicard_uid = multicards.uid) from multicards;

                                 QUERY PLAN
------------------------------------------------------------------------------
 Seq Scan on multicards  (cost=0.00..1455177.30 rows=204548 width=12) (actual time=0.178..1694987.355 rows=204548 loops=1)
   SubPlan 1
     ->  Aggregate  (cost=7.07..7.08 rows=1 width=8) (actual time=8.283..8.283 rows=1 loops=204548)
           ->  Index Only Scan using tickets_multicard_uid on tickets  (cost=0.43..7.05 rows=9 width=0) (actual time=1.350..8.280 rows=6 loops=204548)
                 Index Cond: (multicard_uid = multicards.uid)
                 Heap Fetches: 1174940
 Planning Time: 1.220 ms
 Execution Time: 1695029.673 ms

vmstat in the middle of this execution:

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff   cache  si   so    bi    bo   in   cs us sy id wa
 0  1 479548 143652  79412 5609292   0    0  1986   116  999 1115  2  2 71 25
 3  1 479548 134312  79412 5614564   0    0  2386   128 2832 3973 20  2 49 28
 0  1 479548 142520  79428 5617708   0    0  1584   232 3440 3716 11  3 58 29
 0  1 479548 161184  79024 5597756   0    0  1922   144 1249 1562  1  2 70 27
 0  1 479548 161244  79048 5600804   0    0  1556   117 2138 3035  6  2 68 25
 0  2 479548 158384  79048 5604008   0    0  1388   402 2970 4320  6  2 66 27

On my development computer:

# EXPLAIN ANALYZE SELECT transaction_uid, (SELECT COUNT(*) FROM tickets WHERE multicard_uid = multicards.uid) from multicards;

                                 QUERY PLAN
------------------------------------------------------------------------------
 Seq Scan on multicards  (cost=0.00..538974.69 rows=204548 width=12) (actual time=0.055..451.691 rows=204548 loops=1)
   SubPlan 1
     ->  Aggregate  (cost=2.57..2.58 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=204548)
           ->  Index Only Scan using tickets_multicard_uid on tickets  (cost=0.43..2.56 rows=7 width=0) (actual time=0.001..0.001 rows=6 loops=204548)
                 Index Cond: (multicard_uid = multicards.uid)
                 Heap Fetches: 0
 Planning Time: 0.296 ms
 Execution Time: 456.677 ms

The execution time ratio is a huge 3700. I guess the Heap Fetches
difference is the most meaningful here; my understanding is that
the index should easily fit in shared_buffers after a few
subselects, as shared_buffers is configured at 2GB and here's the
index's on-disk size:

# SELECT relname as index, reltuples as "rows estimate",
         pg_size_pretty(pg_table_size(quote_ident(relname))) as "on disk size"
    FROM pg_class, pg_namespace
   WHERE pg_namespace.oid = pg_class.relnamespace
     AND relkind = 'i' AND nspname = 'public' AND relname = 'tickets_multicard_uid';
         index         | rows estimate | on disk size
-----------------------+---------------+--------------
 tickets_multicard_uid |    7.2136e+06 | 161 MB

Though it's not entirely clear to me what "heap fetches" are. It
seems they might be fetches of the actual table data (i.e. not
index fetches), but I don't really know why so many of them are
needed here, nor how to reduce that (if that's the explanation
for the longer timing).

It's true that production has constant activity, but not so much
of it: load average is typically about 0.5, most queries run very
quickly, and pg_stat_activity frequently reports no currently
running query. Here's vmstat during normal production activity,
for reference:

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff   cache  si   so    bi    bo   in   cs us sy id wa
 0  0 480712 151112  10144 566478411   3106900  9  1 88  3
 0  0 480712 151312  10152 566487600 0   248 4911 7072  9  3 87  0
 0  0 480720 165488  10072 56425640476   210 2019 2685  6  1 92  1
 0  0 480720 165332  

Re: much slower query in production

2020-02-26 Thread Justin Pryzby
On Wed, Feb 26, 2020 at 05:17:21PM +0100, Guillaume Cottenceau wrote:
> On production:
> 
> # EXPLAIN ANALYZE SELECT transaction_uid, (SELECT COUNT(*) FROM tickets WHERE multicard_uid = multicards.uid) from multicards;
> 
>                                  QUERY PLAN
> ------------------------------------------------------------------------------
>  Seq Scan on multicards  (cost=0.00..1455177.30 rows=204548 width=12) (actual time=0.178..1694987.355 rows=204548 loops=1)
>    SubPlan 1
>      ->  Aggregate  (cost=7.07..7.08 rows=1 width=8) (actual time=8.283..8.283 rows=1 loops=204548)
>            ->  Index Only Scan using tickets_multicard_uid on tickets  (cost=0.43..7.05 rows=9 width=0) (actual time=1.350..8.280 rows=6 loops=204548)
>                  Index Cond: (multicard_uid = multicards.uid)
>                  Heap Fetches: 1174940
>  Planning Time: 1.220 ms
>  Execution Time: 1695029.673 ms

> The execution time ratio is a huge 3700. I guess the Heap Fetches
> difference is the most meaningful here;

Yes, it's doing an "index only" scan, but not very effectively.
Vacuum the tickets table to set relallvisible and see if that helps.

If so, try to keep it better vacuumed with something like
ALTER TABLE tickets SET (AUTOVACUUM_VACUUM_SCALE_FACTOR=0.005);
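
To see beforehand whether the visibility map is the culprit, something
like the following should work (the table name matches the thread; treat
this as an illustrative sketch, since relallvisible is only an estimate
updated by vacuum):

```sql
-- Pages marked all-visible vs. total pages: a low percentage means
-- index-only scans must fall back to heap fetches for visibility checks.
SELECT relname, relpages, relallvisible,
       round(100.0 * relallvisible / greatest(relpages, 1), 1) AS pct_all_visible
  FROM pg_class
 WHERE relname = 'tickets';

-- When was the table last vacuumed, and how many dead tuples remain?
SELECT relname, last_vacuum, last_autovacuum, n_dead_tup
  FROM pg_stat_user_tables
 WHERE relname = 'tickets';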

-- 
Justin




Re: much slower query in production

2020-02-26 Thread Michael Lewis
By the way, I expect the time is cut in half while heap fetches stay
similar because the index is now in the OS cache on the second run and
didn't need to be fetched from disk. Definitely need to check on vacuuming
as Justin says. If you have a fairly active system, you would need to run
this query many times in order to push other stuff out of shared_buffers
and get this query to perform more like it does on dev.
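
To verify what is actually resident in shared_buffers, the pg_buffercache
contrib extension can report per-relation buffer counts; a sketch, assuming
the extension can be installed (it omits the reldatabase filter for brevity):

```sql
-- Requires the pg_buffercache contrib module.
CREATE EXTENSION IF NOT EXISTS pg_buffercache;

-- Top relations by number of 8kB buffers currently in shared_buffers.
SELECT c.relname, count(*) AS buffers
  FROM pg_buffercache b
  JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
 GROUP BY c.relname
 ORDER BY buffers DESC
 LIMIT 10;
```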

Do you have the option to re-write the query or is this generated by an
ORM? You are forcing the looping as I read this query. If you aggregate
before you join, then the system should be able to do a single scan of the
index, aggregate, then join those relatively few rows to the multicards
table records.

SELECT transaction_uid, COALESCE( sub.count, 0 ) AS count
  FROM multicards
  LEFT JOIN ( SELECT multicard_uid, COUNT(*) AS count
                FROM tickets
               GROUP BY multicard_uid ) AS sub
    ON sub.multicard_uid = multicards.uid;


Re: much slower query in production

2020-02-26 Thread Guillaume Cottenceau
Justin Pryzby  writes:

> On Wed, Feb 26, 2020 at 05:17:21PM +0100, Guillaume Cottenceau wrote:
>> On production:
>> 
>> # EXPLAIN ANALYZE SELECT transaction_uid, (SELECT COUNT(*) FROM tickets WHERE multicard_uid = multicards.uid) from multicards;
>> 
>>                                  QUERY PLAN
>> ------------------------------------------------------------------------------
>>  Seq Scan on multicards  (cost=0.00..1455177.30 rows=204548 width=12) (actual time=0.178..1694987.355 rows=204548 loops=1)
>>    SubPlan 1
>>      ->  Aggregate  (cost=7.07..7.08 rows=1 width=8) (actual time=8.283..8.283 rows=1 loops=204548)
>>            ->  Index Only Scan using tickets_multicard_uid on tickets  (cost=0.43..7.05 rows=9 width=0) (actual time=1.350..8.280 rows=6 loops=204548)
>>                  Index Cond: (multicard_uid = multicards.uid)
>>                  Heap Fetches: 1174940
>>  Planning Time: 1.220 ms
>>  Execution Time: 1695029.673 ms
>
>> The execution time ratio is a huge 3700. I guess the Heap Fetches
>> difference is the most meaningful here;
>
> Yes, it's doing an "index only" scan, but not very effectively.
> Vacuum the tickets table to set relallvisible and see if that helps.
>
> If so, try to keep it better vacuumed with something like
> ALTER TABLE tickets SET (AUTOVACUUM_VACUUM_SCALE_FACTOR=0.005);

Thanks for your reply! The effect is huge:

# vacuum analyze tickets;
VACUUM
Time: 182850.756 ms (03:02.851)

# EXPLAIN ANALYZE SELECT transaction_uid, (SELECT COUNT(*) FROM tickets WHERE multicard_uid = multicards.uid) from multicards;

                                 QUERY PLAN
------------------------------------------------------------------------------
 Seq Scan on multicards  (cost=0.00..947739.65 rows=204548 width=12) (actual time=15.579..5210.869 rows=204548 loops=1)
   SubPlan 1
     ->  Aggregate  (cost=4.59..4.60 rows=1 width=8) (actual time=0.025..0.025 rows=1 loops=204548)
           ->  Index Only Scan using tickets_multicard_uid on tickets  (cost=0.43..4.57 rows=8 width=0) (actual time=0.022..0.024 rows=6 loops=204548)
                 Index Cond: (multicard_uid = multicards.uid)
                 Heap Fetches: 8
 Planning Time: 71.469 ms
 Execution Time: 5223.408 ms
(8 rows)

Time: 5332.361 ms (00:05.332)

(and subsequent executions are below 1 second)

It is actually consistent with using a restored backup on the dev
computer, as my understanding is that it comes out without any
garbage, like a perfectly vacuumed database. Btw, do you have any
hint as to how to perform timings using production data that are
consistent with production? Backup/restore is maybe not the way
to go, but rather a block-device-level copy?

Since PostgreSQL 8, I have to say I rely entirely on autovacuum,
and had not noticed it could really run too infrequently for the
workload and create such a difference. I see in the documentation
a default autovacuum_vacuum_scale_factor = 0.2; is that something
that is typically lowered globally, e.g. on a fairly active
system? I am worried that changing that configuration to 0.005
for that table would fix this query and similar ones, but later I
might face the same situation on other tables. Or how would you
choose which tables get a lowered value?

Thanks!

-- 
Guillaume Cottenceau




Re: much slower query in production

2020-02-26 Thread Guillaume Cottenceau
Michael Lewis  writes:

> By the way, I expect the time is cut in half while heap fetches stays similar 
> because the index is now in OS cache on the
> second run and didn't need to be fetched from disk. Definitely need to check 
> on vacuuming as Justin says. If you have a fairly
> active system, you would need to run this query many times in order to push 
> other stuff out of shared_buffers and get this
> query to perform more like it does on dev.
>
> Do you have the option to re-write the query or is this generated by an ORM? 
> You are forcing the looping as I read this query.
> If you aggregate before you join, then the system should be able to do a 
> single scan of the index, aggregate, then join those
> relatively few rows to the multicards table records.
>
> SELECT transaction_uid, COALESCE( sub.count, 0 ) AS count FROM multicards 
> LEFT JOIN (SELECT multicard_uid, COUNT(*) AS count
> FROM tickets GROUP BY multicard_uid ) AS sub ON sub.multicard_uid = 
> multicards.uid;

Thanks for this hint! I keep hitting the fact that I never think
to write queries using explicit joins :/

Execution time (before vacuuming the table as advised by Justin)
is down 38x, to 44509ms, using this query :)

The real query was an UPDATE of the multicards table to set the
count value. I rewrote it using your approach, but I think I'm
missing what COALESCE did in your query; this version obviously
updates only the rows where count >= 1:

UPDATE multicards
   SET defacements = count
  FROM ( SELECT multicard_uid, COUNT(*) AS count FROM tickets GROUP BY multicard_uid ) AS sub
 WHERE uid = multicard_uid;

Any hint for doing that in one pass? I could do a first pass
setting defacements = 0, but that would produce more garbage :/

Thanks!

-- 
Guillaume Cottenceau




Re: much slower query in production

2020-02-26 Thread Michael Lewis
>
> UPDATE multicards
>    SET defacements = COALESCE( count, 0 )
>   FROM ( SELECT multicard_uid, COUNT(*) AS count FROM tickets GROUP BY multicard_uid ) AS sub
>  WHERE uid = multicard_uid OR multicard_uid is null;
>

I expect this should work. Not sure of performance of course.


Re: much slower query in production

2020-02-26 Thread [email protected]
Vacuum everything that you restored

Sent from my iPhone

> On Feb 26, 2020, at 1:19 PM, Michael Lewis  wrote:
> 
> 
>> UPDATE multicards
>>    SET defacements = COALESCE( count, 0 )
>>   FROM ( SELECT multicard_uid, COUNT(*) AS count FROM tickets GROUP BY multicard_uid ) AS sub
>>  WHERE uid = multicard_uid OR multicard_uid is null;
> 
> I expect this should work. Not sure of performance of course.


Re: much slower query in production

2020-02-26 Thread Guillaume Cottenceau
Michael Lewis  writes:

>  UPDATE multicards
>     SET defacements = COALESCE( count, 0 )
>    FROM ( SELECT multicard_uid, COUNT(*) AS count FROM tickets GROUP BY multicard_uid ) AS sub
>   WHERE uid = multicard_uid OR multicard_uid is null;
>
> I expect this should work. Not sure of performance of course.

This looked great but, as it seems you suspected, it's very slow :/
I interrupted it after a 5-minute run on my dev computer.

-- 
Guillaume Cottenceau




Re: much slower query in production

2020-02-26 Thread Jeff Janes
On Wed, Feb 26, 2020 at 11:17 AM Guillaume Cottenceau  wrote:

> Dear all,
>
> I am facing a much, much slower query in production than on my
> development computer using a restored production backup, and I
> don't understand why nor I see what I could do to speedup the
> query on production :/
>

You've already seen a VACUUM fixed it.

This must have been a logical restore (from pg_dump), not a physical
restore (from pg_basebackup for example) correct?

A physical restore should have resulted in a database in the same state of
vacuuming as its source was.  A logical restore will not, and since it is
an append-only process it will not trigger autovacuum to run, either.

If you do a logical restore, probably the first thing you should
do afterwards is a VACUUM ANALYZE.

Cheers,

Jeff


Re: much slower query in production

2020-02-26 Thread Jeff Janes
On Wed, Feb 26, 2020 at 1:02 PM Guillaume Cottenceau  wrote:

>
> It is actually consistent with using a restored backup on the dev
> computer, as my understanding is this comes out without any
> garbage and like a perfectly vacuumed database.


I think I got that backwards in my previous email.  It is the dev that was
restored, not the prod?  But unless you went out of your way to vacuum dev,
it would not be perfectly vacuumed.  If it were a logical restore, it would
be perfectly unvacuumed, and if a physical restore, it would be in the same
state of vacuuming as the database it was cloned from.


> Btw do you have
> any hint as to how to perform timings using production data which
> are consistent with production? Backup/restore is maybe not the
> way to go, but rather a block device level copy?
>

block device copy seems like overkill, just using pg_basebackup should be
good enough.
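
For reference, a minimal pg_basebackup invocation for taking a physical
copy of the cluster (hostname, user, and target directory below are
placeholders, and a replication-capable role is assumed):

```shell
# Physical copy of the data files, preserving the visibility-map and
# vacuum state, streaming the WAL needed for a consistent snapshot.
pg_basebackup -h prod-db.example.com -U replicator \
  -D /var/lib/postgresql/clone \
  --wal-method=stream --checkpoint=fast --progress
```

Unlike a pg_dump/restore, this copies the on-disk state, so the clone
starts with the same relallvisible and statistics as production.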


>
> Since postgresql 8, I have to say I rely entirely on autovacuum,
> and did not notice it could really run too infrequently for the
> work and create such difference. I see in documentation a default
> autovacuum_vacuum_scale_factor = 0.2, is that something that is
> typically lowered globally, e.g. maybe on a fairly active system?
> I am worried that changing that configuration for that table to
> 0.005 would fix this query and similar ones, but later I might
> face the same situation on other tables. Or how would you elect
> tables for a lowered value configuration?
>

The autovacuum system has never been redesigned with the needs of
index-only scans in mind.  If I have a table for which index-only scans are
important, I'd set autovacuum_vacuum_scale_factor = 0 and
set autovacuum_vacuum_threshold to about 5% of the number of blocks in the
table.  There is no syntax to say "5% of the number of blocks in the table",
so you have to compute it yourself and hardcode the result, which makes it
unsuitable as a global setting.  And this still only addresses UPDATE and
DELETE operations, not INSERTs.  If you have an INSERT-only or
INSERT-mostly table for which index-only scans are important, you might
need to set up cron jobs to do the vacuuming.
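
The "5% of the number of blocks" figure can be computed from pg_class; a
hedged sketch that generates the per-table ALTER statements (the 5% ratio,
the floor of 50, and the table list are assumptions to adapt):

```sql
-- Generate per-table settings: threshold = 5% of current pages.
-- Run the emitted statements manually; relpages drifts over time,
-- so recompute and re-apply periodically.
SELECT format(
         'ALTER TABLE %I SET (autovacuum_vacuum_scale_factor = 0, '
         'autovacuum_vacuum_threshold = %s);',
         relname, greatest(relpages / 20, 50))
  FROM pg_class
 WHERE relname IN ('tickets');  -- tables where index-only scans matter
```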

Cheers,

Jeff