Re: High COMMIT times

2021-01-11 Thread Craig Jackson
How far apart are the min/max connection settings on your application
connection pool? We had a similar issue with connection storms in the past
on Oracle. One thing we did to minimize the storms was make sure there was
not a wide gap between the min/max, say no more than a 5-10 connection
difference between min/max.

Regards,

Craig

On Sun, Jan 10, 2021 at 5:42 PM Don Seiler  wrote:

> On Sat, Jan 9, 2021 at 2:07 PM Jeff Janes  wrote:
>
>>
>> How are you monitoring the COMMIT times?  What do you generally see in
>> pg_stat_activity.wait_event during the spikes/stalls?
>>
>
> Right now we just observe the COMMIT duration posted in the postgresql log
> (we log anything over 100ms).
>
> One other thing that I shamefully forgot to mention. When we see these
> slow COMMITs in the log, they coincide with a connection storm (Cat 5
> hurricane) from our apps where connections will go from ~200 to ~1200. This
> will probably disgust many, but our PG server's max_connections is set to
> 2000. We have a set of pgbouncers in front of this with a total
> max_db_connections of 1600. I know many of you think this defeats the whole
> purpose of having pgbouncer, and I agree. I've been trying to explain as
> much, and that even with 32 CPUs on this DB host we probably shouldn't
> expect to support more than 100-200 active connections, let alone 1600.
> I'm still pushing to have our app server instances (which also use their
> own JDBC (Hikari) connection pool and *then* go through pgbouncer) lower
> their min/max connection settings; it's sort of counterintuitive at first
> glance, but hopefully everyone sees the bigger picture.
>
> One nagging question I have is if the slow COMMIT is triggering the
> connection storm (e.g. the app sees a slow response or timeout from a current
> connection and fires off a new connection in its place), or vice-versa.
> We're planning to deploy new performant cloud storage (Azure Ultra disk)
> just for WAL logs but I'm hesitant to say it'll be a silver bullet when we
> still have this insane connection management strategy in place.
>
> Curious to know what others think (please pull no punches) and if others
> have been in a similar scenario with anecdotes to share.
>
> Thanks,
> Don.
>
> --
> Don Seiler
> www.seiler.us
>


-- 
Craig





Re: High COMMIT times

2021-01-11 Thread Don Seiler
On Mon, Jan 11, 2021 at 9:06 AM Craig Jackson 
wrote:

> How far apart are the min/max connection settings on your application
> connection pool? We had a similar issue with connection storms in the past
> on Oracle. One thing we did to minimize the storms was make sure there was
> not a wide gap between the min/max, say no more than a 5-10 connection
> difference between min/max.
>

App instances by default are configured for 10 connections in their Hikari
connection pool, with no different setting for min so it's always 10. Some
have begun setting a minimumIdle to 5. I'm pushing them now to lower their
minimumIdle to something like 2, and if they are scaled out to multiple app
instances then they should think about lowering their max from 10 to 5
perhaps.
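
As a concrete sketch of keeping the pool floor and ceiling close together, a
hypothetical HikariCP configuration fragment might look like the following.
The property names (maximumPoolSize, minimumIdle, connectionTimeout,
idleTimeout) are real HikariCP settings, but the values here are purely
illustrative, not a recommendation for this workload:

```properties
# Illustrative HikariCP settings for one app instance.
# Keep the idle floor equal (or close) to the ceiling so a latency blip
# cannot trigger a burst of brand-new connections all at once.
maximumPoolSize=5
minimumIdle=5
# Fail the borrower quickly (ms) rather than letting timeouts cascade
# into more connection attempts.
connectionTimeout=3000
# Retire idle connections slowly (ms) to avoid churn.
idleTimeout=600000
```

Note that the HikariCP documentation itself suggests running a fixed-size pool
(minimumIdle equal to maximumPoolSize), which is the extreme form of the
"small min/max gap" advice above.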

From a recent spike this morning (names have been changed, but other
data is real):

> From this morning's spike. At 06:03:15, service foo had 2 active sessions
> and 127 idle sessions. At 06:03:30 (the next tick in grafana), it had 5
> active sessions but 364 idle sessions. The foo_db DB overall had 9 active
> and 337 idle at 06:03:15, and then 5 active and 788 idle overall in the
> next tick. So a flood of new connections were created within that 15 second
> interval (probably all within the same second) and more or less abandoned.


Don.
-- 
Don Seiler
www.seiler.us


How to deal with analyze gathering irrelevant stats

2021-01-11 Thread Rémi Chatenay
Hi,

Thanks in advance for your help. I'm putting as much context and details as
possible, but let me know if you have any questions.

What?

We are experiencing some slow queries due to the query planner using an
incorrect index. It is using an unoptimized index because the stats are
computed during the night when the data is not the same as during the day.

Context

We have a table conversations like that

|id|status|user_id|

and 2 indexes:

CREATE INDEX index_conversations_on_user_id_and_status ON
public.conversations USING btree (user_id, status);

CREATE INDEX index_conversations_on_status ON public.conversations USING
btree (status)

The slow query is the following:

SELECT id FROM conversations WHERE status = 'in_progress' AND user_id = 123

We expect the query planner to use the
index_conversations_on_user_id_and_status but it sometimes uses the other
one.

What's happening?

There are hundreds of conversations with a status 'in_progress' at a given
time during the day but virtually none during the night.

So when the analyze is run during the night, PG then thinks that using the
index_conversations_on_status will return almost no rows and so it uses
this index instead of the combined one.

When the analyze is run during the day, PG correctly uses the right index
(index_conversations_on_user_id_and_status)

[With an analyze run during the day]

Limit (cost=0.43..8.45 rows=1 width=8) (actual time=1.666..1.666 rows=0 loops=1)
  -> Index Scan using index_conversations_on_user_id_and_status on conversations
       (cost=0.43..8.45 rows=1 width=8) (actual time=1.665..1.665 rows=0 loops=1)
       Index Cond: ((user_id = 123) AND ((status)::text = 'in_progress'::text))
       Filter: (id <> 1)
Planning Time: 8.642 ms
Execution Time: 1.693 ms

[With an analyze run during the night]

Limit (cost=0.43..8.46 rows=1 width=8) (actual time=272.812..272.812 rows=0 loops=1)
  -> Index Scan using index_conversations_on_status on conversations
       (cost=0.43..8.46 rows=1 width=8) (actual time=272.812..272.812 rows=0 loops=1)
       Index Cond: ((status)::text = 'in_progress'::text)
       Filter: ((id <> 1) AND (user_id = 123))
       Rows Removed by Filter: 559
Planning Time: 0.133 ms
Execution Time: 272.886 ms

The question

We currently run a manual weekly vacuum analyze during the night. I'm
wondering what our possible solutions are. One is to manually run the
analyze during the day, but is there a way to tell PG to run the auto
analyze at a given time of the day, for example? I guess we are not the
first ones to have data patterns that differ between when the analyze is
run and the query is run.
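
There is no built-in way to schedule autoanalyze at a particular time of day,
but a daytime ANALYZE is easy to drive externally. A hypothetical crontab
entry (the database name mydb and the 14:00 schedule are placeholders; the
table and column are the ones discussed here):

```
# Run a targeted ANALYZE every day at 14:00, while the daytime data
# distribution of conversations.status is representative.
0 14 * * * psql -d mydb -c "ANALYZE conversations (status);"
```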

Config
Postgres version: 11

Table Metadata

SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts,
relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE
relname='conversations';

    relname    | relpages | reltuples  | relallvisible | relkind | relnatts | relhassubclass | reloptions | pg_table_size
---------------+----------+------------+---------------+---------+----------+----------------+------------+---------------
 conversations |   930265 | 7.3366e+06 |        902732 | r       |       16 | f              |            |    7622991872
Maintenance Setup

We have manual vacuum analyze every week during the night.

GUC Settings
Unsure what's necessary...

"autovacuum_analyze_threshold"  = "50"
"autovacuum_max_workers"= "3",
"autovacuum_naptime"= "60"
"autovacuum_vacuum_threshold"   = "50"


Statistics: n_distinct, MCV, histogram

SELECT (SELECT sum(x) FROM unnest(most_common_freqs) x) frac_MCV,
tablename, attname, inherited, null_frac, n_distinct,
array_length(most_common_vals,1) n_mcv, array_length(histogram_bounds,1)
n_hist, correlation FROM pg_stats WHERE attname='status' AND
tablename='conversations' ORDER BY 1 DESC;

 frac_mcv |   tablename   | attname | inherited | null_frac | n_distinct | n_mcv | n_hist | correlation
----------+---------------+---------+-----------+-----------+------------+-------+--------+-------------
     0.67 | conversations | status  | f         |         0 |          6 |     5 |        |    0.967121


Re: How to deal with analyze gathering irrelevant stats

2021-01-11 Thread Justin Pryzby
On Mon, Jan 11, 2021 at 04:50:12PM +0100, Rémi Chatenay wrote:
> We are experiencing some slow queries due to the query planner using an
> incorrect index. It is using an unoptimized index because the stats are
> computed during the night when the data is not the same as during the day.
> 
> CREATE INDEX index_conversations_on_user_id_and_status ON
> public.conversations USING btree (user_id, status);
> 
> CREATE INDEX index_conversations_on_status ON public.conversations USING
> btree (status)
> 
> The slow query is the following:
> 
> SELECT id FROM conversations WHERE status = 'in_progress' AND user_id = 123
> 
> There are hundreds of conversations with a status 'in_progress' at a given
> time during the day but virtually none during the night.
> 
> So when the analyze is run during the night, PG then thinks that using the
> index_conversations_on_status will return almost no rows and so it uses
> this index instead of the combined one.
> 
> When the analyze is run during the day, PG correctly uses the right index
> (index_conversations_on_user_id_and_status)

> We currently run a manual weekly vacuum analyze during the night. I'm
> wondering what are our possible solutions. One is to manually run the
> analyze during the day but is there a way to tell PG to run the auto
> analyze at a given time of the day, for example? I guess we are not the
> first ones to have data patterns that differ between when the analyze is
> run and the query is run.

I think you could run manual ANALYZE during the day just for this one column:
  ANALYZE conversations (status);

If it takes too long or causes a performance issue, you could do:
  SET default_statistics_target=10;
  ANALYZE conversations (status);

You could also make autovacuum do this on its own, by setting:
 ALTER TABLE conversations SET (autovacuum_analyze_scale_factor=0.005);

If that works but is too slow, then maybe ALTER TABLE .. SET STATISTICS 10.

-- 
Justin




Re: How to deal with analyze gathering irrelevant stats

2021-01-11 Thread Adam Brusselback
I'd personally bake an analyze call on that table (column) into whatever
job is responsible for changing the state of the table that much, if it's
possible to do it as a last step.
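
A sketch of that idea, assuming the nightly job's state change can be
expressed in SQL (the UPDATE here is only a stand-in for whatever the job
actually does):

```sql
-- Hypothetical final steps of the nightly batch job.
UPDATE conversations
   SET status = 'archived'          -- placeholder for the real state change
 WHERE status = 'in_progress';

-- Refresh statistics for just the skewed column immediately afterwards,
-- so the planner sees the post-job distribution before daytime traffic.
ANALYZE conversations (status);
```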


Re: How to deal with analyze gathering irrelevant stats

2021-01-11 Thread Michael Lewis
What is the usage pattern of the conversations table? Is it getting many
inserts during the day, or mostly updates of status?

Why have an index on the status column at all? My guess would be that there
are 2-10 statuses, but many many rows in the table for most of those
statuses. Having a low cardinality index that changes frequently seems
prone to mis-use by the system.


Re: How to deal with analyze gathering irrelevant stats

2021-01-11 Thread Rémi Chatenay
I'd say it's a 1 insert for 5 - 10 updates.

As for the index on the status, it's because we have a job that runs every
night that deals with conversations in specific statuses. "Having a low
cardinality index that changes frequently seems prone to mis-use by the
system." -> What would be an alternative?

On Mon, Jan 11, 2021 at 5:48 PM Michael Lewis  wrote:

> What is the usage pattern of the conversations table? Is it getting many
> inserts during the day, or mostly updates of status?
>
> Why have an index on the status column at all? My guess would be that
> there are 2-10 statuses, but many many rows in the table for most of those
> statuses. Having a low cardinality index that changes frequently seems
> prone to mis-use by the system.
>


Re: How to deal with analyze gathering irrelevant stats

2021-01-11 Thread Michael Lewis
On Mon, Jan 11, 2021 at 9:52 AM Rémi Chatenay 
wrote:

> I'd say it's a 1 insert for 5 - 10 updates.
>
> As for the index on the status, it's because we have a job that runs every
> night that deals with conversations in specific statuses. Having a low
> cardinality index that changes frequently seems prone to mis-use by the
> system.  -> What would be an alternative ?
>

One option would be a partial index on another field used in that query,
with WHERE status IN ( list_of_uncommon_statuses_queried_nightly ).

Sequential scan may be perfectly fine for a nightly script though.
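
A sketch of that partial index, assuming the nightly job also filters on a
column such as created_at; the index name, the column, and the status values
here are all placeholders for the real ones:

```sql
-- Hypothetical: index only the rare statuses the nightly job looks for.
-- The predicate keeps the index tiny and keeps daytime 'in_progress'
-- lookups from ever being tempted to use it.
CREATE INDEX index_conversations_nightly
    ON public.conversations USING btree (created_at)
 WHERE status IN ('pending_review', 'failed');
```

Once such a partial index covers the nightly job, the plain
index_conversations_on_status could be dropped, removing the planner's
opportunity to mis-pick it for daytime queries.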