Re: Advice needed: query performance deteriorates by 2000% within 1 minute

2022-03-01 Thread Michael Lewis
If you expect high cache hit rates and/or have SSD or similarly fast
storage, random_page_cost should be more like 1-2 rather than the default
of 4. When using jsonb, you'd normally have estimates based solely on the
constants for the associated datatype (1/3 or 2/3 for a nullable boolean,
for instance, and I think half a percent for an int column), but because you
are using an index on a function, you should be getting custom stats
related to that. They just don't seem to be helping you a ton.

With gin indexes, there is also the pending list to consider. I haven't had
to deal with that much, but just know of it from the documentation.
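
As a rough sketch of both ideas (the index name and values here are placeholders, not taken from the thread):

-- lower random_page_cost for SSD / high cache-hit environments
ALTER SYSTEM SET random_page_cost = 1.5;
SELECT pg_reload_conf();

-- GIN pending list: flush it manually, or turn fastupdate off so new entries
-- go straight into the main index structure
SELECT gin_clean_pending_list('my_gin_index');
ALTER INDEX my_gin_index SET (fastupdate = off);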


Re: Advice needed: query performance deteriorates by 2000% within 1 minute

2022-03-01 Thread Michael Lewis
And I would absolutely crank up autovacuum and analyze settings. Turn up
the cost limits, turn down the cost delays, decrease the scale factor.
Whatever you need to do such that autovacuum runs often. No need to
schedule a manual vacuum at all. Just don't wait until 20% of the table is
dead before an autovacuum is triggered, which is the default behavior. In
many cases the cost of gathering new stats and doing garbage collection is
minimal compared to the benefit to the queries that rely on the data.
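
A sketch of what that might look like as per-table storage parameters (table name and values are illustrative only):

ALTER TABLE my_big_table SET (
    autovacuum_vacuum_scale_factor = 0.01,   -- vacuum at ~1% dead rows instead of 20%
    autovacuum_analyze_scale_factor = 0.005, -- refresh stats more often as well
    autovacuum_vacuum_cost_limit = 2000,     -- allow more work per cost cycle
    autovacuum_vacuum_cost_delay = 2         -- sleep less (ms) between cost cycles
);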


Re: constraint exclusion with ineq condition (Re: server hardware tuning.)

2019-02-14 Thread Michael Lewis
What are these two tables partitioned by?

On Thu, Feb 14, 2019, 5:03 AM suganthi Sekar wrote:

> Hi,
>
> Thanks, I know it works if we give the condition explicitly in the where
> clause.
>
> I thought that with the below parameter in PostgreSQL 11 this issue is fixed?
>
> enable_partitionwise_join to 'on';
>
> What is the use of enable_partitionwise_join to 'on'?
>
> Thanks for your response.
>
> Regards
> Suganthi Sekar
> --
> From: Justin Pryzby
> Sent: 14 February 2019 16:10:01
> To: suganthi Sekar
> Cc: [email protected]
> Subject: Re: constraint exclusion with ineq condition (Re: server
> hardware tuning.)
>
> On Thu, Feb 14, 2019 at 10:38:36AM +, suganthi Sekar wrote:
> > u mean the below parameter need to set on . its already on only.
> >   alter system set  constraint_exclusion  to 'on';
>
> No, I said:
> > You can work around it by specifying the same condition on
> b.call_created_date:
> > >  AND b.call_created_date >='2017-11-01' AND
> b.call_created_date<'2017-11-30'
>


Re: autovacuum big table taking hours and sometimes seconds

2019-02-14 Thread Michael Lewis
It is curious to me that the tuples remaining count varies so wildly. Is
this expected?


*Michael Lewis*

On Thu, Feb 14, 2019 at 9:09 AM Mariel Cherkassky <
[email protected]> wrote:

> I checked in the logs when the autovacuum vacuum my big toasted table
> during the week and I wanted to confirm with you what I think :
> postgresql-Fri.log:2019-02-08 05:05:53 EST  24776  LOG:  automatic vacuum
> of table "myDB.pg_toast.pg_toast_1958391": index scans: 8
> postgresql-Fri.log- pages: 2253 removed, 13737828 remain
> postgresql-Fri.log- tuples: 21759258 removed, 27324090 remain
> postgresql-Fri.log- buffer usage: 15031267 hits, 21081633 misses,
> 19274530 dirtied
> postgresql-Fri.log- avg read rate: 2.700 MiB/s, avg write rate: 2.469
> MiB/s
> --
> postgresql-Mon.log:2019-02-11 01:11:46 EST  8426  LOG:  automatic vacuum
> of table "myDB.pg_toast.pg_toast_1958391": index scans: 23
> postgresql-Mon.log- pages: 0 removed, 23176876 remain
> postgresql-Mon.log- tuples: 62269200 removed, 82958 remain
> postgresql-Mon.log- buffer usage: 28290538 hits, 46323736 misses,
> 38950869 dirtied
> postgresql-Mon.log- avg read rate: 2.850 MiB/s, avg write rate: 2.396
> MiB/s
> --
> postgresql-Mon.log:2019-02-11 21:43:19 EST  24323  LOG:  automatic vacuum
> of table "myDB.pg_toast.pg_toast_1958391": index scans: 1
> postgresql-Mon.log- pages: 0 removed, 23176876 remain
> postgresql-Mon.log- tuples: 114573 removed, 57785 remain
> postgresql-Mon.log- buffer usage: 15877931 hits, 15972119 misses,
> 15626466 dirtied
> postgresql-Mon.log- avg read rate: 2.525 MiB/s, avg write rate: 2.470
> MiB/s
> --
> postgresql-Sat.log:2019-02-09 04:54:50 EST  1793  LOG:  automatic vacuum
> of table "myDB.pg_toast.pg_toast_1958391": index scans: 13
> postgresql-Sat.log- pages: 0 removed, 13737828 remain
> postgresql-Sat.log- tuples: 34457593 removed, 15871942 remain
> postgresql-Sat.log- buffer usage: 15552642 hits, 26130334 misses,
> 22473776 dirtied
> postgresql-Sat.log- avg read rate: 2.802 MiB/s, avg write rate: 2.410
> MiB/s
> --
> postgresql-Thu.log:2019-02-07 12:08:50 EST  29630  LOG:  automatic vacuum
> of table "myDB.pg_toast.pg_toast_1958391": index scans: 13
> postgresql-Thu.log- pages: 0 removed, 10290976 remain
> postgresql-Thu.log- tuples: 35357057 removed, 3436237 remain
> postgresql-Thu.log- buffer usage: 11854053 hits, 21346342 misses,
> 19232835 dirtied
> postgresql-Thu.log- avg read rate: 2.705 MiB/s, avg write rate: 2.437
> MiB/s
> --
> postgresql-Tue.log:2019-02-12 20:54:44 EST  21464  LOG:  automatic vacuum
> of table "myDB.pg_toast.pg_toast_1958391": index scans: 10
> postgresql-Tue.log- pages: 0 removed, 23176876 remain
> postgresql-Tue.log- tuples: 26011446 removed, 49426774 remain
> postgresql-Tue.log- buffer usage: 21863057 hits, 28668178 misses,
> 25472137 dirtied
> postgresql-Tue.log- avg read rate: 2.684 MiB/s, avg write rate: 2.385
> MiB/s
> --
>
>
> Lets focus for example on one of the outputs :
> postgresql-Fri.log:2019-02-08 05:05:53 EST  24776  LOG:  automatic vacuum
> of table "myDB.pg_toast.pg_toast_1958391": index scans: 8
> postgresql-Fri.log- pages: 2253 removed, 13737828 remain
> postgresql-Fri.log- tuples: 21759258 removed, 27324090 remain
> postgresql-Fri.log- buffer usage: 15031267 hits, 21081633 misses,
> 19274530 dirtied
> postgresql-Fri.log- avg read rate: 2.700 MiB/s, avg write rate: 2.469
> MiB/s
>
> The cost_limit is set to 200 (default) and the cost_delay is set to 20ms.
> The calculation I did: (1*15031267 + 10*21081633 + 20*19274530) / 200 * 20/1000
> = 61133.8197 seconds ~ 17H
> So autovacuum was sleeping for 17h? I think that I should increase the
> cost_limit to max specifically on the toasted table. What do you think? Am
> I wrong here?
>
>
> On Thu, Feb 7, 2019 at 18:26 Jeff Janes <[email protected]> wrote:
>
>> On Thu, Feb 7, 2019 at 6:55 AM Mariel Cherkassky <
>> [email protected]> wrote:
>>
>> I have 3 questions :
>>> 1)To what value do you recommend to increase the vacuum cost_limit ?
>>> 2000 seems reasonable ? Or maybe its better to leave it as default and
>>> assign a specific value for big tables ?
>>>
>>
>> That depends on your IO hardware, and your workload.  You wouldn't want
>> background vacuum to use so much of your available IO that it starves your
>> other processes.
>>
>>
>>
>>> 2)When the autovacuum reaches the cost_limit while trying to vacuum a
>>> specific table, it wait nap_time seconds and then it continue to work on
>>> the same table ?
>>>

Re: constraint exclusion with ineq condition (Re: server hardware tuning.)

2019-02-14 Thread Michael Lewis
Yeah, the planner doesn't know that call_created_date can be limited on
both tables unless you specify it in the where condition, as Laurenz
said on another thread.


*Michael Lewis*

On Thu, Feb 14, 2019 at 7:35 AM suganthi Sekar 
wrote:

> Both tables are partitioned by the same column, call_created_date
> ____
> From: Michael Lewis 
> Sent: 14 February 2019 19:35:48
> To: suganthi Sekar
> Cc: Justin Pryzby; [email protected]
> Subject: Re: constraint exclusion with ineq condition (Re: server hardware
> tuning.)
>
> What are these two tables partitioned by?
>
> On Thu, Feb 14, 2019, 5:03 AM suganthi Sekar <[email protected]> wrote:
> Hi,
>
> Thanks, I know it works if we give the condition explicitly in the where
> clause.
>
> I thought that with the below parameter in PostgreSQL 11 this issue is fixed?
>
> enable_partitionwise_join to 'on';
>
> What is the use of enable_partitionwise_join to 'on'?
>
> Thanks for your response.
>
> Regards
> Suganthi Sekar
> 
> From: Justin Pryzby <[email protected]>
> Sent: 14 February 2019 16:10:01
> To: suganthi Sekar
> Cc: [email protected]
> Subject: Re: constraint exclusion with ineq condition (Re: server hardware
> tuning.)
>
> On Thu, Feb 14, 2019 at 10:38:36AM +, suganthi Sekar wrote:
> > u mean the below parameter need to set on . its already on only.
> >   alter system set  constraint_exclusion  to 'on';
>
> No, I said:
> > You can work around it by specifying the same condition on
> b.call_created_date:
> > >  AND b.call_created_date >='2017-11-01' AND
> b.call_created_date<'2017-11-30'
>


Re: autovacuum big table taking hours and sometimes seconds

2019-02-14 Thread Michael Lewis
Thanks, that context is very enlightening. Do you manually vacuum after
doing the big purge of old session data? Is bloat causing issues for you?
Why is it a concern that autovacuum's behavior varies?


*Michael Lewis*

On Thu, Feb 14, 2019 at 12:41 PM Mariel Cherkassky <
[email protected]> wrote:

> Maybe by explaining the tables purpose it will be cleaner. The original
> table contains rows for sessions in my app. Every session saves for itself
> some raw data which is saved in the toasted table. We clean old sessions
> (3+ days) every night. During the day sessions are created so the size of
> the table should grow during the day and freed in the night after the
> autovacuum run. However, the autovacuum sleeps for a lot of time and during
> that time more sessions are created so maybe this can explain the big size
> ? Do you think that by increasing the cost limit and decreasing the cost
> delay I can solve the issue ?
>
> On Thu, Feb 14, 2019, 8:38 PM Michael Lewis 
>> It is curious to me that the tuples remaining count varies so wildly. Is
>> this expected?
>>
>>
>> *Michael Lewis*
>>
>> On Thu, Feb 14, 2019 at 9:09 AM Mariel Cherkassky <
>> [email protected]> wrote:
>>
>>> I checked in the logs when the autovacuum vacuum my big toasted table
>>> during the week and I wanted to confirm with you what I think :
>>> postgresql-Fri.log:2019-02-08 05:05:53 EST  24776  LOG:  automatic
>>> vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 8
>>> postgresql-Fri.log- pages: 2253 removed, 13737828 remain
>>> postgresql-Fri.log- tuples: 21759258 removed, 27324090 remain
>>> postgresql-Fri.log- buffer usage: 15031267 hits, 21081633 misses,
>>> 19274530 dirtied
>>> postgresql-Fri.log- avg read rate: 2.700 MiB/s, avg write rate: 2.469
>>> MiB/s
>>> --
>>> postgresql-Mon.log:2019-02-11 01:11:46 EST  8426  LOG:  automatic vacuum
>>> of table "myDB.pg_toast.pg_toast_1958391": index scans: 23
>>> postgresql-Mon.log- pages: 0 removed, 23176876 remain
>>> postgresql-Mon.log- tuples: 62269200 removed, 82958 remain
>>> postgresql-Mon.log- buffer usage: 28290538 hits, 46323736 misses,
>>> 38950869 dirtied
>>> postgresql-Mon.log- avg read rate: 2.850 MiB/s, avg write rate: 2.396
>>> MiB/s
>>> --
>>> postgresql-Mon.log:2019-02-11 21:43:19 EST  24323  LOG:  automatic
>>> vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 1
>>> postgresql-Mon.log- pages: 0 removed, 23176876 remain
>>> postgresql-Mon.log- tuples: 114573 removed, 57785 remain
>>> postgresql-Mon.log- buffer usage: 15877931 hits, 15972119 misses,
>>> 15626466 dirtied
>>> postgresql-Mon.log- avg read rate: 2.525 MiB/s, avg write rate: 2.470
>>> MiB/s
>>> --
>>> postgresql-Sat.log:2019-02-09 04:54:50 EST  1793  LOG:  automatic vacuum
>>> of table "myDB.pg_toast.pg_toast_1958391": index scans: 13
>>> postgresql-Sat.log- pages: 0 removed, 13737828 remain
>>> postgresql-Sat.log- tuples: 34457593 removed, 15871942 remain
>>> postgresql-Sat.log- buffer usage: 15552642 hits, 26130334 misses,
>>> 22473776 dirtied
>>> postgresql-Sat.log- avg read rate: 2.802 MiB/s, avg write rate: 2.410
>>> MiB/s
>>> --
>>> postgresql-Thu.log:2019-02-07 12:08:50 EST  29630  LOG:  automatic
>>> vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 13
>>> postgresql-Thu.log- pages: 0 removed, 10290976 remain
>>> postgresql-Thu.log- tuples: 35357057 removed, 3436237 remain
>>> postgresql-Thu.log- buffer usage: 11854053 hits, 21346342 misses,
>>> 19232835 dirtied
>>> postgresql-Thu.log- avg read rate: 2.705 MiB/s, avg write rate: 2.437
>>> MiB/s
>>> --
>>> postgresql-Tue.log:2019-02-12 20:54:44 EST  21464  LOG:  automatic
>>> vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 10
>>> postgresql-Tue.log- pages: 0 removed, 23176876 remain
>>> postgresql-Tue.log- tuples: 26011446 removed, 49426774 remain
>>> postgresql-Tue.log- buffer usage: 21863057 hits, 28668178 misses,
>>> 25472137 dirtied
>>> postgresql-Tue.log- avg read rate: 2.684 MiB/s, avg write rate: 2.385
>>> MiB/s
>>> --
>>>
>>>
>>> Lets focus for example on one of the outputs :
>>> postgresql-Fri.log:2019-02-08 05:05:53 EST  24776  LOG:  automatic
>>> vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 8
>>> postgresql-Fri.log- pages:

Re: autovacuum big table taking hours and sometimes seconds

2019-02-14 Thread Michael Lewis
If there are high number of updates during normal daytime processes, then
yes you need to ensure autovacuum is handling this table as needed. If the
nightly delete is the only major source of bloat on this table, then
perhaps running a manual vacuum keeps things tidy after the big delete.
Granted, if you are going to vacuum manually, don't use vacuum full, as
there is not much sense in recovering that disk space if the table is
expected to be similarly sized again by the end of the day.

Do you have a proper number of workers and maintenance_work_mem to get the
job done?

As you proposed, it seems likely to be good to significantly increase
autovacuum_vacuum_cost_limit on this table, and perhaps decrease
autovacuum_vacuum_scale_factor if it is not being picked up as a candidate
for vacuum very frequently.
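
A sketch of doing that for a toast table via the parent table's storage parameters (table name and values are illustrative):

ALTER TABLE sessions SET (
    toast.autovacuum_vacuum_cost_limit = 2000,
    toast.autovacuum_vacuum_scale_factor = 0.01
);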



*Michael Lewis *


On Thu, Feb 14, 2019 at 1:08 PM Mariel Cherkassky <
[email protected]> wrote:

> No I don't run vacuum manually afterwards because the autovacuum should
> run. This process happens every night. Yes, bloating is an issue because
> the table grows and takes a lot of space on disk. Regarding the autovacuum,
> I think that it sleeps too much time (17h) during its work, don't you
> think so?
>
> On Thu, Feb 14, 2019, 9:52 PM Michael Lewis 
>> Thanks, that context is very enlightening. Do you manually vacuum after
>> doing the big purge of old session data? Is bloat causing issues for you?
>> Why is it a concern that autovacuum's behavior varies?
>>
>>
>> *Michael Lewis*
>>
>> On Thu, Feb 14, 2019 at 12:41 PM Mariel Cherkassky <
>> [email protected]> wrote:
>>
>>> Maybe by explaining the tables purpose it will be cleaner. The original
>>> table contains rows for sessions in my app. Every session saves for itself
>>> some raw data which is saved in the toasted table. We clean old sessions
>>> (3+ days) every night. During the day sessions are created so the size of
>>> the table should grow during the day and freed in the night after the
>>> autovacuum run. However, the autovacuum sleeps for a lot of time and during
>>> that time more sessions are created so maybe this can explain the big size
>>> ? Do you think that by increasing the cost limit and decreasing the cost
>>> delay I can solve the issue ?
>>>
>>> On Thu, Feb 14, 2019, 8:38 PM Michael Lewis >>
>>>> It is curious to me that the tuples remaining count varies so wildly.
>>>> Is this expected?
>>>>
>>>>
>>>> *Michael Lewis*
>>>>
>>>> On Thu, Feb 14, 2019 at 9:09 AM Mariel Cherkassky <
>>>> [email protected]> wrote:
>>>>
>>>>> I checked in the logs when the autovacuum vacuum my big toasted table
>>>>> during the week and I wanted to confirm with you what I think :
>>>>> postgresql-Fri.log:2019-02-08 05:05:53 EST  24776  LOG:  automatic
>>>>> vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 8
>>>>> postgresql-Fri.log- pages: 2253 removed, 13737828 remain
>>>>> postgresql-Fri.log- tuples: 21759258 removed, 27324090 remain
>>>>> postgresql-Fri.log- buffer usage: 15031267 hits, 21081633 misses,
>>>>> 19274530 dirtied
>>>>> postgresql-Fri.log- avg read rate: 2.700 MiB/s, avg write rate: 2.469
>>>>> MiB/s
>>>>> --
>>>>> postgresql-Mon.log:2019-02-11 01:11:46 EST  8426  LOG:  automatic
>>>>> vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 23
>>>>> postgresql-Mon.log- pages: 0 removed, 23176876 remain
>>>>> postgresql-Mon.log- tuples: 62269200 removed, 82958 remain
>>>>> postgresql-Mon.log- buffer usage: 28290538 hits, 46323736 misses,
>>>>> 38950869 dirtied
>>>>> postgresql-Mon.log- avg read rate: 2.850 MiB/s, avg write rate: 2.396
>>>>> MiB/s
>>>>> --
>>>>> postgresql-Mon.log:2019-02-11 21:43:19 EST  24323  LOG:  automatic
>>>>> vacuum of table "myDB.pg_toast.pg_toast_1958391": index scans: 1
>>>>> postgresql-Mon.log- pages: 0 removed, 23176876 remain
>>>>> postgresql-Mon.log- tuples: 114573 removed, 57785 remain
>>>>> postgresql-Mon.log- buffer usage: 15877931 hits, 15972119 misses,
>>>>> 15626466 dirtied
>>>>> postgresql-Mon.log- avg read rate: 2.525 MiB/s, avg write rate: 2.470
>>>>> MiB/s
>>>>> --
>>>>> postgresql-Sat.log:2019-02-09 04:54:50 EST  1793  LOG:  auto

Re: index on jsonb col with 2D array inside the json

2019-02-19 Thread Michael Lewis
Is your JSON data getting toasted? I wouldn't assume so if it is remaining
small but something to check. Regardless, if an index exists and isn't
being used, then that would be the primary concern. You didn't share what
the definition of the index on R.data is... what do you already have?

You have an array of ranges stored as the value of key "ranges" in jsonb
field data. If you created a table like R2, but with a single "range"
column that is int4range type, then I would expect that you could add a
GiST and then use overlaps &&, or another operator. I would not expect that
you could index (unnest data->>'ranges' for instance) to get the separated
out range values.
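
A minimal sketch of that "R2 with a range column" idea (all names here are made up for illustration):

CREATE TABLE r3 (object int, value_range int4range);
CREATE INDEX r3_value_range_gist ON r3 USING gist (value_range);

-- objects whose stored range overlaps [2,2]
SELECT object FROM r3 WHERE value_range && int4range(2, 2, '[]');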



*Michael Lewis *


On Tue, Feb 19, 2019 at 8:59 AM Mariel Cherkassky <
[email protected]> wrote:

> Hi,
> I have a table with json col : R(object int, data jsonb).
> Example for content :
>  object | data
> +---
>  50 | {"ranges": [[1, 1]]}
>  51 | {"ranges": [[5, 700],[1,5],[9,10}
>  52 | {"ranges": [[4, 200],[2,4],[3,4]]}
>  53 | {"ranges": [[2, 2]]}
>  54 | {"ranges": [[5, 10]]}
>
> Now I tried to query for all the objects that contains a specific range,
> for example [2,2] :
> explain analyze SELECT *
> FROM   R d
> WHERE  EXISTS (
>SELECT FROM jsonb_array_elements(R.data -> 'ranges') rng
>WHERE (rng->>0)::bigint <= 2 and (rng->>1)::bigint >= 2
>);
>
> I saw that the gin index isnt suitable for this type of comparison.
> However, I saw that the gist index is suitable to handle ranges. Any idea
> of I can implement a gist index here ?
>
> In addition, I saved the same data in relational table
> R2(object,range_first,range_last).
>  The previous data in this format :
> object   range_first   range_last
> 50       1             1
> 51       5             700
> 51       1             5
> 51       9             10
>
> i compared the first query with :
>   explain analyze  select * from R2 where  range_first   <=2 and
> range_last >= 2; (I have an index on range_first,range_last that is used)
>
> The query on the jsonb column was 100x slower (700 ms vs 7 ms). The
> question is, Am I missing an index or the jsonb datatype isnt suitable for
> this structure of data. The R2 table contains 500K records while the R
> table contains about 200K records.
>
>
>
>


Re: index on jsonb col with 2D array inside the json

2019-02-19 Thread Michael Lewis
I would expect that R2 vs R3 would be negligible but perhaps gist works
much better and would be an improvement. When you are down to 7ms already,
I wouldn't hope for any big change. I assume you used btree for the
multi-column index on R2 range_first, range_last but am not familiar with
gist on range vs btree on two int columns.

It seems a little odd to have a jsonb value to hold multiple range values.
A range is already a complex type so separating out into the association
table like R3 would make sense to me.

*Michael Lewis*

On Tue, Feb 19, 2019 at 9:34 AM Mariel Cherkassky <
[email protected]> wrote:

> I dont have any indexes on R (the table with the jsonb column). I was
> asking if I can create any that can increase this query`s performance.
> If I understood you correctly I have  3 options right now :
> 1)R, without indexes
> 2)R2 with an index on first and last
> 3)R3 that should contain a single range column (type int4range) with gist
> index on it.
>
> In aspect of performance, R
> On Tue, Feb 19, 2019 at 18:28 Michael Lewis <[email protected]> wrote:
>
>> Is your JSON data getting toasted? I wouldn't assume so if it is
>> remaining small but something to check. Regardless, if an index exists and
>> isn't being used, then that would be the primary concern. You didn't share
>> what the definition of the index on R.data is... what do you already have?
>>
>> You have an array of ranges stored as the value of key "ranges" in jsonb
>> field data. If you created a table like R2, but with a single "range"
>> column that is int4range type, then I would expect that you could add a
>> GiST and then use overlaps &&, or another operator. I would not expect that
>> you could index (unnest data->>'ranges' for instance) to get the separated
>> out range values.
>>
>>
>>
>> *Michael Lewis *
>>
>>
>> On Tue, Feb 19, 2019 at 8:59 AM Mariel Cherkassky <
>> [email protected]> wrote:
>>
>>> Hi,
>>> I have a table with json col : R(object int, data jsonb).
>>> Example for content :
>>>  object | data
>>> +---
>>>  50 | {"ranges": [[1, 1]]}
>>>  51 | {"ranges": [[5, 700],[1,5],[9,10}
>>>  52 | {"ranges": [[4, 200],[2,4],[3,4]]}
>>>  53 | {"ranges": [[2, 2]]}
>>>  54 | {"ranges": [[5, 10]]}
>>>
>>> Now I tried to query for all the objects that contains a specific range,
>>> for example [2,2] :
>>> explain analyze SELECT *
>>> FROM   R d
>>> WHERE  EXISTS (
>>>SELECT FROM jsonb_array_elements(R.data -> 'ranges') rng
>>>WHERE (rng->>0)::bigint <= 2 and (rng->>1)::bigint >= 2
>>>);
>>>
>>> I saw that the gin index isnt suitable for this type of comparison.
>>> However, I saw that the gist index is suitable to handle ranges. Any idea
>>> of I can implement a gist index here ?
>>>
>>> In addition, I saved the same data in relational table
>>> R2(object,range_first,range_last).
>>>  The previous data in this format :
>>> object   range_first   range_last
>>> 50       1             1
>>> 51       5             700
>>> 51       1             5
>>> 51       9             10
>>>
>>> i compared the first query with :
>>>   explain analyze  select * from R2 where  range_first   <=2 and
>>> range_last >= 2; (I have an index on range_first,range_last that is used)
>>>
>>> The query on the jsonb column was 100x slower (700 ms vs 7 ms). The
>>> question is, Am I missing an index or the jsonb datatype isnt suitable for
>>> this structure of data. The R2 table contains 500K records while the R
>>> table contains about 200K records.
>>>
>>>
>>>
>>>


Re: Why isn't an index scan being used?

2019-02-19 Thread Michael Lewis
On Tue, Feb 19, 2019, 8:00 PM Andrew Gierth wrote:

> > "Abi" == Abi Noda writes:
>
>  Abi> However, when I index the closed column, a bitmap scan is used
>  Abi> instead of an index scan, with slightly slower performance. Why
>  Abi> isn't an index scan being used, given that the exact same number
>  Abi> of rows are at play as in my query on the state column?
>
> Most likely difference is the correlation estimate for the conditions.
> The cost of an index scan includes a factor based on how well correlated
> the physical position of rows is with the index order, because this
> affects the number of random seeks in the scan. But for nulls this
> estimate cannot be performed, and bitmapscan is cheaper than plain
> indexscan on poorly correlated data.
>

Does this imply that the optimizer would always prefer the bitmapscan
rather than an index scan even if random_page_cost = 1, aka sequential cost,
when the correlation is unknown like a null? Or only when it thinks random
access is more expensive by some significant factor?


> --
> Andrew (irc:RhodiumToad)
>
>


Re: Aggregate and many LEFT JOIN

2019-02-22 Thread Michael Lewis
Curious- Is geqo_threshold still set to 12? Is increasing
join_collapse_limit to be higher than geqo_threshold going to have a
noticeable impact?

The disk sorts are the killer as Justin says. I wonder how it performs with
that increased significantly. Is the storage SSD or traditional hard disks?

*Michael Lewis*

On Fri, Feb 22, 2019 at 8:54 AM Justin Pryzby  wrote:

> On Fri, Feb 22, 2019 at 04:14:05PM +0100, kimaidou wrote:
> > Explain shows that the GROUP AGGREGATE and needed sort kill the
> > performance.
> > Do you have any hint how to optimize this ?
> > https://explain.depesz.com/s/6nf
>
> This is writing 2GB tempfile, perhaps the query would benefit from larger
> work_mem:
> |Sort (cost=3,014,498.66..3,016,923.15 rows=969,796 width=1,818) (actual
> time=21,745.193..22,446.561 rows=1,212,419 loops=1)
> |   Sort Method: external sort Disk: 1782200kB
> |   Buffers: shared hit=5882951, temp read=230958 written=230958
>
> This is apparently joining without indices:
> |Nested Loop Left Join (cost=1.76..360,977.37 rows=321,583 width=1,404)
> (actual time=0.080..1,953.007 rows=321,849 loops=1)
> |   Join Filter: (tgc1.groupe_nom = t.group1_inpn)
> |   Rows Removed by Join Filter: 965547
> |   Buffers: shared hit=1486327
>
> This perhaps should have an index on tgc2.groupe_type ?
> |Index Scan using t_group_categorie_pkey on taxon.t_group_categorie tgc2
> (cost=0.14..0.42 rows=1 width=28) (actual time=0.002..0.002 rows=1
> loops=321,849)
> |   Index Cond: (tgc2.groupe_nom = t.group2_inpn)
> |   Filter: (tgc2.groupe_type = 'group2_inpn'::text)
> |   Buffers: shared hit=643687
>
> This would perhaps benefit from an index on tv.cd_ref ?
> |Index Scan using taxref_consolide_non_filtre_cd_nom_idx on
> taxon.taxref_consolide_non_filtre tv (cost=0.42..0.63 rows=1 width=94)
> (actual time=0.002..0.002 rows=1 loops=690,785)
> |   Index Cond: (tv.cd_nom = t.cd_ref)
> |   Filter: (tv.cd_nom = tv.cd_ref)
> |   Buffers: shared hit=2764875
>
> I don't think it's causing a significant fraction of the issue, but for
> some
> reason this is overestimating rowcount by 2000.  Do you need to VACUUM
> ANALYZE
> the table ?
> |Seq Scan on occtax.personne p_1 (cost=0.00..78.04 ROWS=2,204 width=56)
> (actual time=0.011..0.011 ROWS=1 loops=1)
>
> Justin
>
>


Re: Aggregate and many LEFT JOIN

2019-02-22 Thread Michael Lewis
Does the plan change significantly with this-

set session work_mem='250MB';
set session geqo_threshold = 20;
set session join_collapse_limit = 20;

With that expensive sort spilling to disk and then aggregating after that,
it would seem like the work_mem being significantly increased is going to
make the critical difference. Unless it could fetch the data sorted via an
index, but that doesn't seem likely.

I would suggest increase default_statistics_target, but you have good
estimates already for the most part. Hopefully someone else will chime in
with more.

*Michael Lewis*


Re: Aggregate and many LEFT JOIN

2019-02-25 Thread Michael Lewis
On Mon, Feb 25, 2019 at 2:44 AM kimaidou  wrote:

> I have better results with this version. Basically, I run a first query
> only made for aggregation, and then do a JOIN to get other needed data.
>
> * SQL : http://paste.debian.net/1070007/
> * EXPLAIN: https://explain.depesz.com/s/D0l
>
> Not really "fast", but I gained 30%
>


It still seems that disk sort and everything after that is where the query
plan dies. It seems odd that it went to disk if work_mem was already 250MB.
Can you allocate more as a test? As an alternative, if this is a frequently
needed data, can you aggregate this data and keep a summarized copy updated
periodically?


Re: Query slow for new participants

2019-02-26 Thread Michael Lewis
> Indexes:
> "pk_pricedb" PRIMARY KEY, btree (companyid, articleid)
> "EnabledIndex" btree (enabled)
> "ix_companyarticledb_article" btree (articleid)
> "ix_companyarticledb_company" btree (companyid)
>

I'd say drop ix_companyarticledb_company since pk_pricedb can be used
instead even if other queries filter only on the companyid field. It will
certainly be faster for this case, since it targets the row you want
directly from the index without the "Rows Removed by Filter: 2674361".

I doubt the default_statistics_target = 100 default is doing you any
favors. You may want to try increasing that to 500 or 1000 if you can
afford a small increase in planning cost and more storage for the bigger
sampling of stats.
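
A sketch of both suggestions (the table name is guessed from the index names, so treat it as an assumption):

DROP INDEX CONCURRENTLY ix_companyarticledb_company;

ALTER SYSTEM SET default_statistics_target = 500;
SELECT pg_reload_conf();
ANALYZE companyarticledb;  -- re-sample with the larger stats target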


Re: Shared_buffers

2019-03-12 Thread Michael Lewis
On Tue, Mar 12, 2019 at 2:29 AM Laurenz Albe 
wrote:

> Daulat Ram wrote:
> > I want to know about the working and importance of shared_buffers  in
> Postgresql?
> > is it similar to the oracle database buffer cache?
>
> Yes, exactly.
>
> The main difference is that PostgreSQL uses buffered I/O, while Oracle
> usually
> uses direct I/O.
>
> Usually you start with shared_buffers being the minimum of a quarter of the
> available RAM and 8 GB.
>

Any good rule of thumb or write-up about when shared_buffers in excess of
8GB makes sense (assuming system RAM of 64+ GB, perhaps)?


Re: EXPLAIN ANALYZE of BRIN bitmap index scan with disjunction

2019-06-20 Thread Michael Lewis
For kicks I tried the example given and got the below which seems more
expected.


explain analyze select * from brin_test where id >= 9;

Bitmap Heap Scan on brin_test  (cost=5.78..627.36 rows=9861 width=8)
(actual time=0.373..7.309 rows=10001 loops=1)
  Recheck Cond: (id >= 9)
  Rows Removed by Index Recheck: 3215
  Heap Blocks: lossy=59
  ->  Bitmap Index Scan on idx_brin_test_brin  (cost=0.00..3.32 rows=14286
width=0) (actual time=0.018..0.019 rows=640 loops=1)
Index Cond: (id >= 9)
Planning Time: 0.101 ms
Execution Time: 13.485 ms


explain analyze select * from brin_test where id >= 9 and r in (1,3);

Bitmap Heap Scan on brin_test  (cost=3.36..553.50 rows=197 width=8) (actual
time=0.390..1.829 rows=200 loops=1)
  Recheck Cond: ((id >= 9) AND (r = ANY ('{1,3}'::integer[])))
  Rows Removed by Index Recheck: 13016
  Heap Blocks: lossy=59
  ->  Bitmap Index Scan on idx_brin_test_brin  (cost=0.00..3.31 rows=7143
width=0) (actual time=0.026..0.027 rows=1280 loops=1)
Index Cond: ((id >= 9) AND (r = ANY ('{1,3}'::integer[])))
Planning Time: 0.089 ms
Execution Time: 1.978 ms


Re: EXPLAIN ANALYZE of BRIN bitmap index scan with disjunction

2019-06-20 Thread Michael Lewis
I ran both many times and got the same result. ::shrug::


Re: Optimizing `WHERE x IN` query

2019-07-11 Thread Michael Lewis
Did you create a GIN index on subscriptions column to support the &&
operator?


Re: Searching in varchar column having 100M records

2019-07-19 Thread Michael Lewis
On Fri, Jul 19, 2019 at 8:13 AM mayank rupareliya 
wrote:

> Well, you haven't shown us the execution plan, so it's hard to check why
> it did not help much and give you further advice.
>
>
> This is the latest query execution with explain after adding indexing on
> both columns.
>
> Aggregate  (cost=174173.57..174173.58 rows=1 width=8) (actual
> time=65087.657..65087.658 rows=1 loops=1)
>   ->  Bitmap Heap Scan on fields  (cost=1382.56..174042.61 rows=52386
> width=0) (actual time=160.340..65024.533 rows=31857 loops=1)
> Recheck Cond: ((field)::text = 'Champlin'::text)
> Heap Blocks: exact=31433
> ->  Bitmap Index Scan on index_field  (cost=0.00..1369.46
> rows=52386 width=0) (actual time=125.078..125.079 rows=31857 loops=1)
>   Index Cond: ((field)::text = 'Champlin'::text)
> Planning Time: 8.595 ms
> Execution Time: 65093.508 ms
>
>>
>>

Are you on a solid state drive? If so, have you tried setting
effective_io_concurrency to 200 or 300 and checking performance? Given
nearly all of the execution time is doing a bitmap heap scan, I wonder
about adjusting this.

https://www.postgresql.org/docs/current/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-ASYNC-BEHAVIOR
effective_io_concurrency
"The allowed range is 1 to 1000, or zero to disable issuance of
asynchronous I/O requests. Currently, this setting only affects bitmap heap
scans."
"The default is 1 on supported systems, otherwise 0. "


Re: High concurrency same row (inventory)

2019-07-29 Thread Michael Lewis
Can you share the schema of the table(s) involved and an example or two of
the updates being executed?


Re: Planner performance in partitions

2019-08-12 Thread Michael Lewis
"It is also important to consider the overhead of partitioning during query
planning and execution. The query planner is generally able to handle
partition hierarchies with *up to a few hundred partitions fairly well*,
provided that typical queries allow the query planner to prune all but a
small number of partitions. Planning times become longer and memory
consumption becomes higher as more partitions are added." (emphasis added)

--https://www.postgresql.org/docs/current/ddl-partitioning.html


Re: Planner performance in partitions

2019-08-12 Thread Michael Lewis
Thanks for clarifying your position and sharing the results you have seen.
That is impressive indeed.

It seems likely that waiting for v12 is needed since feature are not back
patched. Perhaps one of the contributors will confirm, but that is my
expectation.


Re: Last event per user

2019-08-12 Thread Michael Lewis
The obfuscation makes it difficult to guess at the query you are writing
and the schema you are using. Can you provide any additional information
without revealing sensitive info?

1) Do you have an index on ( user_id ASC, timestamp_inc DESC ) ?
2) Sub-queries can't be re-written inline by the optimizer when there is an
aggregate inside the subquery, and I think DISTINCT ON would behave the
same. So, that might explain the significant change in behavior when the
lateral is used. I am guessing at how you wrote the two versions of the
view though.

Obviously not best design, but you could insert events as "is_latest" and
update any prior events for that user via trigger as is_latest = false.


Re: Last event per user

2019-08-12 Thread Michael Lewis
It seems like it should be-
SELECT * FROM users u JOIN last_user_event_1 e USING (user_id,user_group);
--OR--
SELECT * FROM last_user_event_2 e;

for them to produce the same result set, since last_user_event_2 could very
simply have the users info in it already, by selecting * instead of e.* in
that view definition.

Are there other important joins/where/order by/limits that would be on this
"main query" that is just SELECT * FROM  right now which you have
dropped to try to simplify the example?


Re: Planner performance in partitions

2019-08-12 Thread Michael Lewis
Was there a reason to exceed 100-500 partitions in real life that pushed
you to do this test? Is there some issue you see when using 100 partitions
that is solved or reduced in severity by increasing to 1200 or 6000
partitions?


Re:

2019-08-12 Thread Michael Lewis
If you modify last_user_event_2 to select user and event info in the view,
and just put the where clause directly on the view which is not joined to
anything, instead of on the "extra copy" of the users table like you were
showing previously, I would expect that the performance should be excellent.

>


Re: Last event per user

2019-08-12 Thread Michael Lewis
On Mon, Aug 12, 2019 at 5:03 PM Luís Roberto Weck <
[email protected]> wrote:

> If you modify last_user_event_2 to select user and event info in the view,
> and just put the where clause directly on the view which is not joined to
> anything, instead of on the "extra copy" of the users table like you were
> showing previously, I would expect that the performance should be excellent.
>
> But I need user_id and user_group to be outside of the view definition.
> user_id and user_group are dynamic values, as in, I need to call this query
> multiple times for different user_ids and user_groups .
>

I don't follow. Perhaps there is something within the limitations of the
ORM layer that I am not expecting. If you have this view-

"last_user_event_2"

SELECT u.*, e.*

   FROM users u

JOIN LATERAL (SELECT *

FROM events

   WHERE user_id = u.user_id

 AND user_group = u.user_group

   ORDER BY timestamp_inc DESC

   LIMIT 1 ) e ON TRUE


And you execute a query like this-
SELECT * FROM last_user_event_2 e WHERE  user_id = 1272897 and user_group =
19117;

Then I would expect very good performance.


Re: Extremely slow HashAggregate in simple UNION query

2019-08-20 Thread Michael Lewis
I believe this would be relevant-
https://www.cybertec-postgresql.com/en/optimizer-support-functions/

It seems there is hope down the road to improve those estimates.


Re: Extremely slow count (simple query, with index)

2019-08-22 Thread Michael Lewis
>
> You can SELECT reltuples FROM pg_class WHERE oid='subscriptions'::regclass,
> but its accuracy depends on frequency of vacuum (and if a large
> delete/insert happened since the most recent vacuum/analyze).
>

This only seems helpful to find approx. count for the entire table, without
considering the WHERE condition.

Marco,
As Justin pointed out, you have most of your time in the bitmap heap scan.
Are you running SSDs? I wonder about tuning effective_io_concurrency to
make more use of them.

"Currently, this setting only affects bitmap heap scans."
https://www.postgresql.org/docs/current/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-ASYNC-BEHAVIOR

Also, how many million rows is this table in total? Have you considered
partitioning?


Re: select distinct runs slow on pg 10.6

2019-09-09 Thread Michael Lewis
If you can't modify the query, then there is nothing more to be done to
optimize the execution afaik. Distinct is much slower than group by in
scenarios like this with many columns. You already identified the disk sort
and increased work mem to get it faster by 3x. There are not any other
tricks of which I am aware.


Re: pg12 - migrate tables to partitions structure

2019-09-18 Thread Michael Lewis
 How many rows are you dealing with currently? What are your queries like?
Have you looked at doing a hash partition on product.id? Is this on a test
system or destined for a production environment in the near future? I ask
because PG12 is still in beta.


Re: pg12 - migrate tables to partitions structure

2019-09-18 Thread Michael Lewis
>
> All the queries uses the vendor product and thats why this column is a
> perfect fit as a partition column.
> My main table is big (10M+) (Product), but other tables can also be
> big(1M+)..
>

I assume you have query performance problems and are hoping partitioning
will help? Are you read heavy, or write intensive, or both? 10 million
rows, especially if they aren't super wide, doesn't seem like a huge number
to me. Do you have example queries with explain plans that you think would
benefit from the system being partitioned? I just know that as an engineer,
sometimes I like to make use of new tools, even when it isn't the best
solution for the problem I am actually experiencing. How confident are you
that you NEED partitions is my real question.


Re: pg12 - migrate tables to partitions structure

2019-09-18 Thread Michael Lewis
Is this being done because it can be, or is it solving a real-life pain
point? Just wondering what the perspective is here.

Much of partitioning strategy seems to me to revolve around how the system
is used, and not just the schema and what is possible. For instance, you
can mimic primary and foreign key behavior with triggers as described here,
and that would bypass some of the restrictions on what can be done.
https://www.depesz.com/2018/10/02/foreign-key-to-partitioned-table/

This would allow you to change out the primary key for a simple index
perhaps, and partition however you want. Just because something can be
done, does not mean it should be.

>


Re: Slow query on a one-tuple table

2019-09-19 Thread Michael Lewis
Is this result able to be repeated?


Re: Slow query on a one-tuple table

2019-09-19 Thread Michael Lewis
>
> I have about 6 bigint fields in this table that are very frequently
> updated, but none of these are indexed. I thought that by not having an
> index on them, would make all updates HOT, therefore not bloating the
> primary key index. Seems I was wrong?
>

HOT update is only possible if there is room in the page. How wide is your
single tuple?

Have you tuned autovacuum or are you running defaults? Not sure of your
perception of "very frequently" updated values, but if you have bloat
issue, vacuum early and often. Not sure how the math works out on a table
with single tuple in terms of calculating when it is time to vacuum, but it
certainly needs to be tuned differently than a table with millions of rows
which is what I would be more used to.


Re: pg12 - partition by column that might have null values

2019-10-03 Thread Michael Lewis
Just create a partial index on the id column where end_date = infinity (or null
if you really prefer that pattern) and the system can quickly find the rows
that are still the most current revision. How many rows do you have in this
table? Or foresee ever having? What took you down the road of partitioning
the table? Theory only, or solving a real life optimization problem?
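
A minimal sketch of that partial index (table and column names are assumptions, not from the thread):

-- only the current-revision rows keep end_date = 'infinity'
CREATE INDEX CONCURRENTLY idx_docs_current
    ON documents (id)
    WHERE end_date = 'infinity';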


Re: Query went slow all of sudden. ON V 11.3

2019-10-04 Thread Michael Lewis
What are approx row counts and distribution of data in the concerned tables
and columns? Have you run EXPLAIN (query plan) to get the plan that will be
executed and can you paste on https://explain.depesz.com/ and share the
link that results?

Do you have an index on LOWER( cr ) on table temp_10032019020721_4470?
Do you have an index on LOWER( c_pagealias ) on table t_ages?
Do you have an index on LOWER(  c_id ) on table t_meners?

If temp_10032019020721_4470 is truly temp table, was it analyzed after
creating/inserting/updating/deleting data last, so that the optimizer knows
the number of distinct values, how many rows, most common values, etc?


Re: distinct on extract returns composite type

2019-10-07 Thread Michael Lewis
As long as we are on the performance list and not general, it might be
worth noting that partitioning should be defined directly on the data and
not on a function result I believe. If you always do the extract year and
extract quarter thing, it may work out just fine. But just a regular btree
index on the date/timestamp/timestamptz field and partitions like the below
might be much easier to work with.

MINVALUE to 2018-01-01 /* the top end is always exclusive so it gets
referenced as top on this partition and start of the next partition */
2018-01-01 to 2018-04-01
2018-04-01 to 2018-07-01
2018-07-01 to 2018-10-01
2018-10-01 to 2019-01-01
2019-01-01 to 2019-04-01
2019-04-01 to 2019-07-01
2019-07-01 to 2019-10-01
2019-10-01 to 2020-01-01
2020-01-01 to MAXVALUE
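
A sketch of quarterly range partitions along those lines (table and column names are placeholders):

CREATE TABLE measurements (
    taken_at  timestamptz NOT NULL,
    reading   numeric
) PARTITION BY RANGE (taken_at);

CREATE TABLE measurements_2019q4 PARTITION OF measurements
    FOR VALUES FROM ('2019-10-01') TO ('2020-01-01');
CREATE TABLE measurements_2020q1 PARTITION OF measurements
    FOR VALUES FROM ('2020-01-01') TO (MAXVALUE);

-- a plain btree on the column itself, rather than on extract(...)
CREATE INDEX ON measurements (taken_at);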


Re: Query slows when used with view

2019-10-09 Thread Michael Lewis
>
> Those are not equivalent queries.  Read up on the syntax of FROM;
> particularly, that JOIN binds more tightly than comma.
>

I see this-

"A JOIN clause combines two FROM items, which for convenience we will refer
to as “tables”, though in reality they can be any type of FROM item. Use
parentheses if necessary to determine the order of nesting. In the absence
of parentheses, JOINs nest left-to-right. In any case JOIN binds more
tightly than the commas separating FROM-list items."
https://www.postgresql.org/docs/current/sql-select.html

What is meant by nesting? Or binding for that matter? I wouldn't expect
increasing from/join_collapse_limit to be helpful to the original poster
since they haven't exceeded default limit of 8. Any further clarification
elsewhere you could point to?


Re: Query slows when used with view

2019-10-09 Thread Michael Lewis
>
> When you join to a view, the view sticks together, as if they were all in
> parentheses.   But when you substitute the text of a view into another
> query, then they are all on the same level and can be parsed differently.
>
> Consider the difference between "1+1 * 3", and "(1+1) * 3"
>

I thought from_collapse_limit being high enough meant that it will get
re-written and inlined into the same level. To extend your metaphor, that
it would be 1 * 3 + 1 * 3.


Re: Query slow again after adding an `OR` operation (was: Slow PostgreSQL 10.6 query)

2019-10-09 Thread Michael Lewis
Are you prefixing this auto generated query with set join_collapse_limit =
30, or are you changing the default and reloading config? That is, can you
impact ONLY this query with these config changes? I wouldn't assume so, so
any hack/query hint like turning off hashjoins (which seem to be chosen
instead of nested loop because of bad estimates for this plan) will likely
have serious impact on other queries.

I know you don't have the flexibility to change the query to be one that
follows best practices, but it is a bit disappointing that your ORM
generates that OR condition instead of something like brs.branch_id IN
(query1 union all query2). The join to branch_invoices also must function
as inner join rather than left, but I am not sure if declaring a join type
as left impacts the performance significantly.

When performance matters, there's nothing quite like being able to
customize the query directly.


Reading explain plans- row estimates/actuals on lower nodes vs next level up

2019-10-17 Thread Michael Lewis
https://explain.depesz.com/s/Caa5

I am looking at this explain analyze output and seeing a nested loop
towards the lowest levels with pretty bad estimate vs actual (2.3k vs 99k),
but the things that feed that nested loop seem like the estimates are
rather close (index scans with 11 estimated vs 30 actual and 3350 vs
3320)... why does the higher node have such a different estimate vs actual
ratio?


*Michael Lewis  |  Software Engineer*
*Entrata*


Re: Slow planning, fast execution for particular 3-table query

2019-11-06 Thread Michael Lewis
Is default_statistics_target set above default 100? I would assume that
would reflect in the size of pg_statistic, but wanted to ask since
increasing that from 100 to 1000 was the only time I have seen planning
time explode. Are other queries slow to plan?


Re: Slow "not in array" operation

2019-11-12 Thread Michael Lewis
What's the plan for the slow one? What's the time to just count all rows?

>


Re: Slow "not in array" operation

2019-11-12 Thread Michael Lewis
It is very interesting to me that the optimizer chose a parallel sequential
scan rather than an index scan on either of your indexes that start
with project_id that also reference trashed_at.

1) Are you running on SSD type storage? Has random_page_cost been lowered
to 1-1.5 or so (close to 1 assumes good cache hits)?
2) It seems you have increased parallel workers. Have you also changed the
startup or other cost configs related to how inclined the system is to use
sequential scans?
3) If you disable sequential scan, what does the plan look like for this
query? (SET ENABLE_SEQSCAN TO OFF;)

>


Re: Slow "not in array" operation

2019-11-12 Thread Michael Lewis
Odd index choice by the optimizer given what is available. The bitmap being
lossy means more work_mem is needed if I remember properly.

It is interesting that skipping the where condition on the array is only
half a second. Is the array being toasted or is it small and being stored
in the same file as primary table?

What is the result for this count query? Is it roughly 4 million?


On Tue, Nov 12, 2019, 1:06 PM Marco Colli  wrote:

> 1) It is running on a DigitalOcean CPU-optimized droplet with dedicated
> hyperthreads (16 cores) and SSD.
> SHOW random_page_cost; => 2
>
> 2) What config names should I check exactly? I used some suggestions from
> the online PGTune, when I first configured the db some months ago:
> max_worker_processes = 16
> max_parallel_workers_per_gather = 8
> max_parallel_workers = 16
>
> 3) Here's the query plan that I get after disabling the seq scan:
>
>
>   QUERY PLAN
>
>
>
> ---
>
>  Finalize Aggregate  (cost=2183938.89..2183938.90 rows=1 width=8) (actual
> time=94972.253..94972.254 rows=1 loops=1)
>
>->  Gather  (cost=2183938.16..2183938.87 rows=7 width=8) (actual
> time=94952.895..95132.626 rows=8 loops=1)
>
>  Workers Planned: 7
>
>  Workers Launched: 7
>
>  ->  Partial Aggregate  (cost=2182938.16..2182938.17 rows=1
> width=8) (actual time=94950.958..94950.958 rows=1 loops=8)
>
>->  Parallel Bitmap Heap Scan on subscriptions
> (cost=50294.50..2180801.47 rows=854677 width=0) (actual
> time=1831.342..94895.208 rows=611828 loops=8)
>
>  Recheck Cond: ((project_id = 123) AND (trashed_at IS
> NULL))
>
>  Rows Removed by Index Recheck: 2217924
>
>  Filter: (NOT (tags @> '{en}'::character varying[]))
>
>  Rows Removed by Filter: 288545
>
>  Heap Blocks: exact=120301 lossy=134269
>
>  ->  Bitmap Index Scan on
> index_subscriptions_on_project_id_and_tags  (cost=0.00..48798.81
> rows=6518094 width=0) (actual time=1493.823..1493.823 rows=7203173 loops=1)
>
>Index Cond: (project_id = 123)
>
>  Planning Time: 1.273 ms
>
>  Execution Time: 95132.766 ms
>
> (15 rows)
>
>
> On Tue, Nov 12, 2019 at 8:20 PM Michael Lewis  wrote:
>
>> It is very interesting to me that the optimizer chose a parallel
>> sequential scan rather than an index scan on either of your indexes that
>> start with project_id that also reference trashed_at.
>>
>> 1) Are you running on SSD type storage? Has random_page_cost been lowered
>> to 1-1.5 or so (close to 1 assumes good cache hits)?
>> 2) It seems you have increased parallel workers. Have you also changed
>> the startup or other cost configs related to how inclined the system is to
>> use sequential scans?
>> 3) If you disable sequential scan, what does the plan look like for this
>> query? (SET ENABLE_SEQSCAN TO OFF;)
>>
>>>


Re: Simple DELETE on modest-size table runs 100% CPU forever

2019-11-14 Thread Michael Lewis
> If I leave out the "analyze", here's what I get (note that the
> categories_staging_N table's name changes every time; it's
> created on demand as "create table categories_staging_n(id integer)").
>

How/when are they created? In the same statement? After create, are you
analyzing these tables? If not, the optimizer is blind and may be choosing
a bad plan by chance.
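
A sketch following the table pattern quoted above; the key point is analyzing the staging table before it is used:

CREATE TABLE categories_staging_1 (id integer);
-- ... populate it ...
ANALYZE categories_staging_1;  -- without this the planner only guesses at its size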


Re: Re[4]: Postgresql planning time too high

2019-11-22 Thread Michael Lewis
As a matter of habit, I put all inner joins that may limit the result set
as the first joins, then the left joins that have where conditions on them.
I am not sure whether the optimizer sees that only those tables are needed
to determine which rows will be in the end result and automatically
prioritizes them as far as joins. With 40+ joins, I would want if this
re-ordering of the declared joins may be significant.

If that doesn't help, then I would put all of those in a sub-query to break
up the problem for the optimizer (OFFSET 0 being an optimization fence,
though if this is an example of "simple" pagination then I assume but am
not sure that OFFSET 20 would also be an optimization fence). Else, put all
that in a CTE with MATERIALIZED keyword when on v12 and without on 9.5
since it did not exist yet and was default behavior then.
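
A sketch of the two fences mentioned (query shapes are purely illustrative):

-- OFFSET 0 keeps the planner from flattening the subquery into the outer query
SELECT *
  FROM (SELECT o.id, o.total
          FROM orders o
          JOIN customers c ON c.id = o.customer_id
        OFFSET 0) sub
  LEFT JOIN shipments s ON s.order_id = sub.id;

-- on v12+, MATERIALIZED makes the CTE a fence; before v12 every CTE already was
WITH base AS MATERIALIZED (
    SELECT o.id, o.total FROM orders o JOIN customers c ON c.id = o.customer_id
)
SELECT * FROM base LEFT JOIN shipments s ON s.order_id = base.id;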

With an empty database, there are no statistics so perhaps the optimizer
has too many plans that are very close in expected costs. I'd be curious if
the planning time gets shorter once you have data, assuming
default_statistics_target is left at the standard 100, or is not increased
too hugely.

>


Re: Make recently inserted/updated records available in the buffer/cache

2019-12-03 Thread Michael Lewis
"I am going to use it as a queue"

You may want to look at lowering fillfactor if this queue is going to have
frequent updates, and also make autovacuum/analyze much more aggressive
assuming many updates and deletes.
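
A sketch of those options on a queue-style table (names and values are illustrative):

CREATE TABLE job_queue (
    id       bigserial PRIMARY KEY,
    payload  jsonb,
    status   text NOT NULL DEFAULT 'pending'
) WITH (
    fillfactor = 70,                        -- leave room in pages for HOT updates
    autovacuum_vacuum_scale_factor = 0.01,  -- vacuum far more often than the 20% default
    autovacuum_analyze_scale_factor = 0.005
);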


Re: Make recently inserted/updated records available in the buffer/cache

2019-12-03 Thread Michael Lewis
On Tue, Dec 3, 2019 at 11:46 AM Sachin Divekar  wrote:

> I am also going to use SKIP LOCKED to _select for update_. Any suggestions
> on tuning parameters for SKIP LOCKED?
>

I am not aware of any. Either you use it because it fits your need, or not.
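
A typical worker query looks something like this sketch (table and column names are assumed, not from the thread):

BEGIN;
SELECT id, payload
  FROM job_queue
 WHERE status = 'pending'
 ORDER BY id
 LIMIT 1
   FOR UPDATE SKIP LOCKED;
-- process the job, mark it done, then
COMMIT;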

Note- please don't top-post (reply and include all the previous
conversation below) on the Postgres mailing lists. Quote only the part(s)
you are responding to and reply there.


Re: autovacuum locking question

2019-12-05 Thread Michael Lewis
On Thu, Dec 5, 2019 at 3:26 PM Mike Schanne  wrote:

> I am concerned that if the autovacuum is constantly canceled, then the
> table never gets cleaned and its performance will continue to degrade over
> time.  Is it expected for the vacuum to be canceled by an insert in this
> way?
>
>
>
> We are using postgres 9.6.10.
>

Have you checked when the table was last autovacuumed in
pg_stat_user_tables? If the autovacuum count is high and timestamp of last
run is relatively current, then no reason for concern as far as I can
figure.
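
A quick way to check, as a sketch (substitute the real table name):

SELECT relname, last_vacuum, last_autovacuum, autovacuum_count, n_dead_tup
  FROM pg_stat_user_tables
 WHERE relname = 'my_table';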

Have you already configured (non-default values) for autovacuum options for
your system or this table?


Re: Specific query taking time to process

2019-12-09 Thread Michael Lewis
>
> There is a specific search query I am running to get list of Documents and
> their metadata from several table in the DB.
> We are running Postgres 9.6.9 on Amazon RDS (db.m5.4xlarge instance)
>
> Our current DB consists of 500GB of data and indexes. Most of the rows in
> table are consist of 454,078,915
>
> With the fresh DB with the restore of the DATA without any indexes Search
> query performs relatively quick and most of the time its less than a
> second.
>
> But after 3 weeks of use of the DB it sudenly started to slowdown only for
> this perticular query and it takes 20+ seconds to respond. If I do a
> restore the DB again then it continues to work fine and the symptom pops
> out after 3 weeks time.
>


You haven't been quite clear on the situation and your use case, but
assuming this table has 454 million rows and experiences updates/deletes
then this sounds like you may be having problems with autovacuum. Have you
customized parameters to ensure it is running more frequently than default?
How are you doing those data restores? Perhaps that process is cleaning up
the accumulated bloat and you can run fine again for a while. Check
pg_stat_user_tables for the last (auto)vacuum that ran, assuming you didn't
just restore again and are expecting the issue to occur again soon.


Re: Specific query taking time to process

2019-12-10 Thread Michael Lewis
On Mon, Dec 9, 2019 at 3:39 PM Fahiz Mohamed  wrote:

> I ran "explain analyse" on this query and I got following result. (We have
> 2 identical DB instances and they consist of same data. Instane 1 took 20+
> second to process and instance 2 took less than a second)
>
> Instance 1: (This is used by regular User - More than 600,000 request a
> day) - The result is same even when there is no user in the server.
>
>  -> Bitmap Heap Scan on alf_node node (cost=995009.97..3303978.85 
> rows=4565737 width=8) (actual time=3304.419..20465.551 rows=41109751 loops=1)
> Recheck Cond: ((store_id = 6) AND (type_qname_id = 240))
> Rows Removed by Index Recheck: 54239131
> Filter: (NOT (hashed SubPlan 1))
> Rows Removed by Filter: 2816
> Heap Blocks: exact=24301 lossy=1875383
>
>
> Planning time: 0.639 ms
> Execution time: 22946.036 ms
>
>
https://paquier.xyz/postgresql-2/postgres-9-4-feature-highlight-lossyexact-pages-for-bitmap-heap-scan/


That seems like a lot of lossy blocks. As I understand it, that means the
system didn't have enough work_mem to fit all the references to the
individual rows which perhaps isn't surprising when it estimates it needs
4.5 million rows and ends up with 41 million.

Do both DB instances have the same data? I ask because the two plans are
rather different which makes me think that statistics about the data are
not very similar. Are both configured the same, particularly for
shared_buffers and work_mem, as well as the various planning cost
parameters like random_page cost? If you can provide these plans again with
explain( analyze, buffers ) this time? Did you check on the last time
autovacuum ran in pg_stat_user_tables?


Re: Specific query taking time to process

2019-12-11 Thread Michael Lewis
This seems beyond me at this point, but I am curious if you also
vacuumed alf_node_properties and alf_node tables and checked when they last
got (auto)vacuumed/analyzed. With default configs for autovacuum parameters
and tables with that many rows, they don't qualify for autovacuum very
often. I don't have much experience with tables in excess of 50 million
rows because of manual sharding clients data.

You mention work_mem is set differently. Did you try setting work_mem back
to 4MB in session on instance 1 just to test the query? I don't know if
work_mem is included in planning stage, but I would think it may be
considered. It would be odd for more available memory to end up with a
slower plan, but I like to eliminate variables whenever possible.

It might be worthwhile to see about increasing default_statistics_target to
get more specific stats, but that can result in a dramatic increase in
planning time for even simple queries.

Hopefully one of the real experts chimes in.


Re: Seeking reason behind performance gain in 12 with HashAggregate

2020-01-13 Thread Michael Lewis
I am not at all familiar with PostGIS so perhaps this is a silly question,
is bloat an issue on the older instance? Correlation isn't causation, but
half the buffers scanned and half the runtime in the v12 plan has me
curious why that might be.

>


Re: Bad query plan decision when using multiple column index - postgresql uses only first column then filters

2020-01-16 Thread Michael Lewis
Does the behavior change with different values of Ver column? I'd be
curious of the fraction in the MCVs frequency list in stats indicates that
rows with Ver = 92 are rare and therefore the index on only Ver column is
sufficient to find the rows quickly. What is reltuples for this table by
the way?

I also wonder if the situation may be helped by re-indexing the "index on
both columns" to remove any chance of issues on bloat in the index. Which
order are the columns by the way? If Ver is first, is there also an index
on only the id column? Since you aren't on v12, you don't get to re-index
concurrently, but I assume you know the workaround of create concurrently
(different name), drop concurrently (old one), and finally rename the new index.
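
Something along these lines, with index and table names invented for illustration:

CREATE INDEX CONCURRENTLY my_index_new ON my_table (ver, id);
DROP INDEX CONCURRENTLY my_index;               -- the old, bloated index
ALTER INDEX my_index_new RENAME TO my_index;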


Re: Bad query plan decision when using multiple column index - postgresql uses only first column then filters

2020-01-16 Thread Michael Lewis
"Finally I ran "ANALYZE" again and now the problem went away. Running the
query with Ver=92 uses the proper plan. I'm not happy with this - I know I
haven't solved the problem (I've ran ANALYZE multiple times before)."

Does 92 appear in MCVs list with that new sampling? I wonder if
default_statistics_target should be increased a bit to help ensure a
thorough sample of the data in this table. Note- don't go too high (maybe
250, not 1000) or planning time can increase significantly. Also, perhaps
only increase on this Ver column.
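
For example (250 is only a suggestion, and the table/column names are taken from
the query later in this thread):

ALTER TABLE "LucrareBugetDate" ALTER COLUMN "LucrareBugetVersiuneId" SET STATISTICS 250;
ANALYZE "LucrareBugetDate";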

What is the real frequency of value 92? With default_statistics_target =
100, analyze takes 100*300 rows as sample, and if it is missed in that 30k
rows set, or very very small when in fact it has equal weight with other
values, then during planning time it is expected to be very very rare when
in fact it is only slightly less common than the others in the list. If the
others in the list are expected to be 100% of the table as you showed with
the query to compute "frac_MCV" from pg_stats for that column, then perhaps
the optimizer is wise to scan only the LucrareBugetVersiuneId column of the
composite index and filter in memory.

Curious, when you get bad plans (re-analyze the table repeatedly to get new
samples until the wrong plan is chosen), what does PG estimate for total
rows returned with ONLY LucrareBugetVersiuneId = 92 as the where condition?

Note- Tom & Laurenz are real experts. I might have no idea what I am doing
yet. It is too early to say.

On Thu, Jan 16, 2020 at 11:15 AM Cosmin Prund  wrote:

> Hello Michael and hello again Tom, sorry for mailing you directly. I just
> hit Reply in gmail - I expected the emails to have a reply-to=Pgsql.
> Apparently they do not.
>
> Running the same query with a different "Ver" produces a proper plan.
> Here's a non-redacted example (Ver=91):
>
> EXPLAIN (ANALYZE, BUFFERS) select "IdRand", "IdColoana", "Valoare" from
> "LucrareBugetDate" where ("LucrareBugetVersiuneId" = 91) and
> ("LucrareBugetDateId" in (10,11));
>
>QUERY PLAN
>
> -
>  Index Scan using
> "IX_LucrareBugetDate_LucrareBugetVersiuneId_LucrareBugetDateId" on
> "LucrareBugetDate"  (cost=0.56..4.95 rows=2 width=13) (actual
> time=3.617..3.631 rows=2 loops=1)
>Index Cond: (("LucrareBugetVersiuneId" = 91) AND ("LucrareBugetDateId"
> = ANY ('{10,11}'::integer[])))
>Buffers: shared hit=9 read=3
>  Planning time: 0.223 ms
>  Execution time: 3.663 ms
> (5 rows)
>
> I have reindexed everything, not just this index.
>
> "reltuples" for this table is 41712436.
>
> > I'd be curious of the fraction in the MCVs frequency list in stats
> indicates that rows with Ver = 92 are rare and therefore the index on
> only Ver column is sufficient to find the rows quickly.
>
> There are 25 valid values for "Ver" in this database. I ran the query for
> all of them. The only one miss-behaving is "92". I ran the query with
> random values for Ver (invalid values), the query plan always attempts to
> use the index using both values.
> I looked into "most_common_values" in pg_stats, this value (92) is not in
> that list.
> Finally I ran "ANALYZE" again and now the problem went away. Running the
> query with Ver=92 uses the proper plan. I'm not happy with this - I know I
> haven't solved the problem (I've ran ANALYZE multiple times before).
>
>
> On Thu, 16 Jan 2020 at 19:00, Michael Lewis  wrote:
>
>> Does the behavior change with different values of Ver column? I'd be
>> curious of the fraction in the MCVs frequency list in stats indicates that
>> rows with Ver = 92 are rare and therefore the index on only Ver column is
>> sufficient to find the rows quickly. What is reltuples for this table by
>> the way?
>>
>> I also wonder if the situation may be helped by re-indexing the "index on
>> both columns" to remove any chance of issues on bloat in the index. Which
>> order are the columns by the way? If Ver is first, is there also an index
>> on only id column?. Since you aren't on v12, you don't get to re-index
>> concurrently but I assume you know the work around of create concurrently
>> (different name), drop concurrently (old one), and finally rename new index.
>>
>


Re: Query optimization advice for beginners

2020-01-27 Thread Michael Lewis
You've got two references to order_basketitemdetail both aliased to bid and
ALSO a table called order_basketitembatch aliased to bib. I assume that
confuses the planner, but even if it doesn't it certainly confuses any new
developers trying to understand the query's intention.

The biggest thing that leaps out at me on the explain plan is the 822
thousand loops on index order_basketitembatch_detail_id_9268ccff. That
seems to be the subquery in the where clause of the subquery in the main
where clause. I never get great results when I nest sub-queries multiple
levels. Without knowing your data, we can only make guesses about
restructuring the query so it performs better.

select bi.id AS basketitem_id --coalesce(sum(bid.quantity), 0)
  from order_basketitem bi
   --, order_basketitemdetail bid
   , order_order o
  where o.type in (2,7,9) and o.id = bi.order_id
  and o.is_cancelled = false
  and bi.is_cancelled = false
  and o.is_closed = false
  and o.is_picked = false
  and o.is_invoiced = false
  and o.is_sent = false
  --and bi.id = bid.basketitem_id

For a query like the above, how restrictive is it? That is, of ALL the
records in order_basketitem table, how many are returned by the above
condition? I would think that the number of orders that have been picked or
invoiced or sent or closed or cancelled would be LARGE and so this query
may eliminate most of the orders from being considered. Not to mention the
order type id restriction.

If I found that the above query resulted in 1% of the table being returned
perhaps, there are a number of ways to influence the planner to do this
work first such as-

1) put this in a sub-query as the FROM and include OFFSET 0 hack to prevent
in-lining
2) put in a CTE using the WITH keyword (note- need to use MATERIALIZED
option once on PG12 since default behavior changes)
3) if the number of records returned is large (10 thousand maybe?) and the
optimizer is making bad choices on the rest of the query that uses this
result set, put this query into a temp table, analyze it, and then use it.
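
A rough sketch of option 2 (untested; drop the MATERIALIZED keyword on PG11 and
earlier, where CTEs are always materialized anyway):

WITH open_basketitems AS MATERIALIZED (
  select bi.id AS basketitem_id
    from order_basketitem bi
    join order_order o on o.id = bi.order_id
   where o.type in (2,7,9)
     and o.is_cancelled = false
     and bi.is_cancelled = false
     and o.is_closed = false
     and o.is_picked = false
     and o.is_invoiced = false
     and o.is_sent = false
)
select obi.basketitem_id, coalesce(sum(bid.quantity), 0)
  from open_basketitems obi
  left join order_basketitemdetail bid on bid.basketitem_id = obi.basketitem_id
 group by obi.basketitem_id;

Option 1 is the same shape with the inner query written as a derived table ending
in OFFSET 0 instead of a CTE.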

>


Re: How to avoid UPDATE performance degradation in a transaction

2020-02-14 Thread Michael Lewis
If your trigger is supposed to change certain fields, you could return OLD
instead of NEW if those fields have not been changed by the trigger. You
could also check an updated_on timestamp field to verify if the row has
already been modified and potentially skip the trigger altogether. Just a
couple thoughts to avoid the bloat.
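
A minimal sketch of the "skip the work when nothing changed" idea; the updated_on
column and table name are made up, and note that the built-in
suppress_redundant_updates_trigger() does the NEW-equals-OLD check for you:

CREATE OR REPLACE FUNCTION skip_noop_update() RETURNS trigger AS $$
BEGIN
    IF to_jsonb(NEW) = to_jsonb(OLD) THEN
        RETURN NULL;              -- cancel the UPDATE entirely; no new row version
    END IF;
    NEW.updated_on := now();      -- hypothetical audit column
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_skip_noop
BEFORE UPDATE ON some_table       -- placeholder table name
FOR EACH ROW EXECUTE PROCEDURE skip_noop_update();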


Re: much slower query in production

2020-02-26 Thread Michael Lewis
By the way, I expect the time is cut in half while heap fetches stays
similar because the index is now in OS cache on the second run and didn't
need to be fetched from disk. Definitely need to check on vacuuming as
Justin says. If you have a fairly active system, you would need to run this
query many times in order to push other stuff out of shared_buffers and get
this query to perform more like it does on dev.

Do you have the option to re-write the query or is this generated by an
ORM? You are forcing the looping as I read this query. If you aggregate
before you join, then the system should be able to do a single scan of the
index, aggregate, then join those relatively few rows to the multicards
table records.

SELECT transaction_uid, COALESCE( sub.count, 0 ) AS count FROM multicards
LEFT JOIN (SELECT multicard_uid, COUNT(*) AS count FROM tickets GROUP BY
multicard_uid ) AS sub ON sub.multicard_uid = multicards.uid;


Re: much slower query in production

2020-02-26 Thread Michael Lewis
>
> UPDATE multicards
>SET defacements = COALESCE( count, 0 )
>   FROM ( SELECT multicard_uid, COUNT(*) AS count FROM tickets GROUP BY
> multicard_uid ) AS sub
>  WHERE uid = multicard_uid OR multicard_uid is null;
>

I expect this should work. Not sure of performance of course.


Re: Many DataFileRead - IO waits

2020-02-27 Thread Michael Lewis
How big is ix_tabledata_intid_timestampdate_intotherid3_intotherid2 on
disk? If you create another index with same fields, how much space does it
take? Real question- are you vacuuming aggressively enough for your
workload? Your index name seems to indicate that intotherid3 would be the
third key, and yet the planner chose not to scan that deep and instead
filtered after it found the relevant tuples based on intid and
timestampdate. That seems peculiar to me.
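
For the size question, something like this works (column names below are guessed
from the index name):

SELECT pg_size_pretty(pg_relation_size('ix_tabledata_intid_timestampdate_intotherid3_intotherid2'));
-- CREATE INDEX CONCURRENTLY ix_tabledata_copy ON tabledata (intid, timestampdate, intotherid3, intotherid2);
-- SELECT pg_size_pretty(pg_relation_size('ix_tabledata_copy'));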

The documentation discourages multi-column indexes because they have
limited application unless the same fields are always used. Personally, I
don't love reviewing the stats of indexscans or how many tuples were
fetched and having to guess how deeply the index was scanned for the
various queries involved.

I'd wonder if an index on only intid_timestampdate would be both much
smaller and also have a more right-leaning pattern of information being
added and accessed in terms of keeping frequently needing blocks in shared
buffers.

As a side note, that planning time seems high to me for such a simple
query. Have you increased default_statistics_target significantly perhaps?

>


Re: Many DataFileRead - IO waits

2020-02-28 Thread Michael Lewis
If no updates or deletes are happening on the table, it would be best
practice to set up a scheduled manual vacuum analyze to ensure statistics
and the visibility map is updated. Other than creating the index on the
first two columns only, I'm out of ideas. Hopefully someone running
Postgres at large scale on Windows will chime in.


Re: JOIN on partitions is very slow

2020-03-22 Thread Michael Lewis
Are you able to tweak the query or is that generated by an ORM? What
version of Postgres? Which configs have you changed from default? How many
partitions do you have? Is there an index on company name?

Anytime I see distinct keyword, I expect it to be a performance bottleneck
and wonder about rewriting the query. Even just using group by can be much
faster because of how it gets executed.


Re: JOIN on partitions is very slow

2020-03-23 Thread Michael Lewis
On Mon, Mar 23, 2020 at 1:40 AM daya airody  wrote:

> Yes. I can tweak the query. Version of postgres is 9.5.15. I have about 20
> partitions for company_sale_account table.
> I do have an index on company name.
>
> I need to use DISTINCT as i need to remove the duplicates.
>

DISTINCT is a sign of improper joins most of the time in my experience.
Often, just changing to GROUP BY is faster.

SELECT cpsa1.*
FROM company_sale_account cpsa1
 JOIN  company_sale_account cpsa2  ON cpsa1.sale_account_id =
cpsa2.sale_account_id
 WHERE  cpsa1.company_name = 'company_a'
 AND cpsa2.company_name = 'company_b'
GROUP BY cpsa1.id; --assuming primary key exists, and I forget if the
feature that allows only naming primary key in group by might have been
introduced with 9.6

It should be noted that 9.5 is about 1 year from being EOL'd so it would be
prudent to update to v11 or 12 when possible.

How does the below query perform? By the way, "top posting" (replying with
all previous email thread below your reply) is discouraged on these forums.
It makes the reviewing archived posts more cumbersome. Instead, please
reply with only your message and copying the relevant parts of prior
conversation that you are responding to.

SELECT cpsa1.*
FROM company_sale_account cpsa1
WHERE cpsa1.company_name = 'company_a' AND EXISTS(SELECT FROM
company_sale_account cpsa2 WHERE cpsa1.sale_account_id =
cpsa2.sale_account_id AND cpsa2.company_name = 'company_b' );


Re: Best way to delete big amount of records from big table

2020-03-27 Thread Michael Lewis
If you can afford the time, I am not sure the reason for the question. Just
run it and be done with it, yes?

A couple of thoughts-
1) That is a big big transaction if you are doing all the cleanup in a
single function call. Will this be a production system that is still online
for this archiving? Having a plpgsql function that encapsulates the work
seems fine, but I would limit the work to a month at a time or something
and call the function repeatedly. Get the min month where records exist
still, delete everything matching that, return. Rinse, repeat.
2) If you are deleting/moving most of the table (91 of 150 million),
consider moving only the records you are keeping to a new table, renaming
old table, and renaming new table back to original name. Then you can do
what you want to shift the data in the old table and delete it.
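
A sketch of the month-at-a-time idea from point 1, with table and column names
invented for illustration (each call is its own transaction when run as a
standalone statement):

CREATE OR REPLACE FUNCTION archive_one_month() RETURNS integer AS $$
DECLARE
    v_month   date;
    v_deleted integer;
BEGIN
    SELECT date_trunc('month', min(created_at))::date
      INTO v_month
      FROM big_history_table
     WHERE created_at < now() - interval '2 years';   -- retention cutoff

    IF v_month IS NULL THEN
        RETURN 0;                                      -- nothing left to clean up
    END IF;

    DELETE FROM big_history_table
     WHERE created_at >= v_month
       AND created_at <  v_month + interval '1 month'
       AND created_at <  now() - interval '2 years';

    GET DIAGNOSTICS v_deleted = ROW_COUNT;
    RETURN v_deleted;
END;
$$ LANGUAGE plpgsql;

SELECT archive_one_month();   -- repeat until it returns 0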


Re: PostgreSQL does not choose my indexes well

2020-04-23 Thread Michael Lewis
>
> "unless the table is mostly marked all-visible"


Is that taken into account during planning when evaluating index scan vs
sequential scan?


Re: The query plan get all columns but I'm using only one column.

2020-04-25 Thread Michael Lewis
The example is nonsensical so I expect it is too contrived to be useful for
analyzing the actual problem.

Additionally, the total query time is under 1ms and most of it is planning
time. Use a prepared statement or do something else to reduce planning time
like reducing statistics target if that actually makes sense for your use
case.

Else, show us something much closer to the real problem.


Re: The query plan get all columns but I'm using only one column.

2020-04-29 Thread Michael Lewis
It is generally a very bad idea to turn off autovacuum. When it is causing
problems, it is likely that it needs to run more often to keep up with the
work, rather than not run at all. Certainly if it is turned off, it would
be critical to have a regularly scheduled process to vacuum analyze all
tables.

>


Re: The query plan get all columns but I'm using only one column.

2020-04-30 Thread Michael Lewis
>
> In the staging environment we have disabled autovacuum since it is a
> testing environment and the databases are restored very often.
> But in production environment it is enabled autovacuum=on
>
> The weird case is that production was slow and staging environment was
> faster.
>

You haven't specified how you are doing backup and restore, but unless it
is a byte-for-byte file copy method, then there would be no bloat on the
restored staging environment, so no need to vacuum. You would want to ensure
you take a new statistics sample by running ANALYZE on the database after the
restore if you aren't already doing so.

In your production system, if your configs for autovacuum settings have not
been changed from the default parameters, it probably is not keeping up at
all if the system is moderately high in terms of update/delete
transactions. You can check pg_stat_activity for active vacuums, change the
parameter to log autovacuums longer than X to 0 and review the logs, or
check pg_stat_user_tables to see how many autovacuums/analyze have been
done since you last reset those stats.

If you have tables that are in the millions or hundreds or millions of
rows, then I would recommend decreasing autovacuum_vacuum_scale_factor from
20% down to 1% or perhaps less and similar
for autovacuum_analyze_scale_factor. You can do this on individual tables
if you have mostly small tables and just a few large ones. Else, increase
the threshold settings as well. The default value
for autovacuum_vacuum_cost_delay changed from 20ms to 2ms in PG12 so that
may also be prudent to do likewise if you upgraded to PG12 and kept your
old settings, assuming your I/O system can handle it.
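
For a specific large table, the per-table storage parameters look like this
(values are examples only):

ALTER TABLE big_table SET (
    autovacuum_vacuum_scale_factor  = 0.01,
    autovacuum_analyze_scale_factor = 0.005,
    autovacuum_vacuum_cost_delay    = 2
);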

Otherwise, if you have a period of time when the activity is low for your
database(s), then a last resort can be a daily scheduled vacuum analyze on
all tables. Note- do not do vacuum FULL which requires an exclusive lock on
the table to re-write it entirely. You are just looking to mark space
re-usable for future transactions, not recover the disk space back to the
OS to be consumed again if autovacuum still can't keep up. pg_repack
extension would be an option if you need to recover disk space while online.


Re: 600 million rows of data. Bad hardware or need partitioning?

2020-05-03 Thread Michael Lewis
What kinds of storage (ssd or old 5400 rpm)? What else is this machine
running?

What configs have been customized such as work_mem or random_page_cost?


Re: Duplicate WHERE condition changes performance and plan

2020-05-04 Thread Michael Lewis
Why not vacuum analyze both tables to ensure stats are up to date?

Have you customized default_statistics_target from 100? It may be that 250
would give you a more complete sample of the table without increasing the
size of the stats tables too much such that planning time increases hugely.

Do you know if any of these columns are correlated? Custom stats with
CREATE STATISTICS may help the planner make better decisions if so.

I usually hesitate to put any boolean field in an index. Do you only need
the rows where proposal.has_been_anonymised is false? If so, you could add
that as a WHERE condition on the index instead of including it as the
leading column.
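
Concretely, the two ideas above might look like this ("status" and "created_at"
are made-up column names; has_been_anonymised is from your description):

CREATE STATISTICS proposal_status_anon_stats (dependencies)
    ON status, has_been_anonymised FROM proposal;
ANALYZE proposal;

CREATE INDEX ON proposal (created_at)
    WHERE has_been_anonymised = false;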


Re: good book or any other resources for Postgresql

2020-05-04 Thread Michael Lewis
I don't know the others, but have enjoyed and learned a great deal from The
Art of PostgreSQL.

>


Re: AutoVacuum and growing transaction XID's

2020-05-07 Thread Michael Lewis
It is trying to do a vacuum freeze. Do you have autovacuum turned off? Any
settings changed from default related to autovacuum?

https://www.postgresql.org/docs/9.6/routine-vacuuming.html
Read 24.1.5. Preventing Transaction ID Wraparound Failures

These may also be of help-
https://info.crunchydata.com/blog/managing-transaction-id-wraparound-in-postgresql
https://www.2ndquadrant.com/en/blog/managing-freezing/

Note that you need to ensure the server gets caught up, or you risk being
locked out to prevent data corruption.
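
To see how close you are, these catalog queries are safe to run anytime:

SELECT datname, age(datfrozenxid) AS xid_age
  FROM pg_database ORDER BY 2 DESC;

SELECT relname, age(relfrozenxid) AS xid_age
  FROM pg_class
 WHERE relkind IN ('r', 't', 'm')
 ORDER BY 2 DESC LIMIT 20;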


Re: AutoVacuum and growing transaction XID's

2020-05-08 Thread Michael Lewis
autovacuum_naptime being only 5 seconds seems too frequent. A lock_timeout
might be 1-5 seconds depending on your system. Usually, DDL can fail and
wait a little time rather than lock the table for minutes and have all
reads back up behind the DDL.

Given you have autovacuum_vacuum_cost_limit set to unlimited (seems very
odd), I'm not sure a manual vacuum freeze command on the tables with high
age would perform differently. Still, issuing a vacuum freeze and then
killing the autovacuum process might be worth trying.


Re: Request to help on GIS Query improvement suggestion.

2020-05-22 Thread Michael Lewis
Your indexes and operators are not compatible. You have added a btree index
on md5 function result and are not using md5 in your query, and also using
LIKE operator not one of the supported ones. I believe it might use a btree
operator (plain value, not md5 result) if you are always searching for
"string starts with  but I don't know what it ends with" but you can't
possibly use a btree index where you are putting a wild card at the front.

https://www.postgresql.org/docs/9.5/indexes-types.html

a gist index operators supported-
https://www.postgresql.org/docs/9.5/gist-builtin-opclasses.html

Here's a whole page on full text search, it would be worth a read-
https://www.postgresql.org/docs/9.5/textsearch-tables.html#TEXTSEARCH-TABLES-INDEX


Re: view reading information_schema is slow in PostgreSQL 12

2020-06-15 Thread Michael Lewis
On Fri, Jun 12, 2020 at 12:26 PM regrog  wrote:

> I'm facing performance issues migrating from postgres 10 to 12 (also from
> 11
> to 12) even with a new DB.
> Th performance difference is huge 300ms in pg10 vs 3 minutes in pg12.
>
> I have a view that abstracts the data in the database:
>
> CREATE OR REPLACE VIEW public.my_constraints
>


Assuming your DDL changes fairly seldomly, and you already have a well
structured deployment process in place for that, perhaps just change this
to a materialized view and refresh (concurrently) after any DDL gets
executed. That way, you have stats on what your view has in it and are not
subject to issues with planning the execution of the query in this view.
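
A rough sketch of that, assuming the view keeps its current definition (the
unique index is required before REFRESH ... CONCURRENTLY will work; pick whatever
column combination is actually unique in your output):

CREATE MATERIALIZED VIEW public.my_constraints_mv AS
SELECT * FROM public.my_constraints;

CREATE UNIQUE INDEX ON public.my_constraints_mv (table_schema, table_name, constraint_name);

-- after each deployment that changes DDL:
REFRESH MATERIALIZED VIEW CONCURRENTLY public.my_constraints_mv;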


Re: simple query running for ever

2020-06-16 Thread Michael Lewis
On Tue, Jun 16, 2020 at 2:35 PM Nagaraj Raj  wrote:

> I wrote a simple query, and it is taking too long, not sure what is wrong
> in it, even its not giving EXPLAIN ANALYZE.
>

More context is needed. Please review-

https://wiki.postgresql.org/wiki/Slow_Query_Questions


Re: Unclamped row estimates whith OR-ed subplans

2020-06-19 Thread Michael Lewis
>
> While you're waiting, you might think about recasting the query to
> avoid the OR.  Perhaps you could do a UNION of two scans of the
> transactions table?
>

Minor note- use UNION ALL to avoid the dedupe work if you already know
those will be distinct sets, or having duplicates is fine.
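
For example, an OR across two different subplans can often be recast roughly like
this (table and column names below are placeholders):

SELECT t.* FROM transactions t
 WHERE t.account_id IN (SELECT account_id FROM flagged_accounts)
UNION ALL
SELECT t.* FROM transactions t
 WHERE t.counterparty_id IN (SELECT account_id FROM flagged_accounts);
-- use plain UNION, or add an exclusion to the second branch, if one row could
-- satisfy both conditions and duplicates are not acceptable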


Re: Sudden insert performance degradation

2020-07-13 Thread Michael Lewis
Is this an insert only table and perhaps not being picked up by autovacuum?
If so, try a manual "vacuum analyze" before/after each batch run perhaps.
You don't mention updates, but you also have been adjusting fillfactor, so I
am not sure.


Re: Hstore index for full text search

2020-08-11 Thread Michael Lewis
Hash Cond: (o.courier_id = cc.id)
Filter: (((o.tracker_code)::text ~~* '%1654323%'::text) OR
((table_cus.name)::text
~~* '%1654323%'::text) OR ((au.username)::text ~~ '%1654323%'::text) OR
((o.source)::text ~~* '%1654323%'::text) OR ((o.ops -> 'shop'::text) ~~*
'%1654323%'::text) OR ((o.ops -> 'camp_code'::text) ~~* '%1654323%'::text)
OR ((city.name)::text ~~* '%1654323%'::text) OR ((co.name)::text ~~*
'%1654323%'::text) OR ((o.tr_code)::text ~~* '%1654323%'::text) OR ((o.ops
? 'shipping_company'::text) AND ((o.ops -> 'shipping_company'::text) ~~*
'%1654323%'::text)) OR ((cc.name)::text ~~* '%1654323%'::text))


All those OR conditions on different tables and fields seems like it will
be unlikely that the planner will do anything with the index you are trying
to create (for this query).

On the error, I came across discussions on dba.stackexchange.com
referencing a limit of about 1/3 of the page size (8192) for every
key because of it being a btree underneath. It could be one or more of your
keys in ops (like shop, camp_code, and shipping_company) is much longer
than those examples shown in the query.


Re: Hstore index for full text search

2020-08-11 Thread Michael Lewis
On Tue, Aug 11, 2020 at 4:46 PM Tom Lane  wrote:

> A GIN index on an hstore column only provides the ability to search for
> exact matches to hstore key strings.  There are a few bells and whistles,
> like the ability to AND or OR such conditions.  But basically it's just an
> exact-match engine, and it doesn't index the hstore's data values at all
> (which is why the implementors weren't too concerned about having a length
> limit on the index entries).  There is 0 chance of this index type being
> useful for what the OP wants to do.
>

Thanks for sharing. More like json path ops and not the full key and value.
Interesting.


> Another point is that you will only
> get an indexscan if *every* OR'd clause matches some index.  The example
> query looks sufficiently unstructured that that might be hard to ensure.
>

Does this still apply when the where clauses are on several tables and not
just one?


Re: CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED

2020-08-18 Thread Michael Lewis
Message queue...
Are rows deleted? Are they updated once or many times? Have you adjusted
fillfactor on table or indexes? How many rows in the table currently or on
average? Is there any ordering to which rows you update?

It seems likely that one of the experts/code contributors will chime in and
explain about how locking that many rows in that many concurrent
connections means that some resource is overrun and so you are escalating
to a table lock instead of actually truly locking only the 250 rows you
wanted.

On the other hand, you say 80 cores and you are trying to increase the
number of concurrent processes well beyond that without (much) disk I/O
being involved. I wouldn't expect that to perform awesome.

Is there a chance to modify the code to permit each process to lock 1000
rows at a time and be content with 64 concurrent processes?


Re: CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED

2020-08-18 Thread Michael Lewis
On Tue, Aug 18, 2020 at 6:22 PM Jim Jarvie  wrote:

> There is some ordering on the select [ ORDER BY q_id] so each block of 250
> is sequential-ish queue items; I just need them more or less in the order
> they were queued so as near FIFO as possible without being totally strict
> on absolute sequential order.
>
How long does each process take in total? How strict does that FIFO really
need to be when you are already doing SKIP LOCKED anyway?

Table has around 192K rows, as a row is processed it is deleted as part of
> the transaction with a commit at the end after all 250 are processed
> [partitioned table, state changes and it migrates to a different partition]
> and as the queue drops to 64K it is added to with 128K rows at a time.
>
Can you expound on the partitioning? Are all consumers of the queue always
hitting one active partition, and anytime a row is processed, it always
moves to one of many "archived"-type partitions?

Less processes does not give the throughput required because the queue
> sends data elsewhere which has a long round trip time
>

Is that done via FDW or otherwise within the same database transaction? Are
you connecting some queue consumer application code to Postgres, select for
update, doing work on some remote system that is slow, and then coming back
and committing the DB work?

By the way, top-posting is discouraged here and partial quotes with
interspersed comments are common practice.


Re: CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED

2020-08-18 Thread Michael Lewis
Also, have you checked how bloated your indexes are getting? Do you run
default autovacuum settings? Did you update to the new default 2ms cost
delay value? With a destructive queue, it would be very important to ensure
autovacuum is keeping up with the churn. Share your basic table structure
and indexes, sanitized if need be.

>


Re: CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED

2020-08-20 Thread Michael Lewis
Great to hear that some of the issues are now mitigated. Though, perhaps
you actually require that ORDER BY if items are expected to be sitting in
the queue quite some time because you have incoming queue items in a burst
pattern and have to play catch up sometimes. If so, I highly suspect the
index on q_id is becoming very bloated and reindex concurrently would help.

> Partitions are list partitioned as 'incoming', 'processing', 'retry',
'ok', 'failed':

I am unclear on what purpose a "processing" status would have. Shouldn't a
row be in the incoming status & locked by select for update, until it
either gets updated to ok or failed (or left alone if retry is needed)?
What purpose do the retry and processing statuses serve? I don't understand
your full workflow to venture a guess on how you are hitting that error
regarding a row being in the wrong partition, but fewer main level
partitions and removing unneeded updates seems likely to help or resolve
the issue perhaps.

I don't know if you might have missed my last message, and the suggestion
from Laurenz to check pgstattuple.

At a high level, it seems like any needed update to the rows would result
in it being removed from the current partition and moved to another
partition. If you are doing this in a transaction block, then you could
just as well skip the select for update and just DELETE ... RETURNING from
the existing partition and insert into the new partition later (use a
select for update if you want to order the deletes*). If your transaction
fails and gets rolled back, then the delete won't have happened and the row
will get picked up by the next consumer.
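
A sketch of that shape, reusing the column names from your query (illustrative
only, not tested against your schema):

WITH picked AS (
    SELECT queueid
      FROM mq.queue
     WHERE (state = 'tx_active' OR state = 'tx_fail_retryable')
       AND txobject = 'ticket'
     ORDER BY queueid              -- only if near-FIFO really matters
     LIMIT 250
       FOR UPDATE SKIP LOCKED
)
DELETE FROM mq.queue q
 USING picked
 WHERE q.queueid = picked.queueid
RETURNING q.queueid, q.txobject, q.objectid, q.state;
-- process the returned rows and insert them wherever they belong before COMMIT;
-- a rollback leaves them in place for the next consumer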

Another thought is that I don't know how performant that hash partitioning
will be for select for update, particularly if that targets many partitions
potentially. Would it be feasible to match the number of partitions to the
number of consumers and actually have each of them working on one?


*
https://www.2ndquadrant.com/en/blog/what-is-select-skip-locked-for-in-postgresql-9-5/


Re: CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED

2020-08-20 Thread Michael Lewis
Can you share an explain analyze for the query that does the select for
update? I wouldn't assume that partition pruning is possible at all with
hash, and it would be interesting to see how it is finding those rows.

>


Re: CPU hogged by concurrent SELECT..FOR UPDATE SKIP LOCKED

2020-08-20 Thread Michael Lewis
On Thu, Aug 20, 2020 at 4:40 PM Jim Jarvie  wrote:

> On 20-Aug.-2020 17:42, Michael Lewis wrote:
>
> Can you share an explain analyze for the query that does the select for
> update? I wouldn't assume that partition pruning is possible at all with
> hash, and it would be interesting to see how it is finding those rows.
>
> Well this got interesting  - the already moved error showed up:  Note, the
> actual process partitions are regular table partitions, these are not
> hashed.  Only the incoming and completed are hashed due to row counts at
> either end of the processing; in flight (where the issue shows up) is quite
> small:
>
> [queuedb] # explain analyze select queueid,txobject,objectid,state from
> mq.queue where (state = 'tx_active' or state='tx_fail_retryable') and
> txobject = 'ticket' limit 250 for update skip locked;
> ERROR:  40001: tuple to be locked was already moved to another partition
> due to concurrent update
> LOCATION:  heapam_tuple_lock, heapam_handler.c:405
> Time: 579.131 ms
>
That is super curious. I hope that someone will jump in with an explanation
or theory on this.

I still wonder why the move between partitions is needed though if the work
is either done (failed or successful) or not done... not started, retry
needed or in progress... it doesn't matter. It needs to get picked up by
the next process if it isn't already row locked.

>


Re: Too few rows expected by Planner on partitioned tables

2020-08-26 Thread Michael Lewis
On Wed, Aug 26, 2020, 1:37 AM Julian Wolf  wrote:

> Hi Justin,
>
> thank you very much for your help and sorry for the late answer.
>
> After testing around with your suggestions, it actually was the daterange
> type which caused all the problems. Messing around with the statistics
> value improved performance drastically but did not solve the problem. We
> decided to replace the daterange type with a BIGINT and calculate the "id"
> of the daterange by just using the BIGINT (2x 4 bytes) representation of
> the daterange. Thus, it can be transformed in both directions immutably.
>
> CREATE OR REPLACE FUNCTION to_daterange_id(daterange DATERANGE)
> RETURNS BIGINT
> IMMUTABLE
> LANGUAGE plpgsql
> AS
> $$
> BEGIN
> return (extract(EPOCH FROM lower(daterange))::BIGINT << 32) |
>extract(EPOCH FROM upper(daterange))::BIGINT;
> end;
> $$;
>
> --
> CREATE OR REPLACE FUNCTION to_daterange(daterange_id BIGINT)
> RETURNS DATERANGE
> IMMUTABLE
> LANGUAGE plpgsql
> AS
> $$
> BEGIN
> RETURN daterange(to_timestamp(daterange_id >> 32)::DATE, 
> to_timestamp(daterange_id & x'FFFFFFFF'::BIGINT)::DATE);
> END;
> $$;
>
>
You might want to consider changing that language declaration to SQL.

>
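
For instance, the same logic as plain SQL functions (untested, and assuming the
mask in to_daterange really is the low 32 bits):

CREATE OR REPLACE FUNCTION to_daterange_id(daterange DATERANGE)
RETURNS BIGINT IMMUTABLE LANGUAGE sql AS
$$
SELECT (extract(EPOCH FROM lower(daterange))::BIGINT << 32) |
       extract(EPOCH FROM upper(daterange))::BIGINT;
$$;

CREATE OR REPLACE FUNCTION to_daterange(daterange_id BIGINT)
RETURNS DATERANGE IMMUTABLE LANGUAGE sql AS
$$
SELECT daterange(to_timestamp(daterange_id >> 32)::DATE,
                 to_timestamp(daterange_id & x'FFFFFFFF'::BIGINT)::DATE);
$$;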


Re: Query performance issue

2020-09-04 Thread Michael Lewis
CREATE INDEX receiving_item_delivered_received
ON receiving_item_delivered_received USING btree ( eventtype,
replenishmenttype, serial_no, eventtime DESC );

>
More work_mem as Tomas suggests, but also, the above index should find the
candidate rows by the first two keys, and then be able to skip the sort by
reading just that portion of the index that matches

eventtype='LineItemdetailsReceived'
and replenishmenttype = 'DC2SWARRANTY'


Re: Query performance issue

2020-09-04 Thread Michael Lewis
Note- you may need to vacuum* the table to get full benefit of index only
scan by updating the visibility map. I think index only scan is skipped in
favor of just checking visibility when the visibility map is stale.

*NOT full


Re: Query performance issue

2020-09-04 Thread Michael Lewis
"Subquery Scan on rec (cost=1628601.89..1676580.92 rows=7381 width=41)
(actual time=22171.986..23549.079 rows=1236042 loops=1)" " Filter:
(rec.mpos = 1)" " Rows Removed by Filter: 228737" " Buffers: shared hit=45
read=1166951" " I/O Timings: read=29.530" " -> WindowAgg
(cost=1628601.89..1658127.45 rows=1476278 width=49) (actual
time=22171.983..23379.219 rows=1464779 loops=1)" " Buffers: shared hit=45
read=1166951" " I/O Timings: read=29.530" " -> Sort
(cost=1628601.89..1632292.58 rows=1476278 width=41) (actual
time=22171.963..22484.044 rows=1464779 loops=1)" " Sort Key:
receiving_item_delivered_received.serial_no,
receiving_item_delivered_received.eventtime DESC" " Sort Method: quicksort
Memory: 163589kB" " Buffers: shared hit=45 read=1166951" " I/O Timings:
read=29.530" " -> Gather (cost=1000.00..1477331.13 rows=1476278 width=41)
(actual time=1.296..10428.060 rows=1464779 loops=1)" " Workers Planned: 2" "
Workers Launched: 2" " Buffers: shared hit=39 read=1166951" " I/O Timings:
read=29.530" " -> Parallel Seq Scan on receiving_item_delivered_received
(cost=0.00..1328703.33 rows=615116 width=41) (actual time=1.262..10150.325
rows=488260 loops=3)" " Filter: (((COALESCE(serial_no, ''::character
varying))::text <> ''::text) AND ((eventtype)::text =
'LineItemdetailsReceived'::text) AND ((replenishmenttype)::text =
'DC2SWARRANTY'::text))" " Rows Removed by Filter: 6906258" " Buffers:
shared hit=39 read=1166951" " I/O Timings: read=29.530" "Planning Time:
0.375 ms" "Execution Time: 23617.348 ms"


That is doing a lot of reading from disk. What do you have shared_buffers
set to? I'd expect better cache hits unless it is quite low or this is a
query that differs greatly from the typical work.

Also, did you try adding the index I suggested? That lowest node has 488k
rows coming out of it after throwing away 6.9 million. I would expect an
index on only eventtype, replenishmenttype to be quite helpful. I don't
assume you have tons of rows where serial_no is null.


Re: Query performance issue

2020-09-05 Thread Michael Lewis
On Fri, Sep 4, 2020, 4:20 PM Nagaraj Raj  wrote:

> Hi Mechel,
>
> I added the index as you suggested and the planner going through the
> bitmap index scan,heap and the new planner is,
> HaOx | explain.depesz.com
>
> Mem config:
>
> Aurora PostgreSQL 11.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC)
> 4.9.3, 64-bit
> vCPU = 64
> RAM = 512
> show shared_buffers = 355 GB
> show work_mem = 214 MB
> show maintenance_work_mem = 8363MB
> show effective_cache_size = 355 GB
>

I'm not very familiar with Aurora, but I would certainly try the explain
analyze with timing OFF and verify that the total time is similar. If the
system clock is slow to read, execution plans can be significantly slower
just because of the cost to measure each step.

That sort being so slow is perplexing. Did you do the two column or four
column index I suggested?

Obviously it depends on your use case and how much you want to tune this
specific query, but you could always try a partial index matching the where
condition and just index the other two columns to avoid the sort.
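
For example, building on the index suggested earlier in the thread (untested;
adjust names as needed):

CREATE INDEX CONCURRENTLY receiving_item_warranty_received
    ON receiving_item_delivered_received (serial_no, eventtime DESC)
 WHERE eventtype = 'LineItemdetailsReceived'
   AND replenishmenttype = 'DC2SWARRANTY';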


Re: Performance Issue (Not using Index when joining two tables).

2020-09-13 Thread Michael Lewis
Filter: ((("current_user"())::text <> ANY ('{wfnadmin,skipvpd}'::text[]))
AND f_sel_policy_all(vpd_key, 'CH
R_EMP_POSITION'::character varying) AND
f_sel_policy_prod_locale((cep.*)::character
varying, prod_locale_code))

This looks like some stuff for row level security perhaps. My understanding
is limited, but perhaps those restrictions are influencing the planner's
access to or reliance on stats.

Also, it would seem like you need the entire table since you don't have an
explicit where clause. Why would scanning an index and then also visiting
every row in the table be faster than just going directly to the table?


Re: Single column vs composite partial index

2020-09-18 Thread Michael Lewis
Index Cond: ((load_dttm)::date >= (CURRENT_DATE - 7))

There is no need to cast the load_dttm field to a date in the query. The
plain index on the field would be usable if you skipped that. In your
example, you show creating the single column index but it isn't getting
used because of the type cast. The second index is both partial, and
multi-column. If your data statistics show that ((actv_code)::text = ANY
('{NAC,CAN,RSP,RCL}'::text[])) only 1% of the time, then it would certainly
be helpful to have a partial index if those are the rows you want to find
often and do so quickly. If the rows with those values for actv_code is
more like 75% of the total rows, then there'd be no reason to make it
partial IMO.

If you are often/constantly querying for only the last 7-7.999 days of data
based on load_dttm, I would put that as the first column of the index since
then you would be scanning a contiguous part rather than scanning 3
different parts of the composite index where actv_code = each of those
three values, and then finding the rows that are recent based on the
timestamp(tz?) field.
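
In other words, something like this (table name is a placeholder; keep the
partial predicate only if those actv_code values really are a small fraction of
the rows):

CREATE INDEX ON my_table (load_dttm, actv_code);
-- or
CREATE INDEX ON my_table (load_dttm) WHERE actv_code IN ('NAC','CAN','RSP','RCL');
-- and in the query compare the column directly, with no ::date cast:
--   WHERE load_dttm >= CURRENT_DATE - 7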


Re: Too many waits on extension of relation

2020-10-05 Thread Michael Lewis
What is relation 266775 of database 196511? Is
it cms_c207c1e2_0ce7_422c_aafb_77d43f61e563.cms_item or some system catalog
table?

When I search google for "ExclusiveLock on extension of relation" I find
one thread about shared_buffers being very high but not big enough to fit
the entire data in the cluster. How much ram, what is shared buffers and
what is the total size of the database(s) on that Postgres instance?

>


Re: Slow Query

2020-10-14 Thread Michael Lewis
Is there no index on thread.spool? What about notification.user? How about
message.time (without thread as a leading column). Those would all seem
very significant. Your row counts are very low to have a query perform so
badly. Work_mem could probably be increased above 4MB, but it isn't hurting
this query in particular.

My primary concern is that the query is rather chaotic at a glance. It
would be great to re-write and remove the unneeded keywords, double quotes,
totally worthless parentheses, etc. Something like the below may help you
see the crux of the query and what could be done and understand how many
rows might be coming out of those subqueries. I re-ordered some joins and
there might be syntax errors, but give it a shot once you've added the
indexes suggested above.

SELECT
    spool.id,
    handle.handle,
    spool.name,
    thread.id,
    case.closed,
    notification.read,
    notification2.time,
    message.message,
    message.time,
    message.author,
    thread.name,
    location.geo
FROM
    spool
    JOIN handle ON handle.id = spool.id
    JOIN thread ON thread.spool = spool.id
    JOIN message ON message.thread = thread.id
    LEFT JOIN location ON location.id = spool.location
    LEFT JOIN case ON case.id = spool.id
    LEFT JOIN notification ON notification.user = 'b16690e4-a3c5-4868-945e-c2458c27a525'
        AND notification.id = (
            SELECT notification3.id
            FROM notification AS notification3
            JOIN notification_thread ON notification_thread.id = notification3.id
            JOIN thread AS thread2 ON thread2.id = notification_thread.thread
            WHERE thread2.spool = spool.id
                AND notification3.user = 'b16690e4-a3c5-4868-945e-c2458c27a525'
                AND notification3.time <= '2020-09-30 16:32:38.054558'
            ORDER BY notification3.time DESC
            LIMIT 1
        )
    LEFT JOIN notification AS notification2 ON notification2.user = 'b16690e4-a3c5-4868-945e-c2458c27a525'
        AND notification2.id = (
            SELECT notification3.id
            FROM notification AS notification3
            JOIN notification_thread ON notification_thread.id = notification3.id
            JOIN thread AS thread2 ON thread2.id = notification_thread.thread
            WHERE thread2.spool = spool.id
                AND notification3.user = 'b16690e4-a3c5-4868-945e-c2458c27a525'
                AND notification3.time > '2020-09-30 16:32:38.054558'
            ORDER BY notification3.time DESC
            LIMIT 1
        )
WHERE
    message.time = (
        SELECT MAX ( message2.time )
        FROM message AS message2
        JOIN thread AS thread2 ON thread2.id = message2.thread
        JOIN participant ON participant.thread = thread2.id
        JOIN identity ON identity.id = participant.identity
        LEFT JOIN relation ON relation.to = identity.id
            AND relation.from = 'b16690e4-a3c5-4868-945e-c2458c27a525'
            AND relation.manages = TRUE
        WHERE NOT message2.draft
            AND ( identity.id = 'b16690e4-a3c5-4868-945e-c2458c27a525' OR NOT relation.to IS NULL )
            AND thread2.spool = spool.id
        LIMIT 1
    )
    AND notification.id IS NOT NULL
ORDER BY
    message.time DESC
LIMIT 31;


Re: Slow Query

2020-10-14 Thread Michael Lewis
Based on the execution plan, it looks like the part that takes 13 seconds
of the total 14.4 seconds is just calculating the max time used in the
where clause. Anytime I see an OR involved in a plan gone off the rails, I
always always check if re-writing the query some other way may be faster.
How's the plan for something like this?


WHERE message.time = greatest( *sub1.time*, *sub2.time* )

/* sub1.time */
(
select
MAX ( message2.time )
FROM
message AS message2
JOIN thread AS thread2 ON thread2.id = message2.thread
JOIN participant ON participant.thread = thread2.id
WHERE
NOT message2.draft
AND participant.identity = 'b16690e4-a3c5-4868-945e-c2458c27a525'
AND thread2.spool = spool.id
)

/* sub2.time */
(
select
MAX ( message2.time )
FROM
message AS message2
JOIN thread AS thread2 ON thread2.id = message2.thread
JOIN participant ON participant.thread = thread2.id
JOIN relation ON relation.to = participant.identity
AND relation.from = 'b16690e4-a3c5-4868-945e-c2458c27a525'
AND relation.manages = TRUE
WHERE
NOT message2.draft
AND thread2.spool = spool.id
)

>


Re: CPU Consuming query. Sequential scan despite indexing.

2020-10-15 Thread Michael Lewis
What version by the way? Do you get a faster execution if you disable
sequential scan? Or set parallel workers per gather to 0? Your estimates
look decent as do cache hits, so other than caching data or upgrading
hardware, not sure what else there is to be done.

Although... you are hitting 70k blocks to read only 612k rows? Are these
job records very wide perhaps, or do you need to do some vacuuming? Perhaps
autovacuum is not keeping up and you could use some repacking or vacuum
full if/when you can afford downtime. If you create a temp table copy of
the job table, how does the size compare to the live table?


Re: CPU Consuming query. Sequential scan despite indexing.

2020-10-19 Thread Michael Lewis
Reply to the group, not just me please. Btw, when you do reply to the
group, it is best practice on these lists to reply in-line and not just
reply on top with all prior messages quoted.

On Sun, Oct 18, 2020 at 3:23 AM aditya desai  wrote:

> I tried vacuum full and execution time came down to half.
>
Great to hear.


> However, it still consumes CPU. Setting parallel workers per gather to 0
> did not help much.
>
You didn't answer all of my questions, particularly about disabling
sequential scan. If you still have the default random_page_cost of 4, it
might be that 1.5 allows better estimates for cost on index (random) vs
sequential scan of a table.

Laurenz is a brilliant guy. I would implement the indexes he suggests if
you don't have them already and report back. If the indexes don't get used,
try set enable_seqscan = false; before the query and if it is way faster,
then reduce random_page_cost to maybe 1-2 depending how your overall cache
hit ratio is across the system.
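
Session-level testing for that might look like this (settings revert when the
session ends):

SET enable_seqscan = off;
-- re-run the query with EXPLAIN (ANALYZE, BUFFERS) and compare
RESET enable_seqscan;

SET random_page_cost = 1.5;
-- re-run the query again; if the index plan wins, consider making this permanent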


> Auto vacuuming is catching up just fine. No issues in that area.
>
If the time came down by half after 'vacuum full', I would question that
statement.


> Temp table size is less that original tables without indexes.
>
Significantly less would indicate the regular table still being bloated I
think. Maybe someone else will suggest otherwise.


> Does this mean we need to upgrade the hardware? Also by caching data , do
> you mean caching at application side(microservices side) ? Or on postgres
> side? I tried pg_prewarm, it did not help much.
>
I can't say about hardware. Until you have exhausted options like configs
and indexing, spending more money forever onwards seems premature. I meant
pre-aggregated data, wherever it makes sense to do that. I wouldn't expect
pg_prewarm to do a ton since you already show high cache hits.


> It is actually the CPU consumption which is the issue. Query is fast
> otherwise.
>
Sure, but that is a symptom of reading and processing a lot of data.

>

