Re: CPU Consuming query. Sequential scan despite indexing.

2020-10-20 Thread aditya desai
Hi Michael,
Will follow standard practice going forward. We are in the process of
rebuilding the PST environment equivalent to Prod where these Load tests
were done. I will implement all these suggestions on that environment and
reply back. Sincere apologies for the delay.

Regards,
Aditya.

On Mon, Oct 19, 2020 at 9:50 PM Michael Lewis  wrote:

> Reply to the group, not just me please. Btw, when you do reply to the
> group, it is best practice on these lists to reply in-line and not just
> reply on top with all prior messages quoted.
>
> On Sun, Oct 18, 2020 at 3:23 AM aditya desai  wrote:
>
>> I tried vacuum full and execution time came down to half.
>>
> Great to hear.
>
>
>> However, it still consumes CPU. Setting parallel workers per gather to 0
>> did not help much.
>>
> You didn't answer all of my questions, particularly about disabling
> sequential scan. If you still have the default random_page_cost of 4, it
> might be that 1.5 allows better estimates for cost on index (random) vs
> sequential scan of a table.
>
> Laurenz is a brilliant guy. I would implement the indexes he suggests if
> you don't have them already and report back. If the indexes don't get used,
> try set enable_seqscan = false; before the query and if it is way faster,
> then reduce random_page_cost to maybe 1-2 depending how your overall cache
> hit ratio is across the system.
>
>
>> Auto vacuuming is catching up just fine. No issues in that area.
>>
> If the time came down by half after 'vacuum full', I would question that
> statement.
>
>
>> Temp table size is less than that of the original tables without indexes.
>>
> Significantly less would indicate the regular table still being bloated I
> think. Maybe someone else will suggest otherwise.
>
>
>> Does this mean we need to upgrade the hardware? Also, by caching data, do
>> you mean caching on the application side (microservices) or on the Postgres
>> side? I tried pg_prewarm, it did not help much.
>>
> I can't say about hardware. Until you have exhausted options like configs
> and indexing, spending more money forever onwards seems premature. I meant
> pre-aggregated data, wherever it makes sense to do that. I wouldn't expect
> pg_prewarm to do a ton since you already show high cache hits.
>
>
>> It is actually the CPU consumption which is the issue. Query is fast
>> otherwise.
>>
> Sure, but that is a symptom of reading and processing a lot of data.
>
>>
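
For the pre-aggregation suggestion above, a minimal sketch using the job table from this thread (the grouping columns, view name, and refresh cadence are illustrative assumptions, not something Michael prescribed):

  CREATE MATERIALIZED VIEW job_status_counts AS
  SELECT countrycode, facilitycode, jobstatuscode,
         jobstartdatetime::date AS jobstartdate,
         count(*) AS stat_count
  FROM job
  GROUP BY 1, 2, 3, 4;

  -- Refresh after each batch load, or on a schedule:
  REFRESH MATERIALIZED VIEW job_status_counts;

The landing-page query could then sum stat_count per jobstatuscode over the date range from this much smaller view instead of scanning the job table.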


Query Performance / Planner estimate off

2020-10-20 Thread Mats Julian Olsen
Dear Postgres community,

I'm looking for some help to manage queries against two large tables.

Context:
We run a relatively large PostgreSQL instance (5TB, 32 vCPU, 120GB RAM)
with a hybrid transactional/analytical workload. Data is written in batches
every 15 seconds or so, and all the queryable tables are append-only (we
never update or delete). Our users can run analytical queries on top of
these tables.

We recently came across a series of troublesome queries, one of which I'll
dive into here.

Please see the following gist for both the query we run and the \d+ output:
https://gist.github.com/mewwts/9f11ae5e6a5951593b8999559f5418cf.

The tables in question are:
- `ethereum.transactions`: 833M rows, partitioned, 171M rows after WHERE
- `uniswap_v2."Pair_evt_Swap"`: 12M rows, not partitioned, 12M rows after
WHERE

The crux of our issue is that the query planner chooses a nested loop join
for this query, essentially making this query (and other queries) take a
very long time to complete. In contrast, by toggling `enable_nestloop` and
`enable_seqscan` off we can take the total runtime down from 16 minutes to
2 minutes.

1) Vanilla plan (16 min) : https://explain.depesz.com/s/NvDR
2) enable_nestloop=off (4 min): https://explain.depesz.com/s/buKK
3) enable_nestloop=off; enable_seqscan=off (2 min):
https://explain.depesz.com/s/0WXx

How can I get Postgres not to loop over 12M rows?

Let me know if there is anything I left out here that would be useful for
further debugging.

-- 
Mats
CTO @ Dune Analytics
We're hiring: https://careers.duneanalytics.com


Re: Query Performance / Planner estimate off

2020-10-20 Thread David Rowley
On Tue, 20 Oct 2020 at 22:38, Mats Julian Olsen  wrote:
>
> The crux of our issue is that the query planner chooses a nested loop join 
> for this query. Essentially making this query (and other queries) take a very 
> long time to complete. In contrast, by toggling `enable_nestloop` and 
> `enable_seqscan` off we can take the total runtime down from 16 minutes to 2 
> minutes.
>
> 1) Vanilla plan (16 min) : https://explain.depesz.com/s/NvDR
> 2) enable_nestloop=off (4 min): https://explain.depesz.com/s/buKK
> 3) enable_nestloop=off; enable_seqscan=off (2 min): 
> https://explain.depesz.com/s/0WXx
>
> How can I get Postgres not to loop over 12M rows?

You'll likely want to look at what random_page_cost is set to. If the
planner is preferring nested loops then it may be too low.  You'll
also want to see if effective_cache_size is set to something
realistic.  Higher values of that will make the planner prefer nested
loops like this.
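
For instance, a session-level experiment along those lines (the values are purely illustrative, and the ellipsis stands for the query from the gist):

  SHOW random_page_cost;
  SHOW effective_cache_size;
  -- Try different costing for this session only, then re-check the plan:
  SET random_page_cost = 2;
  SET effective_cache_size = '60GB';
  EXPLAIN (ANALYZE, BUFFERS) ...;  -- re-run the query from the gist here
  RESET random_page_cost;
  RESET effective_cache_size;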

You may also want to reduce max_parallel_workers_per_gather.  It looks
like you're not getting your parallel workers as often as you'd like.
If the planner chooses a plan thinking it's going to get some workers
and gets none, then that plan may be inferior to the one that the planner
would have chosen if it had known the workers would be unavailable.
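
For example, to see what the planner does when it cannot count on workers at all (session-level only; the ellipsis stands for the query from the gist):

  SET max_parallel_workers_per_gather = 0;
  EXPLAIN (ANALYZE, BUFFERS) ...;
  RESET max_parallel_workers_per_gather;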

> Let me know if there is anything I left out here that would be useful for 
> further debugging.

select name, setting from pg_settings where category like 'Query Tuning%' and source <> 'default';
select version();

would be useful.

David




Re: Query Performance / Planner estimate off

2020-10-20 Thread Mats Julian Olsen
On Tue, Oct 20, 2020 at 9:50 AM David Rowley  wrote:

> On Tue, 20 Oct 2020 at 22:38, Mats Julian Olsen 
> wrote:
> >
> > The crux of our issue is that the query planner chooses a nested loop
> join for this query. Essentially making this query (and other queries) take
> a very long time to complete. In contrast, by toggling `enable_nestloop`
> and `enable_seqscan` off we can take the total runtime down from 16 minutes
> to 2 minutes.
> >
> > 1) Vanilla plan (16 min) : https://explain.depesz.com/s/NvDR
> > 2) enable_nestloop=off (4 min): https://explain.depesz.com/s/buKK
> > 3) enable_nestloop=off; enable_seqscan=off (2 min):
> https://explain.depesz.com/s/0WXx
> >
> > How can I get Postgres not to loop over 12M rows?
>
> You'll likely want to look at what random_page_cost is set to. If the
> planner is preferring nested loops then it may be too low.  You'll
> also want to see if effective_cache_size is set to something
> realistic.  Higher values of that will prefer nested loops like this.
>

random_page_cost is 1.1 and effective_cache_size is '60GB' (listed in the
gist). random_page_cost may be too low?


> You may also want to reduce max_parallel_workers_per_gather.  It looks
> like you're not getting your parallel workers as often as you'd like.
> If the planner chooses a plan thinking it's going to get some workers
> and gets none, then that plan may be inferior the one that the planner
> would have chosen if it had known the workers would be unavailable.
>

Interesting, here are the values for those:
max_parallel_workers = 8
max_parallel_workers_per_gather = 4


>
> > Let me know if there is anything I left out here that would be useful
> for further debugging.
>
> select name, setting from pg_settings where category like 'Query Tuning%' and source <> 'default';
> select version();
>

default_statistics_target = 500
effective_cache_size = 7864320
random_page_cost = 1.1

PostgreSQL 12.2 (Ubuntu 12.2-2.pgdg19.10+1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008, 64-bit

>
> would be useful.
>
> David
>

Thanks David, see above for more information.

-- 
Mats
CTO @ Dune Analytics
We're hiring: https://careers.duneanalytics.com


Re: Query Performance / Planner estimate off

2020-10-20 Thread Pavel Stehule
On Tue, 20 Oct 2020 at 11:59, Mats Julian Olsen wrote:

> On Tue, Oct 20, 2020 at 9:50 AM David Rowley  wrote:
>
>> On Tue, 20 Oct 2020 at 22:38, Mats Julian Olsen 
>> wrote:
>> >
>> > The crux of our issue is that the query planner chooses a nested loop
>> join for this query. Essentially making this query (and other queries) take
>> a very long time to complete. In contrast, by toggling `enable_nestloop`
>> and `enable_seqscan` off we can take the total runtime down from 16 minutes
>> to 2 minutes.
>> >
>> > 1) Vanilla plan (16 min) : https://explain.depesz.com/s/NvDR
>> > 2) enable_nestloop=off (4 min): https://explain.depesz.com/s/buKK
>> > 3) enable_nestloop=off; enable_seqscan=off (2 min):
>> https://explain.depesz.com/s/0WXx
>> >
>> > How can I get Postgres not to loop over 12M rows?
>>
>> You'll likely want to look at what random_page_cost is set to. If the
>> planner is preferring nested loops then it may be too low.  You'll
>> also want to see if effective_cache_size is set to something
>> realistic.  Higher values of that will prefer nested loops like this.
>>
>
> random_page_cost is 1.1 and effective_cache_size is '60GB' (listed in the
> gist). random_page_cost may be too low?
>

random_page_cost 2 is safer - the value 1.5 is a little bit aggressive for
me.
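
A sketch of applying that persistently (assumes superuser access; a plain SET suffices for per-session testing):

  ALTER SYSTEM SET random_page_cost = 2;
  SELECT pg_reload_conf();  -- picked up without a restart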


>
>> You may also want to reduce max_parallel_workers_per_gather.  It looks
>> like you're not getting your parallel workers as often as you'd like.
>> If the planner chooses a plan thinking it's going to get some workers
>> and gets none, then that plan may be inferior the one that the planner
>> would have chosen if it had known the workers would be unavailable.
>>
>
> Interesting, here are the values for those:
> max_parallel_workers = 8
> max_parallel_workers_per_gather = 4
>
>
>>
>> > Let me know if there is anything I left out here that would be useful
>> for further debugging.
>>
>> select name, setting from pg_settings where category like 'Query Tuning%' and source <> 'default';
>> select version();
>>
>
> default_statistics_target = 500
> effective_cache_size = 7864320
> random_page_cost = 1.1
>
> PostgreSQL 12.2 (Ubuntu 12.2-2.pgdg19.10+1) on x86_64-pc-linux-gnu,
> compiled by gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008, 64-bit
>
>>
>> would be useful.
>>
>> David
>>
>
> Thanks David, see above for more information.
>
> --
> Mats
> CTO @ Dune Analytics
> We're hiring: https://careers.duneanalytics.com
>


Re: Query Performance / Planner estimate off

2020-10-20 Thread Mats Julian Olsen
On Tue, Oct 20, 2020 at 10:50 AM Pavel Stehule 
wrote:

>
>
> On Tue, 20 Oct 2020 at 11:59, Mats Julian Olsen <[email protected]> wrote:
>
>> On Tue, Oct 20, 2020 at 9:50 AM David Rowley 
>> wrote:
>>
>>> On Tue, 20 Oct 2020 at 22:38, Mats Julian Olsen 
>>> wrote:
>>> >
>>> > The crux of our issue is that the query planner chooses a nested loop
>>> join for this query. Essentially making this query (and other queries) take
>>> a very long time to complete. In contrast, by toggling `enable_nestloop`
>>> and `enable_seqscan` off we can take the total runtime down from 16 minutes
>>> to 2 minutes.
>>> >
>>> > 1) Vanilla plan (16 min) : https://explain.depesz.com/s/NvDR
>>> > 2) enable_nestloop=off (4 min): https://explain.depesz.com/s/buKK
>>> > 3) enable_nestloop=off; enable_seqscan=off (2 min):
>>> https://explain.depesz.com/s/0WXx
>>> >
>>> > How can I get Postgres not to loop over 12M rows?
>>>
>>> You'll likely want to look at what random_page_cost is set to. If the
>>> planner is preferring nested loops then it may be too low.  You'll
>>> also want to see if effective_cache_size is set to something
>>> realistic.  Higher values of that will prefer nested loops like this.
>>>
>>
>> random_page_cost is 1.1 and effective_cache_size is '60GB' (listed in the
>> gist). random_page_cost may be too low?
>>
>
> random_page_cost 2 is safer - the value 1.5 is a little bit aggressive for
> me.
>

Thanks Pavel. I tried changing random_page_cost from 1.1 to 2, to 3... all
the way up to 10. All values resulted in the same query plan, except for
10, which then executed a parallel hash join (however with sequential
scans) https://explain.depesz.com/s/Srcb.

10 seems like way too high a value for random_page_cost, though?


>
>>
>>> You may also want to reduce max_parallel_workers_per_gather.  It looks
>>> like you're not getting your parallel workers as often as you'd like.
>>> If the planner chooses a plan thinking it's going to get some workers
>>> and gets none, then that plan may be inferior the one that the planner
>>> would have chosen if it had known the workers would be unavailable.
>>>
>>
>> Interesting, here are the values for those:
>> max_parallel_workers = 8
>> max_parallel_workers_per_gather = 4
>>
>>
>>>
>>> > Let me know if there is anything I left out here that would be useful
>>> for further debugging.
>>>
>>> select name, setting from pg_settings where category like 'Query Tuning%' and source <> 'default';
>>> select version();
>>>
>>
>> default_statistics_target = 500
>> effective_cache_size = 7864320
>> random_page_cost = 1.1
>>
>> PostgreSQL 12.2 (Ubuntu 12.2-2.pgdg19.10+1) on x86_64-pc-linux-gnu,
>> compiled by gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008, 64-bit
>>
>>>
>>> would be useful.
>>>
>>> David
>>>
>>
>> Thanks David, see above for more information.
>>
>> --
>> Mats
>> CTO @ Dune Analytics
>> We're hiring: https://careers.duneanalytics.com
>>
>

-- 
Mats
CTO @ Dune Analytics
We're hiring: https://careers.duneanalytics.com


Re: Query Performance / Planner estimate off

2020-10-20 Thread Pavel Stehule
On Tue, 20 Oct 2020 at 13:09, Mats Julian Olsen wrote:

>
>
> On Tue, Oct 20, 2020 at 10:50 AM Pavel Stehule 
> wrote:
>
>>
>>
>> On Tue, 20 Oct 2020 at 11:59, Mats Julian Olsen <[email protected]> wrote:
>>
>>> On Tue, Oct 20, 2020 at 9:50 AM David Rowley 
>>> wrote:
>>>
 On Tue, 20 Oct 2020 at 22:38, Mats Julian Olsen 
 wrote:
 >
 > The crux of our issue is that the query planner chooses a nested loop
 join for this query. Essentially making this query (and other queries) take
 a very long time to complete. In contrast, by toggling `enable_nestloop`
 and `enable_seqscan` off we can take the total runtime down from 16 minutes
 to 2 minutes.
 >
 > 1) Vanilla plan (16 min) : https://explain.depesz.com/s/NvDR
 > 2) enable_nestloop=off (4 min): https://explain.depesz.com/s/buKK
 > 3) enable_nestloop=off; enable_seqscan=off (2 min):
 https://explain.depesz.com/s/0WXx
 >
 > How can I get Postgres not to loop over 12M rows?

 You'll likely want to look at what random_page_cost is set to. If the
 planner is preferring nested loops then it may be too low.  You'll
 also want to see if effective_cache_size is set to something
 realistic.  Higher values of that will prefer nested loops like this.

>>>
>>> random_page_cost is 1.1 and effective_cache_size is '60GB' (listed in
>>> the gist). random_page_cost may be too low?
>>>
>>
>> random_page_cost 2 is safer - the value 1.5 is a little bit aggressive
>> for me.
>>
>
> Thanks Pavel. I tried changing random_page_cost from 1.1 to 2, to 3... all
> the way up to 10. All values resulted in the same query plan, except for
> 10, which then executed a parallel hash join (however with sequential
> scans) https://explain.depesz.com/s/Srcb.
>
> 10 seems like a way too high value for random_page_cost though?
>

It is not usual, but I know of analytics cases that use this value. But
maybe effective_cache_size is too high.

>
>
>>
>>>
 You may also want to reduce max_parallel_workers_per_gather.  It looks
 like you're not getting your parallel workers as often as you'd like.
 If the planner chooses a plan thinking it's going to get some workers
 and gets none, then that plan may be inferior the one that the planner
 would have chosen if it had known the workers would be unavailable.

>>>
>>> Interesting, here are the values for those:
>>> max_parallel_workers = 8
>>> max_parallel_workers_per_gather = 4
>>>
>>>

 > Let me know if there is anything I left out here that would be useful
 for further debugging.

 select name, setting from pg_settings where category like 'Query Tuning%' and source <> 'default';
 select version();

>>>
>>> default_statistics_target = 500
>>> effective_cache_size = 7864320
>>> random_page_cost = 1.1
>>>
>>> PostgreSQL 12.2 (Ubuntu 12.2-2.pgdg19.10+1) on x86_64-pc-linux-gnu,
>>> compiled by gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008, 64-bit
>>>

 would be useful.

 David

>>>
>>> Thanks David, see above for more information.
>>>
>>> --
>>> Mats
>>> CTO @ Dune Analytics
>>> We're hiring: https://careers.duneanalytics.com
>>>
>>
>
> --
> Mats
> CTO @ Dune Analytics
> We're hiring: https://careers.duneanalytics.com
>


Re: Query Performance / Planner estimate off

2020-10-20 Thread Mats Julian Olsen
On Tue, Oct 20, 2020 at 11:16 AM Pavel Stehule 
wrote:

>
>
> On Tue, 20 Oct 2020 at 13:09, Mats Julian Olsen <[email protected]> wrote:
>
>>
>>
>> On Tue, Oct 20, 2020 at 10:50 AM Pavel Stehule 
>> wrote:
>>
>>>
>>>
>>> On Tue, 20 Oct 2020 at 11:59, Mats Julian Olsen <[email protected]> wrote:
>>>
 On Tue, Oct 20, 2020 at 9:50 AM David Rowley 
 wrote:

> On Tue, 20 Oct 2020 at 22:38, Mats Julian Olsen <
> [email protected]> wrote:
> >
> > The crux of our issue is that the query planner chooses a nested
> loop join for this query. Essentially making this query (and other 
> queries)
> take a very long time to complete. In contrast, by toggling
> `enable_nestloop` and `enable_seqscan` off we can take the total runtime
> down from 16 minutes to 2 minutes.
> >
> > 1) Vanilla plan (16 min) : https://explain.depesz.com/s/NvDR
> > 2) enable_nestloop=off (4 min): https://explain.depesz.com/s/buKK
> > 3) enable_nestloop=off; enable_seqscan=off (2 min):
> https://explain.depesz.com/s/0WXx
> >
> > How can I get Postgres not to loop over 12M rows?
>
> You'll likely want to look at what random_page_cost is set to. If the
> planner is preferring nested loops then it may be too low.  You'll
> also want to see if effective_cache_size is set to something
> realistic.  Higher values of that will prefer nested loops like this.
>

 random_page_cost is 1.1 and effective_cache_size is '60GB' (listed in
 the gist). random_page_cost may be too low?

>>>
>>> random_page_cost 2 is safer - the value 1.5 is a little bit aggressive
>>> for me.
>>>
>>
>> Thanks Pavel. I tried changing random_page_cost from 1.1 to 2, to 3...
>> all the way up to 10. All values resulted in the same query plan, except
>> for 10, which then executed a parallel hash join (however with sequential
>> scans) https://explain.depesz.com/s/Srcb.
>>
>> 10 seems like a way too high value for random_page_cost though?
>>
>
> it is not usual, but I know about analytics cases where is this value. But
> maybe  effective_cache_size is too high.
>

Changing effective_cache_size from 10GB up to 60GB does not affect the
nested-loop part of this query plan. It does alter the inner side of the loop
from sequential scans (low setting) to index scans (high setting).
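
The session-level experiment behind that observation looks roughly like this (10GB/60GB are the values under discussion; the ellipsis stands for the query from the gist):

  SET effective_cache_size = '10GB';
  EXPLAIN ...;  -- inner side of the nested loop comes out as a seq scan
  SET effective_cache_size = '60GB';
  EXPLAIN ...;  -- inner side comes out as an index scan
  RESET effective_cache_size;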


>
>>
>>>

> You may also want to reduce max_parallel_workers_per_gather.  It looks
> like you're not getting your parallel workers as often as you'd like.
> If the planner chooses a plan thinking it's going to get some workers
> and gets none, then that plan may be inferior the one that the planner
> would have chosen if it had known the workers would be unavailable.
>

 Interesting, here are the values for those:
 max_parallel_workers = 8
 max_parallel_workers_per_gather = 4


>
> > Let me know if there is anything I left out here that would be
> useful for further debugging.
>
> select name, setting from pg_settings where category like 'Query Tuning%' and source <> 'default';
> select version();
>

 default_statistics_target = 500
 effective_cache_size = 7864320
 random_page_cost = 1.1

 PostgreSQL 12.2 (Ubuntu 12.2-2.pgdg19.10+1) on x86_64-pc-linux-gnu,
 compiled by gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008, 64-bit

>
> would be useful.
>
> David
>

 Thanks David, see above for more information.

 --
 Mats
 CTO @ Dune Analytics
 We're hiring: https://careers.duneanalytics.com

>>>
>>
>> --
>> Mats
>> CTO @ Dune Analytics
>> We're hiring: https://careers.duneanalytics.com
>>
>

-- 
Mats
CTO @ Dune Analytics
We're hiring: https://careers.duneanalytics.com


Re: CPU Consuming query. Sequential scan despite indexing.

2020-10-20 Thread aditya desai
Hi Laurenz,

On Fri, Oct 16, 2020 at 2:06 PM Laurenz Albe 
wrote:

> On Thu, 2020-10-15 at 20:34 +0530, aditya desai wrote:
> > Below query always shows up on top in the CPU matrix. Also despite
> having indexes it does sequential scans
> > (probably because WHERE condition satisfies almost all of the data from
> table). This query
> > runs on the default landing page in application and needs to fetch
> records in less that 100 ms
> >  without consuming too much CPU.
> >
> >  Any opinions? Table is very huge and due to referential identity and
> business requirements we could not
> >  implement partitioning as well.
> >
> > There is index on (countrycode,facilitycode,jobstartdatetime)
> >
> > explain (analyze,buffers) with JobCount as ( select
> jobstatuscode,count(1) stat_count from job j
> >  where 1=1 and j.countrycode = 'TH'
> > and j.facilitycode in
> ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
> >  and ((j.jobstartdatetime  between '2020-08-01 00:00:00' and '2020-09-30
> 00:00:00' ) or j.jobstartdatetime IS NULL )  group by j.jobstatuscode)
> >  select js.jobstatuscode,COALESCE(stat_count,0) stat_count from JobCount
> jc right outer join jobstatus js on jc.jobstatuscode=js.jobstatuscode;
> >
> >   QUERY PLAN
> >
>  Hash Right Join  (cost=98845.93..98846.10 rows=10 width=12) (actual time=1314.809..1314.849 rows=10 loops=1)
>    ->  Parallel Seq Scan on job j  (cost=0.00..96837.93 rows=200963 width=4) (actual time=13.010..1144.434 rows=163200 loops=3)
>          Filter: (((countrycode)::text = 'TH'::text) AND (((jobstartdatetime >= '2020-08-01 00:00:00'::timestamp without time zone) AND (jobstartdatetime <= '2020-09-30 00:00:00'::timestamp without time zone)) OR (jobstartdatetime IS NULL)) AND ((facilitycode)::text = ANY ('{THNPM1,THPRK1,THCNT1,THSPN1,THKRI1,THPKN1,THSBI1,THUTG1,THLRI1,THSRI1,THSUR1,THSKM1}'::text[])))
>          Rows Removed by Filter: 449035
>          Buffers: shared hit=60086 read=11834
>          I/O Timings: read=59.194
> >
>
> You should rewrite the subquery as a UNION to avoid the OR:
>
>   ... WHERE j.countrycode = 'TH'
> and j.facilitycode in
> ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
> and j.jobstartdatetime between '2020-08-01 00:00:00' and
> '2020-09-30 00:00:00'
>
> and
>
>   ... WHERE j.countrycode = 'TH'
> and j.facilitycode in
> ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
> and j.jobstartdatetime IS NULL
>
> These indexes could speed up the resulting query:
>
>   CREATE INDEX ON job (countrycode, facilitycode);
>   CREATE INDEX ON job (countrycode, jobstartdatetime);
>   CREATE INDEX ON job (countrycode, facilitycode) WHERE jobstartdatetime IS NULL;
>

I created the indexes you suggested and changed the query to use the UNION
operator. Please see the explain plan below. Performance of the query has
improved (execution time came down, mostly because I ran VACUUM FULL), but
the cost of the query is still high. This is a Dev environment with 2 vCPU
and 8 GB RAM.

explain (analyze,buffers) with JobCount as (
  (select jobstatuscode, count(1) stat_count from job j
   where 1=1 and j.countrycode = 'TH'
   and j.facilitycode in ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
   and j.jobstartdatetime between '2020-08-01 00:00:00' and '2020-09-30 00:00:00'
   group by j.jobstatuscode)
  UNION
  (select jobstatuscode, count(1) stat_count from job j
   where 1=1 and j.countrycode = 'TH'
   and j.facilitycode in ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
   and j.jobstartdatetime is null
   group by j.jobstatuscode))
select js.jobstatuscode, COALESCE(stat_count,0) stat_count
from JobCount jc right outer join jobstatus js on jc.jobstatuscode = js.jobstatuscode;

                                   QUERY PLAN
-----------------------------------------------------------------------------------------------
 Hash Right Join  (cost=79010.89..79011.19 rows=10 width=12) (actual time=444.241..444.256 rows=10 loops=1)
   Hash Cond: ((jc.jobstatuscode)::text = (js.jobstatuscode)::text)
   Buffers: shared hit=8560
   CTE jobcount
     ->  HashAggregate  (cost=79002.35..79002.48 rows=13 width=24) (actual time=444.211..444.213 rows=6 loops=1)
           Group Key: j.jobstatuscode, (count(1))
           Buffers: shared hit=8558
           ->  Append  (cost=78959.64..79002.28 rows=13 width=24) (actual time=444.081..444.202 rows=6 loops=1)

Re: CPU Consuming query. Sequential scan despite indexing.

2020-10-20 Thread aditya desai
On Mon, Oct 19, 2020 at 9:50 PM Michael Lewis  wrote:

> Reply to the group, not just me please. Btw, when you do reply to the
> group, it is best practice on these lists to reply in-line and not just
> reply on top with all prior messages quoted.
>

Hi Michael,
Please see my inline responses below. I tried all this on a Dev env with 2
vCPU and 8 GB RAM. Still waiting for the PST environment with better
configuration :(

>
> On Sun, Oct 18, 2020 at 3:23 AM aditya desai  wrote:
>
>> I tried vacuum full and execution time came down to half.
>>
> Great to hear.
>
>
>> However, it still consumes CPU. Setting parallel workers per gather to 0
>> did not help much.
>>
> You didn't answer all of my questions, particularly about disabling
> sequential scan. If you still have the default random_page_cost of 4, it
> might be that 1.5 allows better estimates for cost on index (random) vs
> sequential scan of a table.
>

Please see the next inline answer.

>
> Laurenz is a brilliant guy. I would implement the indexes he suggests if
> you don't have them already and report back. If the indexes don't get used,
> try set enable_seqscan = false; before the query and if it is way faster,
> then reduce random_page_cost to maybe 1-2 depending how your overall cache
> hit ratio is across the system.
>

Query plan with enable_seqscan = off and random_page_cost = 1. With this, the
execution time and cost of the query are less than half of those under the
original settings. I also applied the suggestions given by Laurenz: made use
of the UNION operator and created the indexes.

explain (analyze,buffers) with JobCount as (
  (select jobstatuscode, count(1) stat_count from job j
   where 1=1 and j.countrycode = 'TH'
   and j.facilitycode in ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
   and j.jobstartdatetime between '2020-08-01 00:00:00' and '2020-09-30 00:00:00'
   group by j.jobstatuscode)
  UNION
  (select jobstatuscode, count(1) stat_count from job j
   where 1=1 and j.countrycode = 'TH'
   and j.facilitycode in ('THNPM1','THPRK1','THCNT1','THSPN1','THKRI1','THPKN1','THSBI1','THUTG1','THLRI1','THSRI1','THSUR1','THSKM1')
   and j.jobstartdatetime is null
   group by j.jobstatuscode))
select js.jobstatuscode, COALESCE(stat_count,0) stat_count
from JobCount jc right outer join jobstatus js on jc.jobstatuscode = js.jobstatuscode;

                                   QUERY PLAN
-----------------------------------------------------------------------------------------------
 Hash Right Join  (cost=68652.52..68652.76 rows=10 width=12) (actual time=676.477..676.495 rows=10 loops=1)
   Hash Cond: ((jc.jobstatuscode)::text = (js.jobstatuscode)::text)
   Buffers: shared hit=11897
   CTE jobcount
     ->  HashAggregate  (cost=68650.01..68650.11 rows=10 width=24) (actual time=676.451..676.454 rows=8 loops=1)
           Group Key: j.jobstatuscode, (count(1))
           Buffers: shared hit=11895
           ->  Append  (cost=68645.89..68649.96 rows=10 width=24) (actual time=676.346..676.441 rows=8 loops=1)
                 Buffers: shared hit=11895
                 ->  Finalize GroupAggregate  (cost=68645.89..68648.17 rows=9 width=12) (actual time=676.345..676.379 rows=8 loops=1)
                       Group Key: j.jobstatuscode
                       Buffers: shared hit=11889
                       ->  Gather Merge  (cost=68645.89..68647.99 rows=18 width=12) (actual time=676.330..676.403 rows=24 loops=1)
                             Workers Planned: 2
                             Workers Launched: 2
                             Buffers: shared hit=29067 read=1
                             I/O Timings: read=0.038
                             ->  Sort  (cost=67645.87..67645.89 rows=9 width=12) (actual time=669.544..669.548 rows=8 loops=3)
                                   Sort Key: j.jobstatuscode
                                   Sort Method: quicksort  Memory: 25kB
                                   Worker 0:  Sort Method: quicksort  Memory: 25kB
                                   Worker 1:  Sort Method: quicksort  Memory: 25kB
                                   Buffers: shared hit=29067 read=1
                                   I/O Timings: read=0.038
                                   ->  Partial HashAggregate  (cost=67645.63..67645.72 rows=9 width=12) (actual time=669.506..669.511 rows=8 loops=3)
                                         Group Key: j.jobstatuscode
                                         Buffers: shared hit=29051 read=1
                                         I/O Timings: read=0.038
                                         ->  Parallel Index Scan using job_list_test1 on job j  (cost=0.43..66135.88 rows=301950 width=4) (actual time=0.040..442.373 rows=244800 loops=3)
                                               Index Cond: (((countrycode)::text = 'TH'::text) AND ((fac

Re: Query Performance / Planner estimate off

2020-10-20 Thread Victor Yegorov
On Tue, 20 Oct 2020 at 11:38, Mats Julian Olsen wrote:

> I'm looking for some help to manage queries against two large tables.
>

Can you tell the version you're running currently and the output of this
query, please?

select name,setting,source from pg_settings where source not in
('default','override');

-- 
Victor Yegorov


Re: Query Performance / Planner estimate off

2020-10-20 Thread Victor Yegorov
On Tue, 20 Oct 2020 at 11:38, Mats Julian Olsen wrote:

> I'm looking for some help to manage queries against two large tables.
>

Also, can you enable `track_io_timing` (no restart required) and provide
output of `EXPLAIN (analyze, buffers, settings)` for all 3 variants,
please?
(I assume you're on 12+.)
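
For reference, a sketch of gathering that (track_io_timing can be turned on per session by a superuser; the SETTINGS option exists in v12+; the ellipsis stands for the query from the gist):

  SET track_io_timing = on;  -- or ALTER SYSTEM SET ... plus pg_reload_conf()
  EXPLAIN (ANALYZE, BUFFERS, SETTINGS) ...;  -- repeat for all 3 variants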

-- 
Victor Yegorov


Re: Query Performance / Planner estimate off

2020-10-20 Thread Sushant Pawar
Looping in the main group ID.

Regards
Sushant

On Tue, Oct 20, 2020 at 6:49 PM Sushant Pawar  wrote:

>
> On Tue, Oct 20, 2020 at 3:08 PM Mats Julian Olsen 
> wrote:
>
>> Dear Postgres community,
>>
>> I'm looking for some help to manage queries against two large tables.
>>
>> Context:
>> We run a relatively large postgresql instance (5TB, 32 vCPU, 120GB RAM)
>> with a hybrid transactional/analytical workload. Data is written in batches
>> every 15 seconds or so, and all the queryable tables are append-only (we
>> never update or delete). Our users can run analytical queries on top of
>> these tables.
>>
>> We recently came across a series of troublesome queries one of which I'll
>> dive into here.
>>
>> Please see the following gist for both the query we run and the \d+
>> output: https://gist.github.com/mewwts/9f11ae5e6a5951593b8999559f5418cf.
>>
>> The tables in question are:
>> - `ethereum.transactions`: 833M rows, partitioned, 171M rows after WHERE
>> - `uniswap_v2."Pair_evt_Swap": 12M rows, not partitioned, 12M rows after
>> WHERE
>>
>> The crux of our issue is that the query planner chooses a nested loop
>> join for this query. Essentially making this query (and other queries) take
>> a very long time to complete. In contrast, by toggling `enable_nestloop`
>> and `enable_seqscan` off we can take the total runtime down from 16 minutes
>> to 2 minutes.
>>
>> 1) Vanilla plan (16 min) : https://explain.depesz.com/s/NvDR
>> 2) enable_nestloop=off (4 min): https://explain.depesz.com/s/buKK
>> 3) enable_nestloop=off; enable_seqscan=off (2 min):
>> https://explain.depesz.com/s/0WXx
>>
>
> The cost of the query under the default (vanilla) plan is much lower than
> that of the 3rd plan with enable_nestloop and enable_seqscan set to off.
> Since JIT is enabled, it seems the planner selects the plan with the least
> cost, which ends up taking more execution time. Can you try running this
> query with jit=off in the session and see if it selects the plan with the
> least execution time?
>
>>
>> How can I get Postgres not to loop over 12M rows?
>>
>> Let me know if there is anything I left out here that would be useful for
>> further debugging.
>>
>> --
>> Regards
>>
> Sushant
>
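
A minimal sketch of that test (the actual query is in the gist linked above):

  SET jit = off;
  EXPLAIN (ANALYZE, BUFFERS) ...;  -- the join from the gist
  RESET jit;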


Re: Query Performance / Planner estimate off

2020-10-20 Thread Mats Olsen


On 10/20/20 3:04 PM, Victor Yegorov wrote:
On Tue, 20 Oct 2020 at 11:38, Mats Julian Olsen wrote:


I'm looking for some help to manage queries against two large tables.


Can you tell the version you're running currently and the output of 
this query, please?


    select name,setting,source from pg_settings where source not in 
('default','override');


Running "PostgreSQL 12.2 (Ubuntu 12.2-2.pgdg19.10+1) on 
x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1 
20191008, 64-bit"


Updated the gist to include the results from pg_settings. Here's the
direct link:
https://gist.githubusercontent.com/mewwts/9f11ae5e6a5951593b8999559f5418cf/raw/e5dee48680e04570bec4e9a816fa009da34f/pg_settings




Re: Query Performance / Planner estimate off

2020-10-20 Thread Mats Olsen

On 10/20/20 11:37 AM, Mats Julian Olsen wrote:


Dear Postgres community,

I'm looking for some help to manage queries against two large tables.

Context:
We run a relatively large postgresql instance (5TB, 32 vCPU, 120GB 
RAM) with a hybrid transactional/analytical workload. Data is written 
in batches every 15 seconds or so, and all the queryable tables are
append-only (we never update or delete). Our users can run analytical
queries on top of these tables.


We recently came across a series of troublesome queries one of which 
I'll dive into here.


Please see the following gist for both the query we run and the \d+
output:
https://gist.github.com/mewwts/9f11ae5e6a5951593b8999559f5418cf.


The tables in question are:
- `ethereum.transactions`: 833M rows, partitioned, 171M rows after WHERE
- `uniswap_v2."Pair_evt_Swap"`: 12M rows, not partitioned, 12M rows
after WHERE


The query plans I submitted were querying the table
`uniswap_v2."Pair_evt_Mint"`, which has 560k rows before and after WHERE.
Also not partitioned. Apologies for the inconsistency, but as I
mentioned, the same performance problem holds when using
`uniswap_v2."Pair_evt_Swap"` (even worse due to its size).




The crux of our issue is that the query planner chooses a nested loop
join for this query, essentially making this query (and other queries)
take a very long time to complete. In contrast, by toggling
`enable_nestloop` and `enable_seqscan` off we can take the total
runtime down from 16 minutes to 2 minutes.


1) Vanilla plan (16 min): https://explain.depesz.com/s/NvDR
2) enable_nestloop=off (4 min): https://explain.depesz.com/s/buKK
3) enable_nestloop=off; enable_seqscan=off (2 min):
https://explain.depesz.com/s/0WXx

How can I get Postgres not to loop over 12M rows?

Let me know if there is anything I left out here that would be useful 
for further debugging.


--
Mats
CTO @ Dune Analytics
We're hiring: https://careers.duneanalytics.com 



Re: Query Performance / Planner estimate off

2020-10-20 Thread Victor Yegorov
On Tue, 20 Oct 2020 at 16:50, Mats Olsen wrote:

> On 10/20/20 3:04 PM, Victor Yegorov wrote:
>
> On Tue, 20 Oct 2020 at 11:38, Mats Julian Olsen wrote:
>
>> I'm looking for some help to manage queries against two large tables.
>>
>
> Can you tell the version you're running currently and the output of this
> query, please?
>
> select name,setting,source from pg_settings where source not in
> ('default','override');
>
> Running "PostgreSQL 12.2 (Ubuntu 12.2-2.pgdg19.10+1) on
> x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1
> 20191008, 64-bit"
>
> Updated the gist to include the results forom pg_settings. Here's the
> direct link
> https://gist.githubusercontent.com/mewwts/9f11ae5e6a5951593b8999559f5418cf/raw/e5dee48680e04570bec4e9a816fa009da34f/pg_settings
>
It looks like the indexes currently chosen by the planner don't quite fit
your query.

I would create the following index (if it's possible to update schema):
   ON "uniswap_v2.Pair_evt_Mint" (evt_tx_hash, evt_block_time)

Same for the second table; it looks like
  ON "ethereum.transactions" (hash, block_time)
is a better fit for your query. In fact, I do not think the
`transactions_block_number_time` index is used frequently, since the second
column of the index is a partitioning key.
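
As DDL, those suggestions would look roughly like this (index names left to default; the quoting of the mixed-case name follows the \d+ output in the gist):

  -- Not partitioned, so CONCURRENTLY can avoid blocking the 15-second batch writes:
  CREATE INDEX CONCURRENTLY ON uniswap_v2."Pair_evt_Mint" (evt_tx_hash, evt_block_time);

  -- ethereum.transactions is partitioned; v12 does not support CONCURRENTLY
  -- on partitioned tables, so this takes a lock (or build it per partition):
  CREATE INDEX ON ethereum.transactions (hash, block_time);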

Currently the planner wants to go via indexes because you've made random
access really cheap compared to sequential access (and your findings show
this). Perhaps on NVMe disks this could work, but in your case you need to
find the real bottleneck (which is why I asked for buffers).

I would set `random_page_cost` to at least 2.5 with your numbers. Also, I
would check the DB and indexes for bloat (just a guess for now, since your
plans are missing buffers figures).
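
A quick bloat check could look like this, assuming the pgstattuple contrib extension can be installed:

  CREATE EXTENSION IF NOT EXISTS pgstattuple;
  -- High dead_tuple_percent / free_percent would point to bloat:
  SELECT * FROM pgstattuple('uniswap_v2."Pair_evt_Mint"');
  SELECT * FROM pgstattuple_approx('uniswap_v2."Pair_evt_Mint"');  -- cheaper estimate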


-- 
Victor Yegorov


Re: Query Performance / Planner estimate off

2020-10-20 Thread Mats Olsen


On 10/20/20 3:40 PM, Sushant Pawar wrote:

Looping in the main group ID.

Regards
Sushant

On Tue, Oct 20, 2020 at 6:49 PM Sushant Pawar wrote:



On Tue, Oct 20, 2020 at 3:08 PM Mats Julian Olsen <[email protected]> wrote:

Dear Postgres community,

I'm looking for some help to manage queries against two large
tables.

Context:
We run a relatively large postgresql instance (5TB, 32 vCPU,
120GB RAM) with a hybrid transactional/analytical workload.
Data is written in batches every 15 seconds or so, and all the
queryable tables are append-only (we never update or delete).
Our users can run analytical queries on top of these tables.

We recently came across a series of troublesome queries one of
which I'll dive into here.

Please see the following gist for both the query we run and
the \d+ output:
https://gist.github.com/mewwts/9f11ae5e6a5951593b8999559f5418cf.


The tables in question are:
- `ethereum.transactions`: 833M rows, partitioned, 171M rows
after WHERE
- `uniswap_v2."Pair_evt_Swap"`: 12M rows, not partitioned, 12M
rows after WHERE

The crux of our issue is that the query planner chooses a
nested loop join for this query, essentially making this query
(and other queries) take a very long time to complete. In
contrast, by toggling `enable_nestloop` and `enable_seqscan`
off we can take the total runtime down from 16 minutes to 2
minutes.

1) Vanilla plan (16 min): https://explain.depesz.com/s/NvDR
2) enable_nestloop=off (4 min): https://explain.depesz.com/s/buKK
3) enable_nestloop=off; enable_seqscan=off (2 min): https://explain.depesz.com/s/0WXx


The cost of the query under the default (vanilla) plan is much
lower than that of the 3rd plan with enable_nestloop and
enable_seqscan set to off. Since JIT is enabled, it seems the
planner selects the plan with the least cost, which ends up
taking more execution time. Can you try running this query with
jit=off in the session and see if it selects the plan with the
least execution time?

Thank you for your reply. Here's the result using set jit=off; 
https://explain.depesz.com/s/rpKc. It's essentially the same plan as the 
initial one.




How can I get Postgres not to loop over 12M rows?

Let me know if there is anything I left out here that would be
useful for further debugging.

-- 
Regards


    Sushant



Re: Query Performance / Planner estimate off

2020-10-20 Thread Mats Olsen


On 10/20/20 3:22 PM, Victor Yegorov wrote:
On Tue, 20 Oct 2020 at 11:38, Mats Julian Olsen wrote:


I'm looking for some help to manage queries against two large tables.


Also, can you enable `track_io_timing` (no restart required) and 
provide output of `EXPLAIN (analyze, buffers, settings)` for all 3 
variants, please?

(I assume you're on 12+.)


Thanks! Yes on 12.2. Here's the output:

vanilla: https://explain.depesz.com/s/Ktrd

set enable_nestloop=off: https://explain.depesz.com/s/mvSD

set enable_nestloop=off; set enable_seqscan=off: 
https://explain.depesz.com/s/XIDo


Are these helpful?



--
Victor Yegorov


Re: Query Performance / Planner estimate off

2020-10-20 Thread Mats Olsen


On 10/20/20 6:51 PM, Victor Yegorov wrote:
On Tue, 20 Oct 2020 at 16:50, Mats Olsen wrote:


On 10/20/20 3:04 PM, Victor Yegorov wrote:


On Tue, 20 Oct 2020 at 11:38, Mats Julian Olsen <[email protected]> wrote:

I'm looking for some help to manage queries against two large
tables.


Can you tell the version you're running currently and the output
of this query, please?

    select name,setting,source from pg_settings where source not
in ('default','override');


Running "PostgreSQL 12.2 (Ubuntu 12.2-2.pgdg19.10+1) on
x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1
20191008, 64-bit"

Updated the gist to include the results forom pg_settings. Here's
the direct link

https://gist.githubusercontent.com/mewwts/9f11ae5e6a5951593b8999559f5418cf/raw/e5dee48680e04570bec4e9a816fa009da34f/pg_settings



It looks like indexes currently chosen by the planner don't quite fit 
your query.


I would create the following index (if it's possible to update schema):
   ON "uniswap_v2.Pair_evt_Mint" (evt_tx_hash, evt_block_time)

I'll try to add this.


Same for the second table, looks like
  ON "ethereum.transactions" (hash, block_time)
is a better fit for your query. In fact, I do not think 
`transactions_block_number_time` index is used frequently, 'cos second 
column of the index is a partitioning key.
I'll see if I can add it. This table is huge so normally we only make 
changes to these when we redeploy the database.


Currently planner wants to go via indexes 'cos you've made random 
access really cheap compared to sequential one (and your findings 
shows this).
Perhaps on a NVMe disks this could work, but in your case you need to 
find the real bottleneck (therefore I asked for buffers).


I would set `random_page_cost` to a 2.5 at least with your numbers. 
Also, I would check DB and indexes for bloat (just a guess now, 'cos 
your plans miss buffers figures)


Yeah, 1.1 seems way too low.

Here's the output of the explain (analyze, buffers, settings) you asked for:

vanilla: https://explain.depesz.com/s/Ktrd

set enable_nestloop=off: https://explain.depesz.com/s/mvSD

set enable_nestloop=off; set enable_seqscan=off: 
https://explain.depesz.com/s/XIDo






--
Victor Yegorov