Re: When to use PARTITION BY HASH?

2020-06-03 Thread Oleksandr Shulgin
On Tue, Jun 2, 2020 at 7:33 PM Justin Pryzby  wrote:

> > To: [email protected],
> [email protected]
>
> Please don't cross post to multiple lists.
>
> On Tue, Jun 02, 2020 at 07:17:11PM +0200, Oleksandr Shulgin wrote:
> > I was reading up on declarative partitioning[1] and I'm not sure what
> could
> > be a possible application of Hash partitioning.
>
> It's a good question.  See Tom's complaint here.
> https://www.postgresql.org/message-id/31605.1586112900%40sss.pgh.pa.us
>
> It *does* provide the benefit of smaller indexes and smaller tables, which
> might allow seq scans to outperform index scans.
>
> It's maybe only useful for equality conditions on the partition key, and not
> for ranges.  Here, it scans a single partition:
>
> postgres=# CREATE TABLE t(i int) PARTITION BY HASH(i);
> postgres=# CREATE TABLE t1 PARTITION OF t FOR VALUES WITH (MODULUS 3, REMAINDER 0);
> postgres=# CREATE TABLE t2 PARTITION OF t FOR VALUES WITH (MODULUS 3, REMAINDER 1);
> postgres=# CREATE TABLE t3 PARTITION OF t FOR VALUES WITH (MODULUS 3, REMAINDER 2);
> postgres=# INSERT INTO t SELECT i%9 FROM generate_series(1,)i; ANALYZE t;
> postgres=# explain analyze SELECT * FROM t WHERE i=3;
>  Seq Scan on t2  (cost=0.00..75.55 rows= width=4) (actual time=0.021..0.518 rows= loops=1)
>    Filter: (i = 3)
>    Rows Removed by Filter: 
>

I see.  So it works well with low cardinality in the partitioning column.
With high cardinality, I guess an index scan on an unpartitioned table would
be preferable.

The documentation page I've linked only contains examples around
partitioning BY RANGE.  I believe it'd be helpful to extend it with some
meaningful examples for LIST and HASH partitioning.
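
Something along these lines, perhaps (a rough sketch, with made-up tables):

CREATE TABLE customers (id int, country text) PARTITION BY LIST (country);
CREATE TABLE customers_eu PARTITION OF customers FOR VALUES IN ('DE', 'FR');
CREATE TABLE customers_other PARTITION OF customers DEFAULT;

CREATE TABLE orders (id int, customer_id int) PARTITION BY HASH (customer_id);
CREATE TABLE orders_0 PARTITION OF orders FOR VALUES WITH (MODULUS 3, REMAINDER 0);
CREATE TABLE orders_1 PARTITION OF orders FOR VALUES WITH (MODULUS 3, REMAINDER 1);
CREATE TABLE orders_2 PARTITION OF orders FOR VALUES WITH (MODULUS 3, REMAINDER 2);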

Regards,
-- 
Alex


Re: When to use PARTITION BY HASH?

2020-06-03 Thread Justin Pryzby
On Wed, Jun 03, 2020 at 09:45:48AM +0200, Oleksandr Shulgin wrote:
> I see.  So it works well with low cardinality in the partitioning column.
> With high cardinality, I guess an index scan on an unpartitioned table would
> be preferable.
> 
> The documentation page I've linked only contains examples around
> partitioning BY RANGE.  I believe it'd be helpful to extend it with some
> meaningful examples for LIST and HASH partitioning.

I agree.  I think it would also be useful to mention the "benefits" which
aren't likely to apply to hash partitioning.

Would you want to propose an example to include?
Eventually it needs to be submitted as a patch to -hackers.

-- 
Justin




increased max_parallel_workers_per_gather results in fewer workers?

2020-06-03 Thread Philip Semanchuk
Hi all,
I’ve been experimenting with some performance tuning on a particular query, and 
I observed a result that I don’t understand. 

I’ve been setting max_parallel_workers_per_gather to values in the range 1-6
and then running EXPLAIN ANALYZE to see how much benefit we get from more
parallelization. My data is organized by year, so the year is a parameter in
the query’s WHERE clause.
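
Each trial looked roughly like this ("facts" here is just a stand-in for our
real table and query):

SET max_parallel_workers_per_gather = 4;  -- varied from 1 to 6
EXPLAIN (ANALYZE) SELECT count(*) FROM facts WHERE year = 2018;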

For my 2018 data, Postgres launches as many workers as 
max_parallel_workers_per_gather permits, and the execution time decreases 
nicely, from 280 seconds with 1 worker all the way down to 141s with 6 workers. 
So far, so good.

When I run the same query for our 2022 data, I get the same behavior
(improvement) for max_parallel_workers_per_gather values of 1-4. But with
max_parallel_workers_per_gather set to 5 or 6, Postgres only uses 1 worker, and
the execution time increases dramatically, becoming even worse than when I
deliberately limit the number of workers to 1:

- max_parallel_workers_per_gather=1, runtime = 1061s
- max_parallel_workers_per_gather=2, runtime = 770s
- max_parallel_workers_per_gather=3, runtime = 637s
- max_parallel_workers_per_gather=4, runtime = 573s
- max_parallel_workers_per_gather=5, runtime = 1468s
- max_parallel_workers_per_gather=6, runtime = 1469s

Our 2022 data set is several times larger than our 2018 data, so I suspect some
resource is getting exhausted, but I’m not sure what. So far, this result has
been 100% reproducible. I’m on a dedicated test server with 16 virtual CPUs and
128 GB of RAM; no one else is competing with me for Postgres processes.
max_worker_processes and max_parallel_workers are both set to 12.
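
For reference, all of the parallel-related settings can be listed in one go
from the standard pg_settings view:

SELECT name, setting FROM pg_settings WHERE name LIKE '%parallel%';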

Can anyone help me understand why this happens, or where I might look for 
clues? 

Thanks,
Philip



Re: increased max_parallel_workers_per_gather results in fewer workers?

2020-06-03 Thread Justin Pryzby
On Wed, Jun 03, 2020 at 04:04:13PM -0400, Philip Semanchuk wrote:
> Can anyone help me understand why this happens, or where I might look for 
> clues? 

What version of postgres?

Can you reproduce if you do:
ALTER SYSTEM SET max_parallel_workers_per_gather=0; SELECT pg_reload_conf();
.. and then within the session do: SET max_parallel_workers_per_gather=12;

I guess you should show an explain analyze, specifically "Workers
Planned/Launched", maybe by linking to explain.depesz.com

-- 
Justin




Re: increased max_parallel_workers_per_gather results in fewer workers?

2020-06-03 Thread Philip Semanchuk



> On Jun 3, 2020, at 5:15 PM, Justin Pryzby  wrote:
> 
> On Wed, Jun 03, 2020 at 04:04:13PM -0400, Philip Semanchuk wrote:
>> Can anyone help me understand why this happens, or where I might look for 
>> clues? 
> 
> What version of postgres?

Sorry, I should have posted that in my initial email.

select version();
+------------------------------------------------------------------------------+
| version                                                                      |
|------------------------------------------------------------------------------|
| PostgreSQL 11.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.9.3, 64-bit  |
+------------------------------------------------------------------------------+

This is AWS’ version of Postgres 11.6 (“Aurora”) which of course might make a 
difference.


> Can you reproduce if you do:
> ALTER SYSTEM SET max_parallel_workers_per_gather=0; SELECT pg_reload_conf();
> .. and then within the session do: SET max_parallel_workers_per_gather=12;

Unfortunately under Aurora I’m not superuser so I can’t run ALTER SYSTEM, but I 
can change the config via AWS’ config interface, so I set 
max_parallel_workers_per_gather=0 there.

show max_parallel_workers_per_gather
+----------------------------------+
| max_parallel_workers_per_gather  |
|----------------------------------|
| 0                                |
+----------------------------------+
SHOW
Time: 0.034s
postgres@philip-2020-05-19-cluster:wylan>
SET max_parallel_workers_per_gather=12
SET
Time: 0.028s
postgres@philip-2020-05-19-cluster:wylan>
show max_parallel_workers_per_gather
+----------------------------------+
| max_parallel_workers_per_gather  |
|----------------------------------|
| 12                               |
+----------------------------------+
SHOW

I then ran the EXPLAIN ANALYZE again and got the same slow runtime (1473s),
with 1 worker reported in the output.


> I guess you should show an explain analyze, specifically "Workers
> Planned/Launched", maybe by linking to explain.depesz.com

Out of an abundance of caution, our company has a policy of not pasting our 
plans to public servers. However, I can confirm that when I set 
max_parallel_workers_per_gather > 4 and the runtime increases, this is what’s 
in the EXPLAIN ANALYZE output:

 Workers Planned: 1
 Workers Launched: 1

FWIW, the Planning Time reported in EXPLAIN ANALYZE output doesn’t vary 
significantly, only from 411-443ms, and the variation within that range 
correlates only very weakly with max_parallel_workers_per_gather.


thank you 
Philip






Re: increased max_parallel_workers_per_gather results in fewer workers?

2020-06-03 Thread Justin Pryzby
On Wed, Jun 03, 2020 at 06:23:57PM -0400, Philip Semanchuk wrote:
> > On Jun 3, 2020, at 5:15 PM, Justin Pryzby  wrote:
> > What version of postgres?
> 
> This is AWS’ version of Postgres 11.6 (“Aurora”) which of course might make a 
> difference.

> > I guess you should show an explain analyze, specifically "Workers
> > Planned/Launched", maybe by linking to explain.depesz.com
> 
> Out of an abundance of caution, our company has a policy of not pasting our 
> plans to public servers. However, I can confirm that when I set 
> max_parallel_workers_per_gather > 4 and the runtime increases, this is what’s 
> in the EXPLAIN ANALYZE output:
> 
>  Workers Planned: 1
>  Workers Launched: 1

Are you referring to a parallel scan, aggregate, hash, or something else?

Are you able to show a plan for a toy query like SELECT count(col) FROM tbl,
preferably including a CREATE TABLE tbl AS ...; VACUUM ANALYZE tbl;
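
For example, something self-contained like this (names and sizes made up):

CREATE TABLE tbl AS SELECT generate_series(1, 10000000) AS col;
VACUUM ANALYZE tbl;
SET max_parallel_workers_per_gather = 6;
EXPLAIN (ANALYZE) SELECT count(col) FROM tbl;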

Are you able to reproduce with an unpatched postgres ?

-- 
Justin




Re: increased max_parallel_workers_per_gather results in fewer workers?

2020-06-03 Thread Sebastian Dressler
Hi Philip,

> On 4. Jun 2020, at 00:23, Philip Semanchuk  
> wrote:
> 
>> I guess you should show an explain analyze, specifically "Workers
>> Planned/Launched", maybe by linking to explain.depesz.com
> 
> Out of an abundance of caution, our company has a policy of not pasting our 
> plans to public servers. However, I can confirm that when I set 
> max_parallel_workers_per_gather > 4 and the runtime increases, this is what’s 
> in the EXPLAIN ANALYZE output:
> 
> Workers Planned: 1
> Workers Launched: 1

Can you please verify the values of max_parallel_workers and
max_worker_processes? It should be roughly max_worker_processes >
max_parallel_workers > max_parallel_workers_per_gather, for instance:

max_worker_processes = 24
max_parallel_workers = 18
max_parallel_workers_per_gather = 6

Also, there are more configuration settings related to parallel queries you 
might want to look into. Most notably:

parallel_setup_cost
parallel_tuple_cost
min_parallel_table_scan_size

The last one especially is a typical dealbreaker; you can try setting it to 0
to begin with. Good starting points for the others are 500 and 0.1,
respectively.
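
For a quick session-level experiment, that could look like this (starting
values to test with, not production recommendations):

SET parallel_setup_cost = 500;
SET parallel_tuple_cost = 0.1;
SET min_parallel_table_scan_size = 0;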

> FWIW, the Planning Time reported in EXPLAIN ANALYZE output doesn’t vary 
> significantly, only from 411-443ms, and the variation within that range 
> correlates only very weakly with max_parallel_workers_per_gather.


It can happen that more parallelism does not help a query but instead slows it
down beyond a certain number of parallel workers. You can see this in EXPLAIN
when there is, for instance, a Bitmap Heap Scan or similar involved.

Cheers,
Sebastian

Re: increased max_parallel_workers_per_gather results in fewer workers?

2020-06-03 Thread Luis Carril
Hi,
   On top of the settings that Sebastian suggested, you can also try disabling
the participation of the leader (i.e. the main backend process for your
connection) in the distribution of the parallel workload:

SET parallel_leader_participation TO false

  Depending on your workload, the leader could become saturated if it has to
do its share of the workload and also aggregate the results of all the workers.

Cheers
Luis