Pg10 : Client Configuration for Parallelism ?

2019-04-16 Thread laurent.dechambe
Hi,

I am working on PostgreSQL 10.5 and I have a discrepancy between clients 
regarding parallelism feature.

For a simple query (say a simple SELECT COUNT(*) FROM BIG_TABLE), I can see 
PostgreSQL use parallelism when the query is launched from psql or PgAdmin4. 
However the same query launched with DBeaver (ie connected through JDBC) does 
not use parallelism. 

SELECT current_setting('max_parallel_workers_per_gather')  gives 10 from my 
session.

Is there a client configuration that prevents from using parallelism ?

Thanks.

Laurent

_

Ce message et ses pieces jointes peuvent contenir des informations 
confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce 
message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages 
electroniques etant susceptibles d'alteration,
Orange decline toute responsabilite si ce message a ete altere, deforme ou 
falsifie. Merci.

This message and its attachments may contain confidential or privileged 
information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete 
this message and its attachments.
As emails may be altered, Orange is not liable for messages that have been 
modified, changed or falsified.
Thank you.



RE: Pg10 : Client Configuration for Parallelism ?

2019-04-17 Thread laurent.dechambe
Thanks for the tip. I have compared all settings and they are identical.

Very strange.

-Message d'origine-
De : Andreas Kretschmer [mailto:[email protected]] 
Envoyé : mercredi 17 avril 2019 10:07
À : [email protected]
Objet : Re: Pg10 : Client Configuration for Parallelism ?



Am 17.04.19 um 08:30 schrieb [email protected]:
> SELECT current_setting('max_parallel_workers_per_gather')  gives 10 from my 
> session.
>
> Is there a client configuration that prevents from using parallelism ?
unlikely.

if i were you, i would compare all settings, using the different client 
software. (show all, and compare)



Regards, Andreas

-- 
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com




_

Ce message et ses pieces jointes peuvent contenir des informations 
confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce 
message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages 
electroniques etant susceptibles d'alteration,
Orange decline toute responsabilite si ce message a ete altere, deforme ou 
falsifie. Merci.

This message and its attachments may contain confidential or privileged 
information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete 
this message and its attachments.
As emails may be altered, Orange is not liable for messages that have been 
modified, changed or falsified.
Thank you.



RE: Pg10 : Client Configuration for Parallelism ?

2019-04-17 Thread laurent.dechambe
As answered to Andreas Kretschmer all settings are identical.

I have made some other tests, even testing a basic jdbc program (open 
connection, execute statement, display result, close connection)

Here are the logs (with log_error_verbosity = verbose) :


2019-04-17 11:30:42 CEST;35895;thedbuser;thedb;0;LOG:  0: execute 
: SELECT COUNT(1) FROM big_table
2019-04-17 11:30:42 CEST;35895;thedbuser;thedb;0;LOCATION:  
exec_execute_message, postgres.c:1959
2019-04-17 11:31:08 CEST;35895;thedbuser;thedb;0;LOG:  0: duration: 
25950.908 ms
2019-04-17 11:31:08 CEST;35895;thedbuser;thedb;0;LOCATION:  
exec_execute_message, postgres.c:2031


2019-04-17 11:31:20 CEST;37257;thedbuser;thedb;0;LOG:  0: execute 
: SELECT COUNT(1) FROM big_table
2019-04-17 11:31:20 CEST;37257;thedbuser;thedb;0;LOCATION:  
exec_execute_message, postgres.c:1959
2019-04-17 11:31:32 CEST;37257;thedbuser;thedb;0;LOG:  0: duration: 
11459.943 ms
2019-04-17 11:31:32 CEST;37257;thedbuser;thedb;0;LOCATION:  
exec_execute_message, postgres.c:2031


2019-04-17 11:32:56 CEST;37324;thedbuser;thedb;0;LOG:  0: statement: 
SELECT COUNT(1) FROM big_table;
2019-04-17 11:32:56 CEST;37324;thedbuser;thedb;0;LOCATION:  
exec_simple_query, postgres.c:940
2019-04-17 11:33:08 CEST;37324;thedbuser;thedb;0;LOG:  0: duration: 
11334.677 ms
2019-04-17 11:33:08 CEST;37313;thedbuser;thedb;0;LOG:  0: statement: 
SELECT oid, format_type(oid, NULL) AS typname FROM pg_type WHERE oid IN (20) 
ORDER BY oid;
2019-04-17 11:33:08 CEST;37313;thedbuser;thedb;0;LOCATION:  
exec_simple_query, postgres.c:940
2019-04-17 11:33:08 CEST;37313;thedbuser;thedb;0;LOG:  0: duration: 
0.900 ms
2019-04-17 11:33:08 CEST;37313;thedbuser;thedb;0;LOCATION:  
exec_simple_query, postgres.c:1170

I don’t see any difference a part from the query duration.  Note that while 
monitoring the server I saw that there was parallelism with JDBC program and 
PGAdmin4, but not with Dbeaver. And the JDBC driver is the same in both “Basic 
JDBC” and DBeaver.

Regards.

Laurent.



De : Andreas Joseph Krogh [mailto:[email protected]]
Envoyé : mercredi 17 avril 2019 11:08
À : [email protected]
Objet : Sv: Pg10 : Client Configuration for Parallelism ?

På onsdag 17. april 2019 kl. 08:30:28, skrev 
mailto:[email protected]>>:
Hi,

I am working on PostgreSQL 10.5 and I have a discrepancy between clients 
regarding parallelism feature.

For a simple query (say a simple SELECT COUNT(*) FROM BIG_TABLE), I can see 
PostgreSQL use parallelism when the query is launched from psql or PgAdmin4. 
However the same query launched with DBeaver (ie connected through JDBC) does 
not use parallelism.

SELECT current_setting('max_parallel_workers_per_gather')  gives 10 from my 
session.

Is there a client configuration that prevents from using parallelism ?

Thanks.

Laurent

Set in postgresql.conf:

log_statement = 'all'

reload settings and check the logs for what statemets are acutally issued.

--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
[email protected]
www.visena.com
[cid:[email protected]]


_

Ce message et ses pieces jointes peuvent contenir des informations 
confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce 
message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages 
electroniques etant susceptibles d'alteration,
Orange decline toute responsabilite si ce message a ete altere, deforme ou 
falsifie. Merci.

This message and its attachments may contain confidential or privileged 
information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete 
this message and its attachments.
As emails may be altered, Orange is not liable for messages that have been 
modified, changed or falsified.
Thank you.



RE: Pg10 : Client Configuration for Parallelism ?

2019-04-17 Thread laurent.dechambe
I can see whether there is parallelism with pg_top or barely top on the server. 


   PID USER  PR  NIVIRTRESSHR S  %CPU %MEM TIME+ COMMAND
 38584 postgres  20   0 8863828 8.153g 8.151g R 100.0  3.2   1:23.01 postgres
10 root  20   0   0  0  0 S   0.3  0.0  88:07.26 rcu_sched


   PID USER  PR  NIVIRTRESSHR S  %CPU %MEM TIME+ COMMAND
 46687 postgres  20   0 8864620 0.978g 0.977g S  38.5  0.4   0:01.16 postgres
 46689 postgres  20   0 8864348 996.4m 995.1m R  38.5  0.4   0:01.16 postgres
 46690 postgres  20   0 8864348 987.2m 985.8m S  38.5  0.4   0:01.16 postgres
 46691 postgres  20   0 8864348 998436 997084 R  38.5  0.4   0:01.16 postgres
 46692 postgres  20   0 8864348 982612 981260 S  38.5  0.4   0:01.16 postgres
 46693 postgres  20   0 8864348 979.9m 978.6m R  38.5  0.4   0:01.16 postgres
 46694 postgres  20   0 8864348 987.9m 986.6m S  38.5  0.4   0:01.16 postgres
 46696 postgres  20   0 8864348 996864 995512 S  38.5  0.4   0:01.16 postgres
 46688 postgres  20   0 8864348 982.3m 981.0m R  38.2  0.4   0:01.15 postgres
 46695 postgres  20   0 8864348 986.9m 985.6m S  38.2  0.4   0:01.15 postgres
 21323 postgres  20   0 8862788 8.096g 8.095g S   0.7  3.2   2:24.75 postgres
 46682 postgres  20   0  157996   2596   1548 R   0.7  0.0   0:00.05 top

This is not a matter of cache. If I execute the queries in a different order 
the result will be the same : DBeaver query is longer.

There is something in documentation that says that there won't be parallelism 
if " The client sends an Execute message with a non-zero fetch count."
I am not sure what this sentence means. 

-Message d'origine-
De : Andreas Kretschmer [mailto:[email protected]] 
Envoyé : mercredi 17 avril 2019 12:39
À : [email protected]
Objet : Re: Pg10 : Client Configuration for Parallelism ?



Am 17.04.19 um 11:51 schrieb [email protected]:
>
> Here are the logs (with log_error_verbosity = verbose) :
>
> 
>
> 2019-04-17 11:30:42 CEST;35895;thedbuser;thedb;0;LOG:  0: 
> execute : SELECT COUNT(1) FROM big_table
>
> 2019-04-17 11:30:42 CEST;35895;thedbuser;thedb;0;LOCATION: 
> exec_execute_message, postgres.c:1959
>
> 2019-04-17 11:31:08 CEST;35895;thedbuser;thedb;0;LOG:  0: 
> duration: 25950.908 ms
>
> 
>
> 2019-04-17 11:31:20 CEST;37257;thedbuser;thedb;0;LOG:  0: 
> execute : SELECT COUNT(1) FROM big_table
>
> 2019-04-17 11:31:20 CEST;37257;thedbuser;thedb;0;LOCATION: 
> exec_execute_message, postgres.c:1959
>
> 2019-04-17 11:31:32 CEST;37257;thedbuser;thedb;0;LOG:  0: 
> duration: 11459.943 ms
>
>
> 
>
> 2019-04-17 11:32:56 CEST;37324;thedbuser;thedb;0;LOG:  0: 
> statement: SELECT COUNT(1) FROM big_table;
>
> 2019-04-17 11:32:56 CEST;37324;thedbuser;thedb;0;LOCATION:  
> exec_simple_query, postgres.c:940
>
> 2019-04-17 11:33:08 CEST;37324;thedbuser;thedb;0;LOG:  0: 
> duration: 11334.677 ms
>
>

That's compareable. The first one took more time, cold cache. The 2nd 
and 3rd are faster, warm cache.

But: we can't see if the execution is paralell or not. If you want to 
know that, install and use auto_explain.


Regards, Andreas



-- 
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com




_

Ce message et ses pieces jointes peuvent contenir des informations 
confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce 
message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages 
electroniques etant susceptibles d'alteration,
Orange decline toute responsabilite si ce message a ete altere, deforme ou 
falsifie. Merci.

This message and its attachments may contain confidential or privileged 
information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete 
this message and its attachments.
As emails may be altered, Orange is not liable for messages that have been 
modified, changed or falsified.
Thank you.



RE: Pg10 : Client Configuration for Parallelism ?

2019-04-17 Thread laurent.dechambe
Auto explain shows that in both cases there are workers planned, but with 
DBeaver they are not launched.

Here's what I get with auto_explain : 

 
2019-04-17 14:46:09 CEST;54882;thedbuser;thedb;0;LOG:  0: duration: 
0.095 ms
2019-04-17 14:46:09 CEST;54882;thedbuser;thedb;0;LOCATION:  
exec_parse_message, postgres.c:1433
2019-04-17 14:46:09 CEST;54882;thedbuser;thedb;0;LOG:  0: duration: 
0.191 ms
2019-04-17 14:46:09 CEST;54882;thedbuser;thedb;0;LOCATION:  
exec_bind_message, postgres.c:1813
2019-04-17 14:46:09 CEST;54882;thedbuser;thedb;0;LOG:  0: execute 
: SELECT COUNT(1) FROM big_table
2019-04-17 14:46:09 CEST;54882;thedbuser;thedb;0;LOCATION:  
exec_execute_message, postgres.c:1959
2019-04-17 14:46:45 CEST;54882;thedbuser;thedb;0;LOG:  0: duration: 
35842.146 ms
2019-04-17 14:46:45 CEST;54882;thedbuser;thedb;0;LOCATION:  
exec_execute_message, postgres.c:2031
2019-04-17 14:46:45 CEST;54882;thedbuser;thedb;0;LOG:  0: duration: 
35842.110 ms  plan:
Query Text: SELECT COUNT(1) FROM big_table
Finalize Aggregate  (cost=3081157.61..3081157.62 rows=1 width=8) 
(actual time=35842.072..35842.072 rows=1 loops=1)
  Output: count(1)
  ->  Gather  (cost=3081156.68..3081157.59 rows=9 width=8) (actual 
time=35842.062..35842.062 rows=1 loops=1)
Output: (PARTIAL count(1))
Workers Planned: 9
Workers Launched: 0
->  Partial Aggregate  (cost=3080156.68..3080156.69 rows=1 
width=8) (actual time=35842.060..35842.060 rows=1 loops=1)
  Output: PARTIAL count(1)
  ->  Parallel Index Only Scan using idx_big_table__inact 
on big_table  (cost=0.57..3029148.07 rows=20403444 width=0) (actual 
time=0.036..24038.340 rows=183778867 loops=1)
Heap Fetches: 57043846
2019-04-17 14:46:45 CEST;54882;thedbuser;thedb;0;LOCATION:  
explain_ExecutorEnd, auto_explain.c:359


2019-04-17 14:47:39 CEST;55222;thedbuser;thedb;0;LOCATION:  
exec_parse_message, postgres.c:1433
2019-04-17 14:47:39 CEST;55222;thedbuser;thedb;0;LOG:  0: duration: 
2.077 ms
2019-04-17 14:47:39 CEST;55222;thedbuser;thedb;0;LOCATION:  
exec_bind_message, postgres.c:1813
2019-04-17 14:47:39 CEST;55222;thedbuser;thedb;0;LOG:  0: execute 
: SELECT COUNT(1) FROM big_table
2019-04-17 14:47:39 CEST;55222;thedbuser;thedb;0;LOCATION:  
exec_execute_message, postgres.c:1959
2019-04-17 14:47:50 CEST;55235;;;0;LOG:  0: duration: 11317.118 ms  
plan:
Query Text: SELECT COUNT(1) FROM big_table
Partial Aggregate  (cost=3080156.68..3080156.69 rows=1 width=8) (actual 
time=11317.095..11317.095 rows=1 loops=1)
  Output: PARTIAL count(1)
  ->  Parallel Index Only Scan using idx_big_table__inact on big_table  
(cost=0.57..3029148.07 rows=20403444 width=0) (actual time=0.135..10036.104 
rows=18161056 loops=1)
Heap Fetches: 5569541
2019-04-17 14:47:50 CEST;55235;;;0;LOCATION:  explain_ExecutorEnd, 
auto_explain.c:359
2019-04-17 14:47:50 CEST;55236;;;0;LOG:  0: duration: 11316.071 ms  
plan:
Query Text: SELECT COUNT(1) FROM big_table
Partial Aggregate  (cost=3080156.68..3080156.69 rows=1 width=8) (actual 
time=11316.043..11316.043 rows=1 loops=1)
  Output: PARTIAL count(1)
  ->  Parallel Index Only Scan using idx_big_table__inact on big_table  
(cost=0.57..3029148.07 rows=20403444 width=0) (actual time=0.171..1.782 
rows=18377525 loops=1)
Heap Fetches: 5735254
2019-04-17 14:47:50 CEST;55236;;;0;LOCATION:  explain_ExecutorEnd, 
auto_explain.c:359
2019-04-17 14:47:50 CEST;55237;;;0;LOG:  0: duration: 11315.871 ms  
plan:
Query Text: SELECT COUNT(1) FROM big_table
Partial Aggregate  (cost=3080156.68..3080156.69 rows=1 width=8) (actual 
time=11315.851..11315.852 rows=1 loops=1)
  Output: PARTIAL count(1)
  ->  Parallel Index Only Scan using idx_big_table__inact on big_table  
(cost=0.57..3029148.07 rows=20403444 width=0) (actual time=0.140..10042.102 
rows=18082389 loops=1)
Heap Fetches: 5579176
2019-04-17 14:47:50 CEST;55237;;;0;LOCATION:  explain_ExecutorEnd, 
auto_explain.c:359
2019-04-17 14:47:50 CEST;55232;;;0;LOG:  0: duration: 11317.573 ms  
plan:
Query Text: SELECT COUNT(1) FROM big_table
Partial Aggregate  (cost=3080156.68..3080156.69 rows=1 width=8) (actual 
time=11317.553..11317.553 rows=1 loops=1)
  Output: PARTIAL count(1)
  ->  Parallel Index Only Scan using idx_big_table__inact on big_table  
(cost=0.57..3029148.07 rows=20403444 width=0) (actual time=0.115..10047.908 
rows=18732838 loops=1)
Heap Fetches: 5849965
2019-04-17 14:47:50 CEST;55232;;;0;LOCATION:  explain_ExecutorEnd, 
auto_explain.c:359
2019-04-17 14:47:50 CEST;55234;;;0;LOG:  0: duration: 11317.221 ms  
plan:
Query Text:

RE: Pg10 : Client Configuration for Parallelism ?

2019-04-17 Thread laurent.dechambe
Hello Justin and thank you for your clues.

Finally I found that putting blank to the option that limits the number of rows 
to retrieve (which is normal for this kind of tool) allows PostgreSQL to 
parallelize the query.

On jdbc it seems this is equivalent to write :
statement. setMaxRows(0);  // parallelism authorized, which is the default.

Thus on my jdbc basic program if I add :
statement. setMaxRows(100);  // No parallelism allowed (at least in Pg10)

Thanks to all who were kind enough to help.

Laurent

-Message d'origine-
De : Justin Pryzby [mailto:[email protected]] 
Envoyé : mercredi 17 avril 2019 15:57
À : DECHAMBE Laurent DTSI/DSI
Cc : Andreas Joseph Krogh; [email protected]
Objet : Re: Pg10 : Client Configuration for Parallelism ?

On Wed, Apr 17, 2019 at 09:51:02AM +, [email protected] wrote:
> 
> 2019-04-17 11:30:42 CEST;35895;thedbuser;thedb;0;LOG:  0: execute 
> : SELECT COUNT(1) FROM big_table
> 2019-04-17 11:30:42 CEST;35895;thedbuser;thedb;0;LOCATION:  
> exec_execute_message, postgres.c:1959

"execute" means it's using the extended protocol.
https://www.postgresql.org/docs/11/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY

> 
> 2019-04-17 11:31:20 CEST;37257;thedbuser;thedb;0;LOG:  0: execute 
> : SELECT COUNT(1) FROM big_table
> 2019-04-17 11:31:20 CEST;37257;thedbuser;thedb;0;LOCATION:  
> exec_execute_message, postgres.c:1959

Same.

> 
> 2019-04-17 11:32:56 CEST;37324;thedbuser;thedb;0;LOG:  0: statement: 
> SELECT COUNT(1) FROM big_table;
> 2019-04-17 11:32:56 CEST;37324;thedbuser;thedb;0;LOCATION:  
> exec_simple_query, postgres.c:940

This is a "simple query", not using the "extended protocol".

On Wed, Apr 17, 2019 at 11:26:07AM +, [email protected] wrote:
> There is something in documentation that says that there won't be parallelism 
> if " The client sends an Execute message with a non-zero fetch count."
> I am not sure what this sentence means. 

This is likely the cause of the difference.

Could you run wireshark to watch the protocol traffic ?

I think it'll show that dbeaver is retrieving a portion of the result set.

Justin

_

Ce message et ses pieces jointes peuvent contenir des informations 
confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce 
message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages 
electroniques etant susceptibles d'alteration,
Orange decline toute responsabilite si ce message a ete altere, deforme ou 
falsifie. Merci.

This message and its attachments may contain confidential or privileged 
information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete 
this message and its attachments.
As emails may be altered, Orange is not liable for messages that have been 
modified, changed or falsified.
Thank you.