performance degredation after upgrade from 9.6 to 12

2019-11-24 Thread Mariel Cherkassky
Hey all,
I'm testing performance of two identical machines one in 9.6 and the second
one is in 12. The second machine is a clone of the first one + db upgrade
to 12 beta 3 (Yes I'm aware 12.1 was released).

machine stats :
32gb ram
8 cpu
regular hd (not ssd)

my postgresql.confg settings:

max_wal_size = 2GB
min_wal_size = 1GB
wal_buffers = 16MB
checkpoint_completion_target = 0.9
checkpoint_timeout = 30min
log_checkpoints = on
log_lock_waits = on
log_temp_files = 1024
log_min_duration_statement = 1000
log_autovacuum_min_duration = 5000
autovacuum_max_workers = 4
autovacuum_vacuum_cost_limit = 800
autovacuum_vacuum_cost_delay = 10ms
standard_conforming_strings = off
max_locks_per_transaction = 5000
max_connections = 500
log_line_prefix = '%t %d %p  '
random_page_cost = 2.0
deadlock_timeout = 5s
shared_preload_libraries = 'pg_stat_statements'
track_activity_query_size = 32764
maintenance_work_mem = 250MB
work_mem = 32MB
shared_buffers = 8058MB
effective_cache_size = 16116MB

in 12v I also added the following settings :
log_directory = 'pg_log'
enable_partitionwise_join = on
enable_partitionwise_aggregate = on
max_worker_processes = 8   # (change requires restart)
max_parallel_workers_per_gather = 4# taken from max_parallel_workers
max_parallel_workers = 8   # maximum number of max_worker_pr

I tested a few applications flows and I saw that the 9.6 version is faster.
I also did a few simple tests (enabled \timing) :

12v :
postgres=# create table test1 as select generate_series(1,1);
SELECT 1
Time: 35.099 ms

postgres=# select count(*) from test1;
 count
---
 1
(1 row)

Time: 4.819 ms

9.6v :
postgres=# create table test1 as select generate_series(1,1);
SELECT 1
Time: 19.962 ms

postgres=# select count(*) from test1;
 count
---
 1
(1 row)

Time: 1.541 ms

Any idea what can cause it ? What can I check?
This degredation is visible in many queries that we use ..

After the upgrade to 12v version I run analyze on all tables..

Thanks.


Re: performance degredation after upgrade from 9.6 to 12

2019-11-24 Thread Thomas Poty
Hello,
did you run ananlyze on your db?

Le dim. 24 nov. 2019 à 13:53, Mariel Cherkassky 
a écrit :

> Hey all,
> I'm testing performance of two identical machines one in 9.6 and the
> second one is in 12. The second machine is a clone of the first one + db
> upgrade to 12 beta 3 (Yes I'm aware 12.1 was released).
>
> machine stats :
> 32gb ram
> 8 cpu
> regular hd (not ssd)
>
> my postgresql.confg settings:
>
> max_wal_size = 2GB
> min_wal_size = 1GB
> wal_buffers = 16MB
> checkpoint_completion_target = 0.9
> checkpoint_timeout = 30min
> log_checkpoints = on
> log_lock_waits = on
> log_temp_files = 1024
> log_min_duration_statement = 1000
> log_autovacuum_min_duration = 5000
> autovacuum_max_workers = 4
> autovacuum_vacuum_cost_limit = 800
> autovacuum_vacuum_cost_delay = 10ms
> standard_conforming_strings = off
> max_locks_per_transaction = 5000
> max_connections = 500
> log_line_prefix = '%t %d %p  '
> random_page_cost = 2.0
> deadlock_timeout = 5s
> shared_preload_libraries = 'pg_stat_statements'
> track_activity_query_size = 32764
> maintenance_work_mem = 250MB
> work_mem = 32MB
> shared_buffers = 8058MB
> effective_cache_size = 16116MB
>
> in 12v I also added the following settings :
> log_directory = 'pg_log'
> enable_partitionwise_join = on
> enable_partitionwise_aggregate = on
> max_worker_processes = 8   # (change requires restart)
> max_parallel_workers_per_gather = 4# taken from max_parallel_workers
> max_parallel_workers = 8   # maximum number of max_worker_pr
>
> I tested a few applications flows and I saw that the 9.6 version is
> faster. I also did a few simple tests (enabled \timing) :
>
> 12v :
> postgres=# create table test1 as select generate_series(1,1);
> SELECT 1
> Time: 35.099 ms
>
> postgres=# select count(*) from test1;
>  count
> ---
>  1
> (1 row)
>
> Time: 4.819 ms
>
> 9.6v :
> postgres=# create table test1 as select generate_series(1,1);
> SELECT 1
> Time: 19.962 ms
>
> postgres=# select count(*) from test1;
>  count
> ---
>  1
> (1 row)
>
> Time: 1.541 ms
>
> Any idea what can cause it ? What can I check?
> This degredation is visible in many queries that we use ..
>
> After the upgrade to 12v version I run analyze on all tables..
>
> Thanks.
>


Re: performance degredation after upgrade from 9.6 to 12

2019-11-24 Thread Andrew Zakharov
Hi there - 
I have same feelings. Try set max_parallel_workers_per_gather to zero. I don't 
think that comparison non-parallel and parallel versions is correct (don't say 
anything about parallel in 9.6 pls) 
What explain says? I suppose you will have different exec plans. Optimizer 
stranges of 11 and 12 ver have been discussed. Look thru the archive, but I 
didn't remember the problem status - resolved or not.
Andrew. 

24 ноября 2019 г. 15:53:19 GMT+03:00, Mariel Cherkassky 
 пишет:
>Hey all,
>I'm testing performance of two identical machines one in 9.6 and the
>second
>one is in 12. The second machine is a clone of the first one + db
>upgrade
>to 12 beta 3 (Yes I'm aware 12.1 was released).
>
>machine stats :
>32gb ram
>8 cpu
>regular hd (not ssd)
>
>my postgresql.confg settings:
>
>max_wal_size = 2GB
>min_wal_size = 1GB
>wal_buffers = 16MB
>checkpoint_completion_target = 0.9
>checkpoint_timeout = 30min
>log_checkpoints = on
>log_lock_waits = on
>log_temp_files = 1024
>log_min_duration_statement = 1000
>log_autovacuum_min_duration = 5000
>autovacuum_max_workers = 4
>autovacuum_vacuum_cost_limit = 800
>autovacuum_vacuum_cost_delay = 10ms
>standard_conforming_strings = off
>max_locks_per_transaction = 5000
>max_connections = 500
>log_line_prefix = '%t %d %p  '
>random_page_cost = 2.0
>deadlock_timeout = 5s
>shared_preload_libraries = 'pg_stat_statements'
>track_activity_query_size = 32764
>maintenance_work_mem = 250MB
>work_mem = 32MB
>shared_buffers = 8058MB
>effective_cache_size = 16116MB
>
>in 12v I also added the following settings :
>log_directory = 'pg_log'
>enable_partitionwise_join = on
>enable_partitionwise_aggregate = on
>max_worker_processes = 8   # (change requires restart)
>max_parallel_workers_per_gather = 4# taken from
>max_parallel_workers
>max_parallel_workers = 8   # maximum number of
>max_worker_pr
>
>I tested a few applications flows and I saw that the 9.6 version is
>faster.
>I also did a few simple tests (enabled \timing) :
>
>12v :
>postgres=# create table test1 as select generate_series(1,1);
>SELECT 1
>Time: 35.099 ms
>
>postgres=# select count(*) from test1;
> count
>---
> 1
>(1 row)
>
>Time: 4.819 ms
>
>9.6v :
>postgres=# create table test1 as select generate_series(1,1);
>SELECT 1
>Time: 19.962 ms
>
>postgres=# select count(*) from test1;
> count
>---
> 1
>(1 row)
>
>Time: 1.541 ms
>
>Any idea what can cause it ? What can I check?
>This degredation is visible in many queries that we use ..
>
>After the upgrade to 12v version I run analyze on all tables..
>
>Thanks.

--
С уважением,
Андрей Захаров

Re: performance degredation after upgrade from 9.6 to 12

2019-11-24 Thread Mariel Cherkassky
Hey Andrew,
It seems that changing this parameter worked for me.
Setting it to zero means that there wont be any parallel workers for one
query right ?
Is it something familiar this problem with the gatherers ?


Re: performance degredation after upgrade from 9.6 to 12

2019-11-24 Thread Jeff Janes
On Sun, Nov 24, 2019 at 7:53 AM Mariel Cherkassky <
[email protected]> wrote:

The second machine is a clone of the first one + db upgrade to 12 beta 3
> (Yes I'm aware 12.1 was released).
>

So then fix it.  Why spend time investigating obsolete software?  Was
12Beta3 compiled with --enable-cassert?

Cheers,

Jeff

>


Re: performance degredation after upgrade from 9.6 to 12

2019-11-24 Thread Jeff Janes
On Sun, Nov 24, 2019 at 8:52 AM Mariel Cherkassky <
[email protected]> wrote:

> Hey Andrew,
> It seems that changing this parameter worked for me.
> Setting it to zero means that there wont be any parallel workers for one
> query right ?
> Is it something familiar this problem with the gatherers ?
>

Your example would not be using parallel workers anyway, regardless of the
setting of max_parallel_workers_per_gather, so I don't see how changing
this could have worked for you.  Unless you mean it worked in your full
test, rather than in your test case. I doubt your test case benchmarking
was very reliable to start with, you only show a single execution and
didn't indicate you had more unshown ones.

If I do more credible benchmarking, I do get a performance regression but
it closer is to 16% than to 3 fold.  And it doesn't depend on the setting
of max_parallel_workers_per_gather.  I doubt a regression of this size is
even worth investigating.

pgbench -T300 -P5 -f <(echo "select count(*) from test1") -p 9912 -n -M
prepared

Cheers,

Jeff


Re: performance degredation after upgrade from 9.6 to 12

2019-11-24 Thread Mariel Cherkassky
Hey Jeff,
This example was only used to show that pg96  had better perfomance than
pg12 in a very simple case.
 In all the tests that I run most of the queries took less time on 9.6`s
version.  I dont know why, but as you can see after disabling the parameter
the simple test that I did showed different results.  I intend to test this
theory tomorrow. I'm going to disable the parameter and run the same
application flows that I have on both machines (9.6 vs 12 with zero value
for the param).

I didnt send this mail after doing just one simple test, I have more than
100 queries that work better on 9.6 . If u have any explanation I will be
happy to hear.
I'll update tomorrow once I'll have the results..


Re: performance degredation after upgrade from 9.6 to 12

2019-11-24 Thread John Felix

Op 24-11-2019 om 19:05 schreef Mariel Cherkassky:

Hey Jeff,
This example was only used to show that pg96  had better perfomance 
than pg12 in a very simple case.
 In all the tests that I run most of the queries took less time on 
9.6`s version.  I dont know why, but as you can see after 
disabling the parameter the simple test that I did showed different 
results.  I intend to test this theory tomorrow. I'm going to disable 
the parameter and run the same application flows that I have on both 
machines (9.6 vs 12 with zero value for the param).


I didnt send this mail after doing just one simple test, I have more 
than 100 queries that work better on 9.6 . If u have any explanation I 
will be happy to hear.

I'll update tomorrow once I'll have the results..

I've had the same experience with parallel query. By default parallel 
query is disabled in 9.6. When we upgraded from 9.6 to 10 is was 
significant slower till I disabled parallel query.


We have a very small database (40 tables and all together max 4GB) and 
we have no long running queries (the largest queries run max 2 to 3 
seconds). That's when parallel query gives a performance degrade in my 
opinion.


Best regards,

John Felix




Re: performance degredation after upgrade from 9.6 to 12

2019-11-24 Thread Jeff Janes
On Sun, Nov 24, 2019 at 1:05 PM Mariel Cherkassky <
[email protected]> wrote:

> Hey Jeff,
> This example was only used to show that pg96  had better perfomance than
> pg12 in a very simple case.
>

OK, but do you agree that a 15% slow down is more realistic than 3 fold
one?  Or are you still getting 3 fold slow down with more careful testing
and over a wide variety of queries?

I find that the main regression (about 15%) in your example occurs in major
version 10, at the following commit:

commit b8d7f053c5c2bf2a7e8734fe3327f6a8bc711755
Author: Andres Freund 
Date:   Tue Mar 14 15:45:36 2017 -0700

Faster expression evaluation and targetlist projection.

It is disappointing that this made this case slower rather than faster, and
that the "future work" alluded to either hasn't happened, or wasn't
effective for this example.  I also tested the same example, only 100 times
more rows, and still see the regression at about 16%.  This is a major
infrastructure change patch which has been extensively built on since then,
the chances of reverting it are very small.  It is making an omelette, and
your example is one of the eggs that got broken.

Performance changes in a large body of queries are usually not all due to
the same thing.  Are you a position to custom compile your own PostgreSQL?
It would be nice to test this commit against the one before it, and see how
much of the change in your real queries is explained by this one thing (or
whether any of it is)


>  In all the tests that I run most of the queries took less time on 9.6`s
> version.  I dont know why, but as you can see after disabling the parameter
> the simple test that I did showed different results.
>

I can't see--You didn't post results for that.  And running your test on my
own system doesn't show that at all.  In your test case,
max_parallel_workers_per_gather makes no difference.  With 100 times more
rows, setting it to 0 actually slows things down, as at that size
parallelization is useful and disabling it hurts.

Of course parallel query might be hurting some of the other queries, but
for the one example you show you will have to show something more
convincing for me to believe that that is what caused it.

It is easy to benchmark with something like:

PGOPTIONS="-c max_parallel_workers_per_gather=0" pgbench -T30 -f <(echo
"select count(*) from test1") -p 9912 -n -M prepared

If it is other queries where mpwpg is making a difference, than one issue
could be that your settings of parallel_setup_cost and/or
parllel_tuple_cost are too low (although I usually find the default
settings too high, not too low); or you are running your test concurrently
already and so don't need parallel query to fully load the CPUs and trying
to use parallel query just increases the overhead; or your machine doesn't
have the number of truly effective CPUs you think it does.

Cheers,

Jeff

>


Re: performance degredation after upgrade from 9.6 to 12

2019-11-24 Thread Mariel Cherkassky
Hey Jeff,
First of all thank you again for the quick response. I really appreciate
your comments.
Unfortunately I installed pg from rpms so I cant compile my current env but
I can install it from source code and migrate the data via pg_dump. Can you
explain how can I compile the sources without this commit ?
I understand that my test such a good test but I thought that using
something simple can show my issue. I was sure that I added the results of
the same tests after disabling the param but it seems that I didnt do it.
I'll send it tomorrow because right now im not inforont of my pc. The
bottom line, after disabling this param the same query took less time in
pg12(significly).
>From this email chain and other peoples comments it seems that I'm not the
only one who was facing this issue. One thing that might worth mentitoning,
in all my tests I didnt use a big env, my db was at max 30GB and the max
duration of a query was about 10s. It seems that this happens when we use
queries with short duration (in your test you mentioned u used the same
test but with 100 times more rows, maybe that is the reason it took more
time ?) - maybe on "short" queries it creates degredation ? It is just an
assumption...

I'll try to do most tests  tomorrow and update with results.

>