performance degredation after upgrade from 9.6 to 12
Hey all, I'm testing performance of two identical machines one in 9.6 and the second one is in 12. The second machine is a clone of the first one + db upgrade to 12 beta 3 (Yes I'm aware 12.1 was released). machine stats : 32gb ram 8 cpu regular hd (not ssd) my postgresql.confg settings: max_wal_size = 2GB min_wal_size = 1GB wal_buffers = 16MB checkpoint_completion_target = 0.9 checkpoint_timeout = 30min log_checkpoints = on log_lock_waits = on log_temp_files = 1024 log_min_duration_statement = 1000 log_autovacuum_min_duration = 5000 autovacuum_max_workers = 4 autovacuum_vacuum_cost_limit = 800 autovacuum_vacuum_cost_delay = 10ms standard_conforming_strings = off max_locks_per_transaction = 5000 max_connections = 500 log_line_prefix = '%t %d %p ' random_page_cost = 2.0 deadlock_timeout = 5s shared_preload_libraries = 'pg_stat_statements' track_activity_query_size = 32764 maintenance_work_mem = 250MB work_mem = 32MB shared_buffers = 8058MB effective_cache_size = 16116MB in 12v I also added the following settings : log_directory = 'pg_log' enable_partitionwise_join = on enable_partitionwise_aggregate = on max_worker_processes = 8 # (change requires restart) max_parallel_workers_per_gather = 4# taken from max_parallel_workers max_parallel_workers = 8 # maximum number of max_worker_pr I tested a few applications flows and I saw that the 9.6 version is faster. I also did a few simple tests (enabled \timing) : 12v : postgres=# create table test1 as select generate_series(1,1); SELECT 1 Time: 35.099 ms postgres=# select count(*) from test1; count --- 1 (1 row) Time: 4.819 ms 9.6v : postgres=# create table test1 as select generate_series(1,1); SELECT 1 Time: 19.962 ms postgres=# select count(*) from test1; count --- 1 (1 row) Time: 1.541 ms Any idea what can cause it ? What can I check? This degredation is visible in many queries that we use .. After the upgrade to 12v version I run analyze on all tables.. Thanks.
Re: performance degredation after upgrade from 9.6 to 12
Hello, did you run ananlyze on your db? Le dim. 24 nov. 2019 à 13:53, Mariel Cherkassky a écrit : > Hey all, > I'm testing performance of two identical machines one in 9.6 and the > second one is in 12. The second machine is a clone of the first one + db > upgrade to 12 beta 3 (Yes I'm aware 12.1 was released). > > machine stats : > 32gb ram > 8 cpu > regular hd (not ssd) > > my postgresql.confg settings: > > max_wal_size = 2GB > min_wal_size = 1GB > wal_buffers = 16MB > checkpoint_completion_target = 0.9 > checkpoint_timeout = 30min > log_checkpoints = on > log_lock_waits = on > log_temp_files = 1024 > log_min_duration_statement = 1000 > log_autovacuum_min_duration = 5000 > autovacuum_max_workers = 4 > autovacuum_vacuum_cost_limit = 800 > autovacuum_vacuum_cost_delay = 10ms > standard_conforming_strings = off > max_locks_per_transaction = 5000 > max_connections = 500 > log_line_prefix = '%t %d %p ' > random_page_cost = 2.0 > deadlock_timeout = 5s > shared_preload_libraries = 'pg_stat_statements' > track_activity_query_size = 32764 > maintenance_work_mem = 250MB > work_mem = 32MB > shared_buffers = 8058MB > effective_cache_size = 16116MB > > in 12v I also added the following settings : > log_directory = 'pg_log' > enable_partitionwise_join = on > enable_partitionwise_aggregate = on > max_worker_processes = 8 # (change requires restart) > max_parallel_workers_per_gather = 4# taken from max_parallel_workers > max_parallel_workers = 8 # maximum number of max_worker_pr > > I tested a few applications flows and I saw that the 9.6 version is > faster. I also did a few simple tests (enabled \timing) : > > 12v : > postgres=# create table test1 as select generate_series(1,1); > SELECT 1 > Time: 35.099 ms > > postgres=# select count(*) from test1; > count > --- > 1 > (1 row) > > Time: 4.819 ms > > 9.6v : > postgres=# create table test1 as select generate_series(1,1); > SELECT 1 > Time: 19.962 ms > > postgres=# select count(*) from test1; > count > --- > 1 > (1 row) > > Time: 1.541 ms > > Any idea what can cause it ? What can I check? > This degredation is visible in many queries that we use .. > > After the upgrade to 12v version I run analyze on all tables.. > > Thanks. >
Re: performance degredation after upgrade from 9.6 to 12
Hi there - I have same feelings. Try set max_parallel_workers_per_gather to zero. I don't think that comparison non-parallel and parallel versions is correct (don't say anything about parallel in 9.6 pls) What explain says? I suppose you will have different exec plans. Optimizer stranges of 11 and 12 ver have been discussed. Look thru the archive, but I didn't remember the problem status - resolved or not. Andrew. 24 ноября 2019 г. 15:53:19 GMT+03:00, Mariel Cherkassky пишет: >Hey all, >I'm testing performance of two identical machines one in 9.6 and the >second >one is in 12. The second machine is a clone of the first one + db >upgrade >to 12 beta 3 (Yes I'm aware 12.1 was released). > >machine stats : >32gb ram >8 cpu >regular hd (not ssd) > >my postgresql.confg settings: > >max_wal_size = 2GB >min_wal_size = 1GB >wal_buffers = 16MB >checkpoint_completion_target = 0.9 >checkpoint_timeout = 30min >log_checkpoints = on >log_lock_waits = on >log_temp_files = 1024 >log_min_duration_statement = 1000 >log_autovacuum_min_duration = 5000 >autovacuum_max_workers = 4 >autovacuum_vacuum_cost_limit = 800 >autovacuum_vacuum_cost_delay = 10ms >standard_conforming_strings = off >max_locks_per_transaction = 5000 >max_connections = 500 >log_line_prefix = '%t %d %p ' >random_page_cost = 2.0 >deadlock_timeout = 5s >shared_preload_libraries = 'pg_stat_statements' >track_activity_query_size = 32764 >maintenance_work_mem = 250MB >work_mem = 32MB >shared_buffers = 8058MB >effective_cache_size = 16116MB > >in 12v I also added the following settings : >log_directory = 'pg_log' >enable_partitionwise_join = on >enable_partitionwise_aggregate = on >max_worker_processes = 8 # (change requires restart) >max_parallel_workers_per_gather = 4# taken from >max_parallel_workers >max_parallel_workers = 8 # maximum number of >max_worker_pr > >I tested a few applications flows and I saw that the 9.6 version is >faster. >I also did a few simple tests (enabled \timing) : > >12v : >postgres=# create table test1 as select generate_series(1,1); >SELECT 1 >Time: 35.099 ms > >postgres=# select count(*) from test1; > count >--- > 1 >(1 row) > >Time: 4.819 ms > >9.6v : >postgres=# create table test1 as select generate_series(1,1); >SELECT 1 >Time: 19.962 ms > >postgres=# select count(*) from test1; > count >--- > 1 >(1 row) > >Time: 1.541 ms > >Any idea what can cause it ? What can I check? >This degredation is visible in many queries that we use .. > >After the upgrade to 12v version I run analyze on all tables.. > >Thanks. -- С уважением, Андрей Захаров
Re: performance degredation after upgrade from 9.6 to 12
Hey Andrew, It seems that changing this parameter worked for me. Setting it to zero means that there wont be any parallel workers for one query right ? Is it something familiar this problem with the gatherers ?
Re: performance degredation after upgrade from 9.6 to 12
On Sun, Nov 24, 2019 at 7:53 AM Mariel Cherkassky < [email protected]> wrote: The second machine is a clone of the first one + db upgrade to 12 beta 3 > (Yes I'm aware 12.1 was released). > So then fix it. Why spend time investigating obsolete software? Was 12Beta3 compiled with --enable-cassert? Cheers, Jeff >
Re: performance degredation after upgrade from 9.6 to 12
On Sun, Nov 24, 2019 at 8:52 AM Mariel Cherkassky < [email protected]> wrote: > Hey Andrew, > It seems that changing this parameter worked for me. > Setting it to zero means that there wont be any parallel workers for one > query right ? > Is it something familiar this problem with the gatherers ? > Your example would not be using parallel workers anyway, regardless of the setting of max_parallel_workers_per_gather, so I don't see how changing this could have worked for you. Unless you mean it worked in your full test, rather than in your test case. I doubt your test case benchmarking was very reliable to start with, you only show a single execution and didn't indicate you had more unshown ones. If I do more credible benchmarking, I do get a performance regression but it closer is to 16% than to 3 fold. And it doesn't depend on the setting of max_parallel_workers_per_gather. I doubt a regression of this size is even worth investigating. pgbench -T300 -P5 -f <(echo "select count(*) from test1") -p 9912 -n -M prepared Cheers, Jeff
Re: performance degredation after upgrade from 9.6 to 12
Hey Jeff, This example was only used to show that pg96 had better perfomance than pg12 in a very simple case. In all the tests that I run most of the queries took less time on 9.6`s version. I dont know why, but as you can see after disabling the parameter the simple test that I did showed different results. I intend to test this theory tomorrow. I'm going to disable the parameter and run the same application flows that I have on both machines (9.6 vs 12 with zero value for the param). I didnt send this mail after doing just one simple test, I have more than 100 queries that work better on 9.6 . If u have any explanation I will be happy to hear. I'll update tomorrow once I'll have the results..
Re: performance degredation after upgrade from 9.6 to 12
Op 24-11-2019 om 19:05 schreef Mariel Cherkassky: Hey Jeff, This example was only used to show that pg96 had better perfomance than pg12 in a very simple case. In all the tests that I run most of the queries took less time on 9.6`s version. I dont know why, but as you can see after disabling the parameter the simple test that I did showed different results. I intend to test this theory tomorrow. I'm going to disable the parameter and run the same application flows that I have on both machines (9.6 vs 12 with zero value for the param). I didnt send this mail after doing just one simple test, I have more than 100 queries that work better on 9.6 . If u have any explanation I will be happy to hear. I'll update tomorrow once I'll have the results.. I've had the same experience with parallel query. By default parallel query is disabled in 9.6. When we upgraded from 9.6 to 10 is was significant slower till I disabled parallel query. We have a very small database (40 tables and all together max 4GB) and we have no long running queries (the largest queries run max 2 to 3 seconds). That's when parallel query gives a performance degrade in my opinion. Best regards, John Felix
Re: performance degredation after upgrade from 9.6 to 12
On Sun, Nov 24, 2019 at 1:05 PM Mariel Cherkassky < [email protected]> wrote: > Hey Jeff, > This example was only used to show that pg96 had better perfomance than > pg12 in a very simple case. > OK, but do you agree that a 15% slow down is more realistic than 3 fold one? Or are you still getting 3 fold slow down with more careful testing and over a wide variety of queries? I find that the main regression (about 15%) in your example occurs in major version 10, at the following commit: commit b8d7f053c5c2bf2a7e8734fe3327f6a8bc711755 Author: Andres Freund Date: Tue Mar 14 15:45:36 2017 -0700 Faster expression evaluation and targetlist projection. It is disappointing that this made this case slower rather than faster, and that the "future work" alluded to either hasn't happened, or wasn't effective for this example. I also tested the same example, only 100 times more rows, and still see the regression at about 16%. This is a major infrastructure change patch which has been extensively built on since then, the chances of reverting it are very small. It is making an omelette, and your example is one of the eggs that got broken. Performance changes in a large body of queries are usually not all due to the same thing. Are you a position to custom compile your own PostgreSQL? It would be nice to test this commit against the one before it, and see how much of the change in your real queries is explained by this one thing (or whether any of it is) > In all the tests that I run most of the queries took less time on 9.6`s > version. I dont know why, but as you can see after disabling the parameter > the simple test that I did showed different results. > I can't see--You didn't post results for that. And running your test on my own system doesn't show that at all. In your test case, max_parallel_workers_per_gather makes no difference. With 100 times more rows, setting it to 0 actually slows things down, as at that size parallelization is useful and disabling it hurts. Of course parallel query might be hurting some of the other queries, but for the one example you show you will have to show something more convincing for me to believe that that is what caused it. It is easy to benchmark with something like: PGOPTIONS="-c max_parallel_workers_per_gather=0" pgbench -T30 -f <(echo "select count(*) from test1") -p 9912 -n -M prepared If it is other queries where mpwpg is making a difference, than one issue could be that your settings of parallel_setup_cost and/or parllel_tuple_cost are too low (although I usually find the default settings too high, not too low); or you are running your test concurrently already and so don't need parallel query to fully load the CPUs and trying to use parallel query just increases the overhead; or your machine doesn't have the number of truly effective CPUs you think it does. Cheers, Jeff >
Re: performance degredation after upgrade from 9.6 to 12
Hey Jeff, First of all thank you again for the quick response. I really appreciate your comments. Unfortunately I installed pg from rpms so I cant compile my current env but I can install it from source code and migrate the data via pg_dump. Can you explain how can I compile the sources without this commit ? I understand that my test such a good test but I thought that using something simple can show my issue. I was sure that I added the results of the same tests after disabling the param but it seems that I didnt do it. I'll send it tomorrow because right now im not inforont of my pc. The bottom line, after disabling this param the same query took less time in pg12(significly). >From this email chain and other peoples comments it seems that I'm not the only one who was facing this issue. One thing that might worth mentitoning, in all my tests I didnt use a big env, my db was at max 30GB and the max duration of a query was about 10s. It seems that this happens when we use queries with short duration (in your test you mentioned u used the same test but with 100 times more rows, maybe that is the reason it took more time ?) - maybe on "short" queries it creates degredation ? It is just an assumption... I'll try to do most tests tomorrow and update with results. >
