Runtime partition pruning
Hello list,

Is runtime partition pruning available in PostgreSQL 12? I have a table partitioned by range on column id (primary key). For the query (client_id is passed as a parameter from the application):

select * from test where id > 0 and id < 1000 and client_id = ? ;

partition pruning works fine. However

select * from test where id between client_id-10 and client_id+10 and client_id = ? ;

does not (it scans all partitions in parallel). Is this expected?

Regards,
Radu
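For reference, the behaviour reproduces on a toy table; a minimal sketch (the table layout, partition bounds and the literal 42 below are made up for illustration):

-- hypothetical range-partitioned table, loosely matching the description above
CREATE TABLE test (
    id        bigint NOT NULL,
    client_id bigint NOT NULL,
    PRIMARY KEY (id)
) PARTITION BY RANGE (id);

CREATE TABLE test_p0 PARTITION OF test FOR VALUES FROM (0)    TO (1000);
CREATE TABLE test_p1 PARTITION OF test FOR VALUES FROM (1000) TO (2000);
CREATE TABLE test_p2 PARTITION OF test FOR VALUES FROM (2000) TO (3000);

-- constant bounds on the partition key: pruning happens at plan time,
-- only test_p0 appears in the plan
EXPLAIN SELECT * FROM test WHERE id > 0 AND id < 1000 AND client_id = 42;

-- bounds expressed through another column: the planner does not infer a
-- constant range for id from client_id = 42, so all partitions are scanned
EXPLAIN SELECT * FROM test
 WHERE id BETWEEN client_id - 10 AND client_id + 10
   AND client_id = 42;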
Re: Runtime partition pruning
Thanks. Yes, the query with the same parameter seems to work as expected. It doesn't help us, though, as we are trying to transparently support partitioning using an ORM and we cannot change the parameters. Using the column name would have been much easier.

Regards,
Radu

On Mon, Mar 23, 2020 at 5:56 PM Michael Lewis wrote:

>> select * from test where id between client_id-10 and client_id+10 and client_id=?;
>>
>> does not (it scans all partitions in parallel).
>> Is it expected?
>
> Yes. But the below would work fine I expect, since the planner would know a
> constant range for id. I would be very surprised if the optimizer had some
> condition rewrite rules to handle just the scenario you show.
>
> select * from test where id between ?-10 and ?+10 and client_id=?;
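For completeness, a sketch of the suggested rewrite against the toy table above, using a prepared statement to mimic an application-side bind parameter; the partition-key bounds are now expressions over the parameter, which v12 can prune on either at plan time (custom plan) or at executor startup (generic plan, reported as "Subplans Removed" in EXPLAIN ANALYZE):

PREPARE q (bigint) AS
    SELECT * FROM test
     WHERE id BETWEEN $1 - 10 AND $1 + 10
       AND client_id = $1;

EXPLAIN (ANALYZE) EXECUTE q(42);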
How to debug "background worker "parallel worker" (PID 32152) was terminated by signal 11: Segmentation fault"
Hi,

Can you guide me how to debug a PostgreSQL crash? I have a development VM (KVM) running CentOS 8.1 (up to date with all patches, kernel 4.18.0-147.8.1.el8_1.x86_64, postgresql12-server-12.2-2PGDG.rhel8.x86_64) with a very large partitioned table (a few hundred GBs). Creating an index on the partitioned table crashed the PostgreSQL instance. I have in the log:

2020-04-26 17:35:50.065 EEST [8385] LOG:  background worker "parallel worker" (PID 32152) was terminated by signal 11: Segmentation fault
2020-04-26 17:35:50.065 EEST [8385] DETAIL:  Failed process was running: CREATE INDEX IDX_instrid ON ach.achiteminfo USING btree (instrid);
2020-04-26 17:35:50.065 EEST [8385] LOG:  terminating any other active server processes
2020-04-26 17:35:50.065 EEST [32134] WARNING:  terminating connection because of crash of another server process
2020-04-26 17:35:50.065 EEST [32134] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2020-04-26 17:35:50.065 EEST [32134] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
(the same WARNING/DETAIL/HINT block repeats for backends 32146, 32126, 32130, 32129, 32132, 32142, 32143 and 32140)
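A rough sketch of the usual first steps for getting a backtrace out of such a crash, assuming the PGDG debuginfo packages are reachable and core dumps are being captured (systemd-coredump, or an unlimited core limit for the service):

# was a core dump captured for the crashed worker?
coredumpctl list postgres

# install debugging symbols matching the installed server build
dnf debuginfo-install postgresql12-server

# open the most recent postgres core in gdb and collect a full backtrace
coredumpctl gdb postgres
(gdb) bt full

# if nothing was captured, check kernel.core_pattern and the service's
# LimitCORE setting, then reproduce the CREATE INDEX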
Postgresql 16.3 Out Of Memory
Hello,

I have an out of memory problem after upgrading from PostgreSQL 12 to 16.3. I have identified one query that can reproduce the error on demand. Once the query starts, it will eventually exhaust all RAM and swap until the OOM killer stops PostgreSQL.

The setup is as follows: one VMware VM, RHEL 8.9, 64 GB RAM, 16 vCPU, 32 GB swap. The database is used by a local Java application that takes up to 22 GB (hard limit). The application uses 2 db connections (long running, the app uses a connection pool). The database has about 10 tables, with 3 large tables involved in the problem query:

    relation    | total_size
----------------+------------
 outputrequest  | 217 GB
 inputrequest   | 149 GB
 tran           | 5041 MB

I have the following settings:

shared_buffers = '8192MB'
effective_cache_size = '24GB'
maintenance_work_mem = '2GB'
checkpoint_completion_target = '0.9'
wal_buffers = '16MB'
default_statistics_target = '100'
random_page_cost = '1.1'
effective_io_concurrency = '200'
work_mem = '104857kB'
min_wal_size = '1GB'
max_wal_size = '4GB'
max_worker_processes = '16'
max_parallel_workers_per_gather = '4'
max_parallel_workers = '16'
max_parallel_maintenance_workers = '4'
jit = 'off'

Besides the OOM there is another (possibly related) problem: PostgreSQL's memory usage is significantly higher than shared_buffers plus the memory used per connection. On my system, with shared_buffers at 8 GB and work_mem around 100 MB, I see memory usage of 20 GB with the system running just simple inserts (the memory might have increased due to previous queries).

I have found that if I set enable_parallel_hash = 'off', I still get high memory usage for the problem query (above 30 GB) but not OOM. With enable_parallel_hash = 'on', I always get OOM. For simplicity I've restricted the postgresql systemd service to cap the memory at 36 GB (so that I don't have to wait for the 32 GB swap to be filled before the OOM).
This is the memory usage I get when running the query, just before systemd restarts postgres. Memory usage is 35.9 GB, just below the 36 GB limit (the query is running under EXPLAIN (ANALYZE, BUFFERS); the other 2 idle connections are from the application):

[postgres@ips3 ~]$ systemctl status postgresql-16; top -u postgres -n 1 -c
● postgresql-16.service - PostgreSQL 16 database server
   Loaded: loaded (/etc/systemd/system/postgresql-16.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/postgresql-16.service.d
           └─override.conf
   Active: active (running) since Mon 2024-06-03 04:23:16 +08; 11h ago
     Docs: https://www.postgresql.org/docs/16/static/
  Process: 283703 ExecStartPre=/usr/pgsql-16/bin/postgresql-16-check-db-dir ${PGDATA} (code=exited, status=0/SUCCESS)
 Main PID: 283709 (postgres)
    Tasks: 14 (limit: 408404)
   Memory: 35.9G (limit: 36.0G)
   CGroup: /system.slice/postgresql-16.service
           ├─283709 /usr/pgsql-16/bin/postgres -D /var/lib/pgsql/16/data/
           ├─283710 postgres: logger
           ├─295499 postgres: checkpointer
           ├─295500 postgres: background writer
           ├─295506 postgres: walwriter
           ├─295507 postgres: autovacuum launcher
           ├─295508 postgres: logical replication launcher
           ├─295511 postgres: db1 db1 127.0.0.1(37720) idle
           ├─295946 postgres: postgres db1 [local] EXPLAIN
           ├─295947 postgres: parallel worker for PID 295946
           ├─295948 postgres: parallel worker for PID 295946
           ├─295949 postgres: parallel worker for PID 295946
           ├─295950 postgres: parallel worker for PID 295946
           └─296037 postgres: db1 db1 127.0.0.1(41708) idle

top - 16:14:32 up 14 days,  4:42,  1 user,  load average: 5.41, 3.58, 2.25
Tasks: 405 total,   6 running, 399 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.1 us,  5.2 sy,  0.0 ni, 71.2 id, 21.3 wa,  0.7 hi,  0.4 si,  0.0 st
MiB Mem :  63873.4 total,   2162.0 free,  57177.8 used,   4533.6 buff/cache
MiB Swap:  32260.0 total,   3813.8 free,  28446.2 used.   4349.0 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 295947 postgres  20   0   21.2g   7.4g 654168 R  20.0  11.8   1:31.81 postgres: parallel worker for PID 295946
 295949 postgres  20   0   21.3g   7.5g 656316 R  20.0  12.0   1:32.57 postgres: parallel worker for PID 295946
 295950 postgres  20   0   21.2g   7.5g 654756 R  20.0  12.1   1:31.99 postgres: parallel worker for PID 295946
 295946 postgres  20   0   21.2g   7.4g 655684 R  13.3  11.8   1:31.86 postgres: postgres db1 [local] EXPLAIN
 295948 postgres  20   0   21.6g   6.0g 656024 R  13.3   9.6   1:37.91 postgres: parallel worker for PID 295946
 283709 postgres  20   0 8960628  44144  43684 S   0.0   0.1   0:42.28 /usr/pgsql-16/bin/postgres -D /var/lib/pgsql/16/data/
 283710 postgres  20   0  355288   4876   4576 S   0.0   0.0   0:04.42 postgres: logger
 295499 postgres  20   0 8960936 107376 106868 S   0.0   0.2   0:01.75 postgres: checkpointer
 295500 postg
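One way to see where that memory is going from inside the server, rather than from top, is the memory-context instrumentation available since v14 (so usable on 16.3); the PID below is the EXPLAIN backend from the status output above:

-- write a breakdown of that backend's memory contexts to the server log
-- (superuser, or a role granted permission to call it)
SELECT pg_log_backend_memory_contexts(295946);

-- the same kind of breakdown for the current session, as a result set
SELECT name, parent, total_bytes, used_bytes
  FROM pg_backend_memory_contexts
 ORDER BY total_bytes DESC
 LIMIT 20;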
Re: Postgresql 16.3 Out Of Memory
It looks like I did not copy the list.

> I did run VACUUM ANALYZE after the upgrade. I've even run it now before
> getting a new explain plan that is very similar (just the costs differ):
>
> Gather Merge  (cost=12336145.92..16111570.23 rows=31531596 width=66)
>   Workers Planned: 4
>   ->  Sort  (cost=12335145.86..12354853.11 rows=7882899 width=66)
>         Sort Key: t.msg_status DESC, t.input_sequence
>         ->  Parallel Hash Right Join  (cost=9376528.66..11242773.26 rows=7882899 width=66)
>               Hash Cond: (snd_tro.reply_input_sequence = t.input_sequence)
>               ->  Parallel Seq Scan on tbl_outputrequest snd_tro  (cost=0.00..1431919.45 rows=20057645 width=16)
>               ->  Parallel Hash  (cost=9254599.76..9254599.76 rows=5245992 width=58)
>                     ->  Parallel Hash Right Join  (cost=7458636.44..9254599.76 rows=5245992 width=58)
>                           Hash Cond: (rec_tro.input_sequence = t.input_sequence)
>                           ->  Parallel Seq Scan on tbl_outputrequest rec_tro  (cost=0.00..1431919.45 rows=20057645 width=16)
>                           ->  Parallel Hash  (cost=7380902.99..7380902.99 rows=3491156 width=50)
>                                 ->  Parallel Hash Right Join  (cost=5592677.17..7380902.99 rows=3491156 width=50)
>                                       Hash Cond: (rpl_snd_tro.reply_input_sequence = r.input_sequence)
>                                       ->  Parallel Seq Scan on tbl_outputrequest rpl_snd_tro  (cost=0.00..1431919.45 rows=20057645 width=16)
>                                       ->  Parallel Hash  (cost=5518353.72..5518353.72 rows=3491156 width=42)
>                                             ->  Parallel Hash Right Join  (cost=3729209.40..5518353.72 rows=3491156 width=42)
>                                                   Hash Cond: (rpl_rec_tro.input_sequence = r.input_sequence)
>                                                   ->  Parallel Seq Scan on tbl_outputrequest rpl_rec_tro  (cost=0.00..1431919.45 rows=20057645 width=16)
>                                                   ->  Parallel Hash  (cost=3658294.95..3658294.95 rows=3491156 width=34)
>                                                         ->  Parallel Hash Right Join  (cost=1883503.35..3658294.95 rows=3491156 width=34)
>                                                               Hash Cond: (r.originalrequest_id = t.input_sequence)
>                                                               ->  Parallel Seq Scan on tbl_inputrequest r  (cost=0.00..1739752.66 rows=13348166 width=16)
>                                                               ->  Parallel Hash  (cost=1839863.91..1839863.91 rows=3491156 width=26)
>                                                                     ->  Parallel Seq Scan on tbl_inputrequest t  (cost=0.00..1839863.91 rows=3491156 width=26)
>                                                                           Filter: ((receive_time < '2024-05-17 00:00:00'::timestamp without time zone) AND (input_sequence < '202406020168279904'::bigint) AND ((msg_status)::text = ANY ('{COMPLETED,REJECTED}'::text[])))
>
> The query cost is high and it returns a significant number of rows. However
> it should not consume 64+ GB of RAM with the default enable_parallel_hash = 'on'
> when my shared_buffers is 8 GB. The temporary fix I've implemented to get the
> system working is a change in the application configuration so that the
> timestamp filter selects fewer rows (3x less), together with setting
> enable_parallel_hash = 'off'. PostgreSQL service memory usage grows and stays
> over 20 GB even with this setting. I'd like to find out exactly what causes the
> high memory usage, as we have other projects using PostgreSQL and they are
> scheduled for upgrade from v12.
> My test.sql looks like this (the application uses prepared statements; the two
> SET operations are there to revert, for the current session, the already
> implemented fixes):
>
> prepare my_query (timestamp,bigint) as
> SELECT t.input_sequence, rec_tro.output_sequence, r.input_sequence,
>        rpl_rec_tro.output_sequence, rpl_snd_tro.output_sequence,
>        snd_tro.output_sequence, t.msg_type
> FROM inputrequest t
> LEFT JOIN outputrequest rec_tro ON rec_tro.input_sequence = t.input_sequence
> LEFT JOIN inputrequest r ON r.originalRequest_id = t.input_sequence
> LEFT JOIN outputrequest rpl_rec_tro ON rpl_rec_tro.input_sequence = r.input_sequence
> LEFT JOIN outputrequest rpl_snd_tro ON rpl_snd_tro.reply_input_sequence = r.input_sequence
> LEFT JOIN outputrequest snd_tro ON snd_tro.reply_input_sequence = t.input_sequence
> WHERE t.receive_time < $1 AND t.input_sequence < $2
>   AND t.msg_status IN ('COMPLETED', 'REJECTED')
> ORDER BY t.msg_status DESC, t.input_sequence ;
>
> EXPLAIN EXECUTE my_query('2024-05-17 00:00:00', 202406020168279904);

I have an explanation for what I suspected was a memory leak. It seems that systemd reports cached memory, the relevant par
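Two quick checks, sketched against the plan and settings quoted above; the arithmetic assumes the PostgreSQL 15+ default hash_mem_multiplier of 2.0, and the EXECUTE assumes the PREPARE from test.sql has already run in the same session:

-- back-of-the-envelope budget for the five Parallel Hash nodes in the plan:
--   per hash table, per participating process: ~100 MB work_mem * 2.0  = ~200 MB
--   5 hash nodes * (1 leader + 4 workers) * ~200 MB                    = ~5 GB
-- plus shared_buffers (8 GB); the 35+ GB actually observed is far beyond that
SHOW work_mem;
SHOW hash_mem_multiplier;

-- try the mitigation for this session only, without touching postgresql.conf
BEGIN;
SET LOCAL enable_parallel_hash = off;
EXPLAIN (ANALYZE, BUFFERS)
    EXECUTE my_query('2024-05-17 00:00:00', 202406020168279904);
ROLLBACK;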
Plan regression from 12 to 13
Hi,

I have a query (generated by RequestTracker) that has regressed after upgrading from PostgreSQL 12 to 13. The strange thing is that decreasing effective_cache_size causes PostgreSQL 13 to generate an acceptable plan.

My setup is: 1 VM, 4 cores, 16 GB RAM (the server has around 10 GB for filesystem cache), shared_buffers 2.4 GB.

With effective_cache_size = '8GB', I get the following plan: https://explain.depesz.com/s/lIcV (the run time is after several tries, with most of the data in memory; the initial run reading from disk is around 10 times slower).

If I decrease effective_cache_size to '2GB' or less, I get a much better plan: https://explain.depesz.com/s/Z6C7W

The old VM running PostgreSQL 12 (with a somewhat smaller database) always generates a good plan, regardless of effective_cache_size: https://explain.depesz.com/s/iawn

Are there any defaults that have changed from 12 to 13 that would explain the different plans?

Thanks,
Radu
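Both plans can be reproduced from a single session without editing postgresql.conf; a sketch (the stand-in SELECT 1 has to be replaced with the actual RequestTracker query, which here is only visible through the depesz links):

BEGIN;
SET LOCAL effective_cache_size = '8GB';   -- setting that produces the bad plan
EXPLAIN (ANALYZE, BUFFERS) SELECT 1;      -- stand-in for the RequestTracker query
SET LOCAL effective_cache_size = '2GB';   -- setting that produces the good plan
EXPLAIN (ANALYZE, BUFFERS) SELECT 1;      -- stand-in for the RequestTracker query
ROLLBACK;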