Vacuum full connection exhaustion
Hey folks,

I noticed something weird, and not sure if this is the expected behaviour or not in PostgreSQL.

So I am running Benchbase (a benchmark framework) with 50 terminals (50 concurrent connections). There are 2-3 additional connections, one for a postgres-exporter container for example.

So far so good, and with a `max_connections` at 100 there is no problem. What happens is that if I execute manually `VACUUM FULL` the connections are exhausted.

Also tried this with 150 `max_connections` to see if it just “doubles” the current connections, but as it turned out, it still exhausted all the connections until it reached `max_connections`.

This was cross-checked, as the postgres-exporter could not connect, and I manually was not allowed to connect with `psql`.

Is this expected or is this a bug?

postgres-exporter logs:

```
sql: error: connection to server on socket "/var/run/postgresql/.s.PGSQL.5432" failed: FATAL: sorry, too many clients already
```
Re: Vacuum full connection exhaustion
On Thu, Aug 8, 2024 at 5:18 AM Costa Alexoglou wrote:
> Hey folks,
>
> I noticed something weird, and not sure if this is the expected behaviour
> or not in PostgreSQL.
>
> So I am running Benchbase (a benchmark framework) with 50 terminals (50
> concurrent connections).
> There are 2-3 additional connections, one for a postgres-exporter
> container for example.
>
> So far so good, and with a `max_connections` at 100 there is no problem.
> What happens is that if I execute manually `VACUUM FULL`

Off-topic, but... *WHY??* It almost certainly does not do what you think it does. Especially if it's just "VACUUM FULL;".

> the connections are exhausted.

Connect to the relevant database and run this query. Don't disconnect, and keep running it over and over again as you run the "VACUUM FULL;". That'll tell you exactly what happens.

select pid
,datname as db
,application_name as app_name
,case when client_hostname is not null then client_hostname else client_addr::text end AS client_name
,usename
,to_char((EXTRACT(epoch FROM now() - backend_start))/60.0, '99,999.00') as backend_min
,to_char(query_start, 'YYYY-MM-DD HH24:MI:SS.MS') as "Query Start"
,to_char((EXTRACT(epoch FROM now() - query_start))/60.0, '99,999.00') as qry_min
,to_char(xact_start, 'YYYY-MM-DD HH24:MI:SS.MS') as "Txn Start"
,to_char((EXTRACT(epoch FROM now() - xact_start)/60.0), '999.00') as txn_min
,state
,query
from pg_stat_activity
WHERE pid != pg_backend_pid()
order by 6 desc;

> Also tried this with 150 `max_connections` to see if it just “doubles” the
> current connections, but as it turned out, it still exhausted all the
> connections until it reached `max_connections`.

Double it again?

> This was cross-checked, as the postgres-exporter could not connect, and I
> manually was not allowed to connect with `psql`.
>
> Is this expected or is this a bug?

Depends on what you set these to:
autovacuum_max_workers
max_parallel_maintenance_workers
max_parallel_workers
max_parallel_workers_per_gather
max_worker_processes

--
Death to America, and butter sauce! Iraq lobster...
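As a quick way to check those, a minimal query against pg_settings (a standard catalog view; the names below are stock PostgreSQL GUCs, nothing specific to this setup):

```sql
-- Read back the relevant settings in one place.
SELECT name, setting
FROM pg_settings
WHERE name IN ('autovacuum_max_workers',
               'max_parallel_maintenance_workers',
               'max_parallel_workers',
               'max_parallel_workers_per_gather',
               'max_worker_processes');
```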
Re: Vacuum full connection exhaustion
On Thu, 8 Aug 2024 at 11:18, Costa Alexoglou wrote:
...
> So I am running Benchbase (a benchmark framework) with 50 terminals (50
> concurrent connections).
> There are 2-3 additional connections, one for a postgres-exporter container
> for example.
...
> So far so good, and with a `max_connections` at 100 there is no problem. What
> happens is that if I execute manually `VACUUM FULL` the connections are
> exhausted.
> Also tried this with 150 `max_connections` to see if it just “doubles” the
> current connections, but as it turned out, it still exhausted all the
> connections until it reached `max_connections`.
> This was cross-checked, as the postgres-exporter could not connect, and I
> manually was not allowed to connect with `psql`.

Have you tried to check where the connections are coming from and what they are doing?

Apart from the max-parallel-worker stuff already commented on by Ron, in a scenario with a long-lived locking process (vacuum full) combined with potentially aggressive connecting (a benchmark tool), I would verify the benchmark tool is not timing out and disconnecting improperly, leaving connections hung up.

Francisco Olarte.
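One way to do that check, sketched against pg_stat_activity (the columns are standard; the particular grouping is only a suggestion):

```sql
-- Summarise client connections by origin and state while the benchmark and
-- the VACUUM FULL are running, to spot piled-up or abandoned sessions.
SELECT application_name,
       client_addr,
       state,
       wait_event_type,
       count(*) AS connections
FROM pg_stat_activity
WHERE backend_type = 'client backend'
GROUP BY 1, 2, 3, 4
ORDER BY connections DESC;
```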
Re: Vacuum full connection exhaustion
> On Aug 7, 2024, at 10:34, Costa Alexoglou wrote:
>
> Hey folks,
>
> I noticed something weird, and not sure if this is the expected behaviour or
> not in PostgreSQL.
>
> So I am running Benchbase (a benchmark framework) with 50 terminals (50
> concurrent connections).
> There are 2-3 additional connections, one for a postgres-exporter container
> for example.
>
> So far so good, and with a `max_connections` at 100 there is no problem. What
> happens is that if I execute manually `VACUUM FULL` the connections are
> exhausted.

VACUUM FULL takes an exclusive lock on the table that it is operating on. It's possible that a connection becomes blocked on that exclusive lock waiting for the VACUUM FULL to finish, the application sees the connection stopped and fires up another one (this is common in container-based applications), that one blocks... until all of the connections are full of queries waiting on that VACUUM FULL.
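A small sketch of how to see that pile-up directly (pg_blocking_pids() and pg_stat_activity are standard; truncating the query text to 60 characters is an arbitrary choice):

```sql
-- While the VACUUM FULL runs, list backends stuck waiting on a lock and
-- which PIDs are blocking them.
SELECT pid,
       pg_blocking_pids(pid) AS blocked_by,
       wait_event_type,
       wait_event,
       state,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE wait_event_type = 'Lock';
```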
Re: Destination Table - Condition Amount 0
On 8/8/24 09:35, Anthony Apollis wrote:
> I need your assistance with an ETL process that runs every six months.
> Currently, we are in Calendar Year/FY 2025, which started in July 2024.
>
> The issue is with the "Condition Amount" for FY 2025. Data comes through
> correctly until the "Insert Data Into Task" step (please see the attached
> screenshot). However, it appears that the code responsible for the "Update
> PD Credit" step is not functioning as expected.
>
> image.png
>
> The destination table [Turkey - NCD Revenue Reporting Model_Table_Model]
> is receiving data for FY 2025, but the "*Condition Amount*" column
> contains zeros for 2025. Please see the attached sample data for reference.
>
> To help diagnose the issue, I have included the following:
>
> DDL for all three tables.
>
> Insert/Update scripts for the two tasks as depicted in the screenshot above.
>
> Sample raw data for the two tables.
>
> SSIS can also be found attached for better understanding
>
> Sample data for the "Insert Data Into

Honestly I have not looked through all the files and probably won't. I did find this though:

CASE
    WHEN NCD_EXCL."Order Quantity" IS NULL OR NCD_EXCL."Order Quantity" LIKE 'Excluded' THEN
        0
    WHEN NCD_EXCL."Order Quantity" LIKE 'NCD Valid' THEN
        NCD_AMOUNT."Condition Amount"
    ELSE
        0

which looks suspicious in that it can turn a "Condition Amount" into 0.

I would start by running the SELECT that this is part of and seeing what it does to the data and if that is what you want it to do.

--
Adrian Klaver
adrian.kla...@aklaver.com
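As a starting point, a hypothetical check along those lines (the table name NCD_EXCL comes from the posted snippet; the fiscal-year column and its value are assumptions that would need adjusting to the real schema):

```sql
-- Hypothetical: see which "Order Quantity" labels actually occur for FY25,
-- since anything other than 'NCD Valid' falls through to 0 in the CASE above.
SELECT "Order Quantity", count(*) AS row_count
FROM NCD_EXCL
WHERE "Fiscal Year" = 'FY25'   -- assumed column and value; adjust to the schema
GROUP BY "Order Quantity"
ORDER BY count(*) DESC;
```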
Debugging set up for Postgres?
Hi all,

We've spent about 12 hours trying to set up a working debugger for psql functions without success, trying both PGAdmin 4 v8.3 and DBeaver.

The online searches turn up multiple instructions for Postgres 12 or older; however, we are using 16. Our extensions include PostGIS and supporting extensions like SFCGAL.

We can get the PGAdmin debugger to stop at a breakpoint, but as soon as it exits a function, PGAdmin crashes ("error when executing continue") and has to be restarted.

We also tried direct debugging, but the inputs to our functions are often json objects, and there isn't a way to paste such an object into the Value field.

My question is: is there a way to effectively step-debug psql functions in Postgres 16, and how?

Thanks for reading,

James
Re: Debugging set up for Postgres?
Hi

On Thu, Aug 8, 2024 at 19:16, James Creasy wrote:
> Hi all,
>
> We've spent about 12 hours trying to set up a working debugger for psql
> functions without success, trying both PGAdmin 4 v8.3 and DBeaver.
>
> The online searches turn up multiple instructions for Postgres 12 or
> older, however we are using 16. Our extensions include PostGIS and
> supporting extensions like SFCGAL.
>
> We can get the PGAdmin debugger to stop at a breakpoint, but as soon as it
> exits a function, PGAdmin crashes ("error when executing continue") and has
> to be restarted.
>
> We also tried direct debugging, but the inputs to our functions are often
> json objects, and there isn't a way to paste such an object into the Value
> field.
>
> My question is: is there a way to effectively step debug psql functions in
> Postgres 16 and how?

Do you mean plpgsql functions? There should not be any change that I know of - you need to use the pldebugger extension, and then you can use the plpgsql debugger.

Maybe you are using more than one extension that uses the plpgsql debug API at the same time - plprofiler, or plpgsql_check. This is not supported (plpgsql_check is an exception, but it should be loaded last).

Regards

Pavel
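For reference, a minimal setup sketch for the pldebugger route (assuming the pldebugger package is installed on the server; the library and extension names below are the ones the project ships, but verify them against your packages):

```sql
-- In postgresql.conf (requires a server restart):
--     shared_preload_libraries = 'plugin_debugger'
-- Then, in the database holding the functions to debug:
CREATE EXTENSION IF NOT EXISTS pldbgapi;
```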
Re: Destination Table - Condition Amount 0
The same code brings in values for FY24, FY23, etc. I don't understand why FY25's values are 0.

On Thu, 8 Aug 2024 at 18:56, Adrian Klaver wrote:
> On 8/8/24 09:35, Anthony Apollis wrote:
> > I need your assistance with an ETL process that runs every six months.
> > Currently, we are in Calendar Year/FY 2025, which started in July 2024.
> >
> > The issue is with the "Condition Amount" for FY 2025. Data comes through
> > correctly until the "Insert Data Into Task" step (please see the
> > attached screenshot). However, it appears that the code responsible for
> > the "Update PD Credit" step is not functioning as expected.
> >
> > image.png
> >
> > The destination table [Turkey - NCD Revenue Reporting Model_Table_Model]
> > is receiving data for FY 2025, but the "*Condition Amount*" column
> > contains zeros for 2025. Please see the attached sample data for reference.
> >
> > To help diagnose the issue, I have included the following:
> >
> > DDL for all three tables.
> >
> > Insert/Update scripts for the two tasks as depicted in the screenshot above.
> >
> > Sample raw data for the two tables.
> >
> > SSIS can also be found attached for better understanding
> >
> > Sample data for the "Insert Data Into
>
> Honestly I have not looked through all the files and probably won't. I
> did find this though:
>
> CASE
>     WHEN NCD_EXCL."Order Quantity" IS NULL OR NCD_EXCL."Order Quantity" LIKE 'Excluded' THEN
>         0
>     WHEN NCD_EXCL."Order Quantity" LIKE 'NCD Valid' THEN
>         NCD_AMOUNT."Condition Amount"
>     ELSE
>         0
>
> which looks suspicious in that it can turn a "Condition Amount" into 0.
>
> I would start by running the SELECT that this is part of and seeing what
> it does to the data and if that is what you want it to do.
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
Re: Column type modification in big tables
On Thu, Aug 8, 2024 at 1:06 AM sud wrote:
> On Wed, Aug 7, 2024 at 5:00 PM Lok P wrote:
>> On Wed, Aug 7, 2024 at 4:51 PM sud wrote:
>>>
>>> Others may correct me, but I think if you don't have the FK defined on
>>> these columns you can do the below.
>>>
>>> -- Alter table add column, which will be very fast (within seconds) as it
>>> -- will just add it to the data dictionary.
>>> ALTER TABLE tab1 ADD COLUMN new_column1 NUMERIC(3), ADD COLUMN new_column2 varchar(3);
>>>
>>> -- Back-populate the data partition-wise and commit, if it's really needed.
>>> UPDATE tab1_part1 SET new_column1 = CAST(old_column1 AS NUMERIC(3)), new_column2 = CAST(old_column2 AS varchar(3));
>>> commit;
>>> UPDATE tab1_part2 SET new_column1 = CAST(old_column1 AS NUMERIC(3)), new_column2 = CAST(old_column2 AS varchar(3));
>>> commit;
>>> UPDATE tab1_part3 SET new_column1 = CAST(old_column1 AS NUMERIC(3)), new_column2 = CAST(old_column2 AS varchar(3));
>>> commit;
>>> ...
>>>
>>> -- Alter table drop old columns, which will be very fast (within seconds)
>>> -- as it will just drop them from the data dictionary.
>>> ALTER TABLE your_table DROP COLUMN old_column1, DROP COLUMN old_column2;
>>
>> Thank you so much.
>>
>> I understand this will be the fastest possible way to achieve the column
>> modification.
>>
>> But talking about the dropped columns which will be sitting in the table
>> and consuming storage space: is it fine to leave them as is, or will
>> autovacuum remove the column values behind the scenes? And in any case,
>> once those partitions are purged they will be gone by default. Is this
>> understanding correct?
>>
>> And also, will this have any impact on the partition maintenance which is
>> currently done by pg_partman, given the template table is now different
>> internally (though not from the outside)? Will it cause a conflict because
>> of those dropped columns from the main table?
>
> I think leaving the table as is after dropping the columns will be fine for
> you, because your regular partition maintenance/drop will slowly purge the
> historical partitions and eventually they will be removed. But if you
> update those new columns with the old column values, then autovacuum should
> also take care of removing the rows with the older column values (which are
> actually dead).
>
> Not sure if pg_partman will cause any issue, as the table now has the
> column data type/length changed. Others may confirm.

Thank you so much.

Can anybody suggest any other possible way here? We also need the existing values updated to the new columns using an UPDATE command (even if it updates one partition at a time). As I see it, almost all the values in the column are not null, which means it will update almost ~5 billion rows across all the partitions. So my question is: is there any parameter (like work_mem, maintenance_work_mem, etc.) which we can set to make this update faster, or any other way to get this column altered apart from this method?
Re: Destination Table - Condition Amount 0
> On 8 Aug 2024, at 20:07, Anthony Apollis wrote:
>
> The same code brings in values for FY24, FY23, etc. I don't understand why
> FY25's values are 0.

If you use the same code for FY24, then either there's some filter being applied somewhere that excludes FY25 (and probably beyond), or something in your data changed. My bet is on the latter.

For example, in FY25 the value of NCD_EXCL."Order Quantity" is 'NCD Valid FY25', which doesn't match your LIKE expression. Even something like a trailing space in the value could be enough.

Alban Hertroys
--
There is always an exception to always.
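A small illustration of that point (made-up values, just to show how an un-wildcarded LIKE behaves):

```sql
-- Without % wildcards, LIKE is effectively an exact match, so a changed or
-- padded label no longer hits the 'NCD Valid' branch of the CASE.
SELECT 'NCD Valid'      LIKE 'NCD Valid' AS exact_match,     -- true
       'NCD Valid FY25' LIKE 'NCD Valid' AS suffixed_label,  -- false -> ELSE 0
       'NCD Valid '     LIKE 'NCD Valid' AS trailing_space;  -- false -> ELSE 0
```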
Re: Column type modification in big tables
> On 8 Aug 2024, at 20:38, Lok P wrote:
>
> Thank you so much.
>
> Can anybody suggest any other possible way here? We also need the existing
> values updated to the new columns using an UPDATE command (even if it
> updates one partition at a time). As I see it, almost all the values in the
> column are not null, which means it will update almost ~5 billion rows
> across all the partitions. So my question is: is there any parameter (like
> work_mem, maintenance_work_mem, etc.) which we can set to make this update
> faster, or any other way to get this column altered apart from this method?

Just a wild thought here that I'm currently not able to check… Can you add views as partitions? They would be read-only of course, but that would allow you to cast the columns in your original partitions to the new format, while you can add any new partitions in the new format.

I suspect it's not allowed, but perhaps worth a try.

Alban Hertroys
--
There is always an exception to always.
Re: Destination Table - Condition Amount 0
On 8/8/24 11:07, Anthony Apollis wrote:
> The same code brings in values for FY24, FY23, etc. I don't understand why
> FY25's values are 0.

Because something changed. You will need to go through the process step by step and verify that the code is current with what is in the FY25 data.

--
Adrian Klaver
adrian.kla...@aklaver.com
Getting specific partition from the partition name
Hi,

We are using Postgres version 15.4. We have a range-partitioned table, and the partition naming convention generated by pg_partman is something like "<table_name>_p_YYYY_MM_DD".

We have a requirement to extract specific partitions ordered by date and also to do some operations on that specific date. But I am struggling and it's not working as expected. I tried something like the below, but it's not working. Can somebody guide me here, please?

to_date( substring('table_part_p2024_08_08' from '_p(\d{4})_(\d{2})_(\d{2})'), 'YYYY_MM_DD' ) < current_date

Or is there any ready-made data dictionary which will give us the order of the partitions by date, so we can get hold of the specific nth partition in that table?

Regards
Veem
Re: Column type modification in big tables
On Thu, Aug 8, 2024 at 2:39 PM Lok P wrote:
> Can anybody suggest any other possible way here?

Sure - how about not changing the column type at all?

> one of the columns from varchar(20) to varchar(2)

ALTER TABLE foobar ADD CONSTRAINT twocharplease CHECK (length(mycol) <= 2) NOT VALID;

> one of the columns from Number(10,2) to Numeric(8,2)

ALTER TABLE foobar ADD CONSTRAINT eightprecision CHECK (mycol <= 10^8) NOT VALID;

> two of the columns from varchar(20) to numeric(3)

This one is trickier, as we don't know the contents, nor why it is going to numeric(3) - not a terribly useful data type, but let's roll with it and assume the stuff in the varchar is a number of some sort, and that we don't allow nulls:

ALTER TABLE foobar ADD CONSTRAINT onekorless CHECK (mycol::numeric(3) is not null) NOT VALID;

You probably want to check on the validity of the existing rows: see the docs on VALIDATE CONSTRAINT here:

https://www.postgresql.org/docs/current/sql-altertable.html

Cheers,
Greg
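For completeness, a sketch of that follow-up step, using one of the constraint names from the example above:

```sql
-- Validate later, at a convenient time: this scans the table but takes only
-- a relatively light lock (SHARE UPDATE EXCLUSIVE) instead of rewriting it.
ALTER TABLE foobar VALIDATE CONSTRAINT twocharplease;
```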
Re: Getting specific partition from the partition name
YYYY_MM_DD is already set up for sorting, so just do:

SELECT table_name FROM information_schema.tables WHERE table_name ~ 'table_part_p' ORDER BY 1 DESC;

If you need to grab the numbers:

SELECT substring('table_part_p2022_03_04' from '([\d_]+)$');

Cheers,
Greg
Re: Vacuum full connection exhaustion
On Fri, 9 Aug 2024 at 02:12, Christophe Pettus wrote:
> VACUUM FULL takes an exclusive lock on the table that it is operating on.
> It's possible that a connection becomes blocked on that exclusive lock
> waiting for the VACUUM FULL to finish, the application sees the connection
> stopped and fires up another one (this is common in container-based
> applications), that one blocks... until all of the connections are full of
> queries waiting on that VACUUM FULL.

I also imagine this is the cause. One way to test would be to do:

BEGIN;
LOCK TABLE <table name>;

and see if the connections pile up in a similar way to when the VACUUM FULL command is used.

David
Re: Vacuum full connection exhaustion
On Thu, Aug 8, 2024 at 10:12 AM Christophe Pettus wrote:
> > On Aug 7, 2024, at 10:34, Costa Alexoglou wrote:
> >
> > Hey folks,
> >
> > I noticed something weird, and not sure if this is the expected
> > behaviour or not in PostgreSQL.
> >
> > So I am running Benchbase (a benchmark framework) with 50 terminals (50
> > concurrent connections).
> > There are 2-3 additional connections, one for a postgres-exporter
> > container for example.
> >
> > So far so good, and with a `max_connections` at 100 there is no problem.
> > What happens is that if I execute manually `VACUUM FULL` the connections
> > are exhausted.
>
> VACUUM FULL takes an exclusive lock on the table that it is operating on.
> It's possible that a connection becomes blocked on that exclusive lock
> waiting for the VACUUM FULL to finish, the application sees the connection
> stopped and fires up another one (this is common in container-based
> applications), that one blocks... until all of the connections are full of
> queries waiting on that VACUUM FULL.

"I see a lock, so let's cause another one!" That's crazy.

--
Death to America, and butter sauce. Iraq lobster!
Re: Vacuum full connection exhaustion
> On Aug 8, 2024, at 21:15, Ron Johnson wrote:
>
> "I see a lock, so let's cause another one!" That's crazy.

It's more "Oh, look, I need a connection to service this web request, but my pool is empty, so I'll just fire up a new connection to the server," lather, rinse, repeat. Pretty common these days, sadly.
Re: Getting specific partition from the partition name
On Thu, Aug 8, 2024 at 4:46 PM Greg Sabino Mullane wrote:
> YYYY_MM_DD is already set up for sorting, so just do:
>
> SELECT table_name FROM information_schema.tables WHERE table_name ~
> 'table_part_p' ORDER BY 1 DESC;
>
> If you need to grab the numbers:
>
> SELECT substring('table_part_p2022_03_04' from '([\d_]+)$');

What if the partitions aren't all rationally named? There *must* be a pg_* table out there which contains the partition boundaries...

--
Death to America, and butter sauce. Iraq lobster!
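The catalogs do hold the bounds. A minimal sketch, assuming the parent table is named table_part (pg_inherits, pg_class and pg_get_expr() are standard catalog objects):

```sql
-- List the partitions of a parent table together with their declared bounds,
-- independent of how the partitions happen to be named.
SELECT c.relname AS partition_name,
       pg_get_expr(c.relpartbound, c.oid) AS bounds
FROM pg_inherits i
JOIN pg_class c ON c.oid = i.inhrelid
WHERE i.inhparent = 'table_part'::regclass   -- assumed parent table name
ORDER BY 1;
```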