Re: Performance Issue with Hash Partition Query Execution in PostgreSQL 16
Sorry, it was a typo. The bind variable is bigint only.

Thanks

On Fri, 8 Nov, 2024, 7:09 pm David Mullineux, wrote:

> Just spotted a potential problem. The indexed column is a bigint. Are
> you, in your prepared statement, passing a string or a bigint?
> I notice your plan is doing an implicit type conversion when you run it
> manually.
> Sometimes the wrong type will make it not use the index.
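(For illustration, a minimal sketch of the type-mismatch effect described above, using a hypothetical hash-partitioned table t with a bigint key; the thread does not show the real table or the driver's bind type, so a numeric parameter stands in here for a mismatched bind type, since it is accepted by the parser but defeats both the bigint index and plan-time partition pruning.)

    CREATE TABLE t (id bigint, payload text) PARTITION BY HASH (id);
    CREATE TABLE t_p0 PARTITION OF t FOR VALUES WITH (MODULUS 4, REMAINDER 0);
    CREATE TABLE t_p1 PARTITION OF t FOR VALUES WITH (MODULUS 4, REMAINDER 1);
    CREATE TABLE t_p2 PARTITION OF t FOR VALUES WITH (MODULUS 4, REMAINDER 2);
    CREATE TABLE t_p3 PARTITION OF t FOR VALUES WITH (MODULUS 4, REMAINDER 3);
    CREATE INDEX ON t (id);

    -- Parameter bound as bigint: the plan prunes to a single partition and
    -- can use the index on id.
    PREPARE q_big (bigint) AS SELECT * FROM t WHERE id = $1;
    EXPLAIN EXECUTE q_big (42);

    -- Parameter bound as numeric: the column is implicitly cast, the plan
    -- shows a filter like ((id)::numeric = $1), and every partition is
    -- scanned instead of using the index.
    PREPARE q_num (numeric) AS SELECT * FROM t WHERE id = $1;
    EXPLAIN EXECUTE q_num (42);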
Re: Performance Issue with Hash Partition Query Execution in PostgreSQL 16
Hi,

Thanks for the suggestions.

Two more observations:

1) No sequential scan was noticed in pg_stat_user_tables (hopefully the stats are accurate in Postgres 16). If parameter sniffing were happening, the chance of falling back to a sequential scan would be higher, right?

2) No blocking or I/O issues during that time.

3) Even with the LIMIT clause, if the query touched all partitions it should still have completed in milliseconds, as this is just one record.

4) We cannot enable auto_explain in production, as it is expensive and with high TPS we may face latency issues, and the issue cannot be reproduced in a lower environment (it is happening in about one case out of a million).

This looks like a puzzle to us; if anyone has experienced something similar, please share your experience.

Regards,
Ravi

On Thu, 7 Nov, 2024, 3:41 am David Mullineux, wrote:

> It might be worth eliminating the use of cached plans here. Is your app
> using prepared statements at all?
> The point is that if the optimizer sees the same prepared query five
> times, it locks in the plan that it found at that time. This is a good
> trade-off, as it avoids costly planning time for repetitive queries. But
> if you are querying manually, a custom plan will be generated anew.
> A quick ANALYZE of the table should reset the stats and invalidate any
> cached plans.
> This may not be your problem; it's just worth eliminating it from the
> list of potential causes.
>
> On Wed, 6 Nov 2024, 17:14 Ramakrishna m, wrote:
>
>> Hi Team,
>>
>> One of the queries, which retrieves a single record from a table with
>> 16 hash partitions, is taking more than 10 seconds to execute. In
>> contrast, when we run the same query manually, it completes within
>> milliseconds. This issue is causing exhaustion of the application
>> pools. Are there any known bugs in PostgreSQL 16 hash partitions?
>> Please find the attached log, table, and execution plan.
>>
>> Size of each partition : 300 GB
>> Index size : 12 GB
>>
>> Postgres version : 16.x
>> shared_buffers : 75 GB
>> effective_cache_size : 175 GB
>> work_mem : 4 MB
>> max_connections : 3000
>>
>> OS : Ubuntu 22.04
>> RAM : 384 GB
>> CPU : 64
>>
>> Please let us know if you need any further information or if there are
>> additional details required.
>>
>> Regards,
>> Ram.
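(As a concrete way to check observation 1 above, the per-partition counters can be read directly; the partition name pattern below is a placeholder, since the real table names are not given in the thread. PostgreSQL 16 also records when each kind of scan last happened.)

    SELECT relname,
           seq_scan,        -- sequential scans started on this partition
           last_seq_scan,   -- new in PostgreSQL 16: time of the most recent one
           idx_scan,
           last_idx_scan
    FROM   pg_stat_user_tables
    WHERE  relname LIKE 'orders_p%'   -- placeholder partition-name pattern
    ORDER  BY relname;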
Re: Performance Issue with Hash Partition Query Execution in PostgreSQL 16
Thanks for the advice! I am planning to set it at the session level, but before that, one more observation: another table, with similar hash partitioning, has the same issue. I have scheduled a manual ANALYZE for all of the parent hash tables (so that all the stats are updated together). Since this change I have not noticed the issue; I am not sure whether it actually addressed the problem, so I am monitoring, and if it does not work I will force a custom plan at the session level.

I have seen parameter sniffing regularly in SQL Server, but I have never experienced it in Postgres. I am still wondering whether this is sniffing or not, as I did not notice any sequential scan in the stats.

Best,

On Sat, 9 Nov, 2024, 3:40 pm David Mullineux, wrote:

> Thanks for the correction. At this point I would be trying to modify
> plan_cache_mode for the session which uses the bind variable: alter it
> so that plan_cache_mode = force_custom_plan.
> One hypothesis is that a bad plan got cached for that SQL pattern.
> Obviously, when you run it *manually* you are always getting a *custom*
> plan, as it is not a prepared statement.
>
> On Sat, 9 Nov 2024, 03:46 ravi k, wrote:
>
>> Sorry, it was a typo. The bind variable is bigint only.
>>
>> Thanks
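(A sketch of the session-level setting discussed above, reusing the hypothetical table t from the earlier sketch; the role name is a placeholder. The repeated EXECUTE shows how the switch to a cached generic plan can be reproduced by hand.)

    -- Force custom plans for the current session only:
    SET plan_cache_mode = force_custom_plan;

    -- Or persist it for the application role (placeholder name):
    ALTER ROLE app_user SET plan_cache_mode = force_custom_plan;

    -- With the default plan_cache_mode = auto, run a prepared query several
    -- times; after five executions the planner compares costs and may switch
    -- to the cached generic plan, at which point the condition is displayed
    -- with $1 instead of the literal value.
    PREPARE q (bigint) AS SELECT * FROM t WHERE id = $1;
    EXPLAIN (ANALYZE) EXECUTE q (42);   -- repeat at least six times

    -- Manually analyzing the partitioned parent, as mentioned above, also
    -- recurses into the leaf partitions and refreshes their statistics:
    ANALYZE t;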
Re: Commit Latency
Thanks for the suggestions! It looks like the issue is happening at the time of WAL file creation. Is turning wal_init_zero off a good option?

Best

On Wed, 5 Feb, 2025, 9:07 pm Álvaro Herrera, wrote:

> Hello
>
> On 2025-Feb-05, Ramakrishna m wrote:
>
> > I have a system handling *300 TPS*, with resource usage *below 10%*.
> > However, I'm noticing *commit latency of around 200ms* for *1% of
> > transactions*, occasionally spiking to *1 second*. Since there is no
> > significant *I/O pressure*, I'm trying to identify what else might be
> > causing this in *PostgreSQL 16*.
>
> max_connections=8000 doesn't sound great -- how many of those are
> active, typically, and how many are idle-in-transaction? And you have
> autovacuum_naptime=5s ... which sounds rather dubious. Either somebody
> with great expertise configured this very carefully, or the settings are
> somewhat randomly chosen with little or no expert oversight. Do you
> have monitoring on the amount of bloat on these databases? Maybe you
> should consider connection pooling and limit the number that are
> active, for starters.
>
> Maybe have a look at whether pg_wait_sampling can give you more clues.
> Some basic bloat monitoring is a prerequisite to any further performance
> tweaking anyhow.
>
> --
> Álvaro Herrera        Breisgau, Deutschland        https://www.EnterpriseDB.com/
> "Maybe there's lots of data loss but the records of data loss are also
> lost." (Lincoln Yeoh)
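(A sketch of how the WAL-creation suspicion might be confirmed before changing anything; the wait-event check uses the core statistics views, while pg_wait_sampling, as suggested above, would give a proper history. Showing the change via ALTER SYSTEM is an assumption about how it would be rolled out; per the documentation, skipping the zero-fill mainly pays off on copy-on-write filesystems where it is wasted work.)

    -- Sample active backends during a latency spike and look for WAL
    -- initialization waits:
    SELECT wait_event, count(*)
    FROM   pg_stat_activity
    WHERE  wait_event_type = 'IO'
      AND  wait_event IN ('WALInitWrite', 'WALInitSync', 'WALWrite', 'WALSync')
    GROUP  BY wait_event;

    -- If WALInitWrite / WALInitSync dominate, skipping the zero-fill of new
    -- WAL segments can be tried:
    ALTER SYSTEM SET wal_init_zero = off;
    SELECT pg_reload_conf();
    SHOW wal_init_zero;   -- verify whether the new value has taken effect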