query pg_stat_ssl hang 100%cpu

2023-09-05 Thread James Pang (chaolpan)
Hi,
PG v14.8 on RHEL 8, with SSL not enabled in this database. We have many client
sessions that check their own SSL state with the query below. All of the other
sessions finish very quickly, but one session has been stuck at 100% CPU for
tens of hours, and even pg_terminate_backend() does not stop it. It looks abnormal.

   select ssl from pg_stat_ssl where pid=pg_backend_pid();

testdb=# select pid,usename,application_name,query_start,xact_start,state_change,wait_event_type,state,query
         from pg_stat_activity where pid=1245344;
   pid   | usename |    application_name    |         query_start          |          xact_start           |         state_change         | wait_event_type | state  |                          query
---------+---------+------------------------+------------------------------+-------------------------------+------------------------------+-----------------+--------+---------------------------------------------------------
 1245344 | test    | PostgreSQL JDBC Driver | 2023-09-03 02:36:23.40238+00 | 2023-09-03 02:36:23.402331+00 | 2023-09-03 02:36:23.40238+00 |                 | active | select ssl from pg_stat_ssl where pid=pg_backend_pid()
(1 row)

testdb=# select pid,usename,application_name,query_start,xact_start,state_change,wait_event_type,state,query
         from pg_stat_activity where pid=1245344;
   pid   | usename |    application_name    |         query_start          |          xact_start           |         state_change         | wait_event_type | state  |                          query
---------+---------+------------------------+------------------------------+-------------------------------+------------------------------+-----------------+--------+---------------------------------------------------------
 1245344 | test    | PostgreSQL JDBC Driver | 2023-09-03 02:36:23.40238+00 | 2023-09-03 02:36:23.402331+00 | 2023-09-03 02:36:23.40238+00 |                 | active | select ssl from pg_stat_ssl where pid=pg_backend_pid()
(1 row)

testdb=# select pid,usename,application_name,query_start,xact_start,state_change,wait_event_type,state,query
         from pg_stat_activity where pid=1245344;
   pid   | usename |    application_name    |         query_start          |          xact_start           |         state_change         | wait_event_type | state  |                          query
---------+---------+------------------------+------------------------------+-------------------------------+------------------------------+-----------------+--------+---------------------------------------------------------
 1245344 | test    | PostgreSQL JDBC Driver | 2023-09-03 02:36:23.40238+00 | 2023-09-03 02:36:23.402331+00 | 2023-09-03 02:36:23.40238+00 |                 | active | select ssl from pg_stat_ssl where pid=pg_backend_pid()
(1 row)

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ nMaj nMin WCHAN COMMAND
1245344 postgres  20   0   32.5g  12468  12164 R  99.5  0.0   4219:12    0 1343 -     postgres: test testdb 10.250.193.40(48282) BIND

#0  ensure_record_cache_typmod_slot_exists (typmod=0) at typcache.c:1714
#1  0x0091185b in assign_record_type_typmod (tupDesc=<optimized out>, tupDesc@entry=0x27bc738) at typcache.c:2001
#2  0x0091df03 in internal_get_result_type (funcid=<optimized out>, call_expr=<optimized out>, rsinfo=<optimized out>, resultTypeId=<optimized out>, resultTupleDesc=0x7ffc9dff8cd0) at funcapi.c:393
#3  0x0091e263 in get_expr_result_type (expr=expr@entry=0x2792798, resultTypeId=resultTypeId@entry=0x7ffc9dff8ccc, resultTupleDesc=resultTupleDesc@entry=0x7ffc9dff8cd0) at funcapi.c:230
#4  0x006a2fa5 in ExecInitFunctionScan (node=node@entry=0x273afa8, estate=estate@entry=0x269e948, eflags=eflags@entry=16) at nodeFunctionscan.c:370
#5  0x0069084e in ExecInitNode (node=node@entry=0x273afa8, estate=estate@entry=0x269e948, eflags=eflags@entry=16) at execProcnode.c:255
#6  0x0068a96d in InitPlan (eflags=16, queryDesc=0x273b2d8) at execMain.c:936
#7  standard_ExecutorStart (queryDesc=0x273b2d8, eflags=16) at execMain.c:263
#8  0x7f67c2821d5d in pgss_ExecutorStart (queryDesc=0x273b2d8, eflags=<optimized out>) at pg_stat_statements.c:965
#9  0x007fc226 in PortalStart (portal=portal@entry=0x26848b8, params=params@entry=0x0, eflags=eflags@entry=0, snapshot=snapshot@entry=0x0) at pquery.c:514
#10 0x007fa27f in exec_bind_message (input_message=0x7ffc9dff90d0) at postgres.c:1995
#11 PostgresMain (argc=argc@entry=1, argv=argv@entry=0x7ffc9dff9370, dbname=<optimized out>, username=<optimized out>) at postgres.c:4552
#12 0x0077a4ea in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4537
#13 BackendStartup (port=<optimized out>) at postmaster.c:4259
#14 ServerLoop () at postmaster.c:1745
#15 0x0077b363 in PostmasterMain (argc=argc@entry=5, argv=argv@entry=0x256abc0) at postmaster.c:1417
#16 0x004fec63 in main (argc=5, argv=0x256abc0) at main.c:209

Thanks,

James


Re: Index bloat and REINDEX/VACUUM optimization for partial index

2023-09-05 Thread jayaprabhakar k
Thanks Maxim and Jeff.
1. Do you have any pointers to the killbits issue on hot standby slaves? We
do use a hot standby instance for many queries, so I want to learn more
about it.
2. I am now considering partitioning the table. I am curious whether we can
set up partitions by mutable columns. More specifically, partitioning by a
status column, where the status is mutable and usually ends up in a terminal
state (success, failure or aborted).

I could not find any documentation on the performance implications of
partitioning by a mutable column, so any guidance would be helpful. I had
previously underestimated the impact of an index on a mutable column, so I
want to be cautious this time.
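
For concreteness, a minimal sketch of what list-partitioning by a mutable status
column could look like; the table and column names here are hypothetical, not the
actual schema:

   -- Hypothetical table/column names, for illustration only.
   CREATE TABLE tasks (
       id         bigserial,
       status     text NOT NULL,
       payload    jsonb,
       updated_at timestamptz NOT NULL DEFAULT now(),
       PRIMARY KEY (id, status)   -- a unique constraint must include the partition key
   ) PARTITION BY LIST (status);

   -- Hot (mutable) rows and terminal rows live in separate partitions.
   CREATE TABLE tasks_active   PARTITION OF tasks FOR VALUES IN ('pending', 'running');
   CREATE TABLE tasks_terminal PARTITION OF tasks FOR VALUES IN ('success', 'failure', 'aborted');

The main implication to weigh is that an UPDATE which changes status then moves the
row to another partition (internally a delete plus insert), so this trades index and
heap bloat in the small "active" partition against row-movement cost on every status
change.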

On Fri, 1 Sept 2023 at 11:02, Maxim Boguk wrote:

> But anyway, PostgreSQL has features to prevent the index bloat from
>> becoming too severe of a problem, and you should figure out why they are
>> not working for you.  The most common ones I know of are 1) long open
>> snapshots preventing clean up, 2) all index scans being bitmap index scans,
>> which don't do micro-vacuuming/index hinting the way ordinary btree
>> index scans do, and 3) running the queries on a hot-standby, where index
>> hint bits must be ignored.  If you could identify and solve this issue,
>> then you wouldn't need to twist yourself into knots avoiding non-HOT
>> updates.
>>
>
> I am not sure that kill bits could be a complete fix for indexes with tens
> of millions of dead entries and only a handful of live entries. As I
> understand the mechanics of killbits, they help to avoid excessive heap
> visibility checks for dead tuples, but tuples with the killbit still have to
> be read from the index first. And with many millions of dead entries that
> isn't free.
>
> PS: ignoring killbits on hot standby slaves is a source of endless pain in
> many cases.
>
> --
> Maxim Boguk
> Senior Postgresql DBA
>
> Phone UA: +380 99 143 
> Phone AU: +61  45 218 5678
>
>
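
As a quick way to check for cause (1) quoted above, long-open snapshots preventing
cleanup, a rough sketch of a pg_stat_activity query that surfaces the sessions
holding the oldest snapshots (thresholds and columns to taste):

   -- Sessions holding the oldest snapshots (backend_xmin); these can block index/heap cleanup.
   SELECT pid, state, xact_start, backend_xmin, left(query, 60) AS query
   FROM pg_stat_activity
   WHERE backend_xmin IS NOT NULL
   ORDER BY age(backend_xmin) DESC
   LIMIT 10;

If such sessions exist, ending or committing them usually lets subsequent vacuums
reclaim the dead index entries.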