Replication Issues
Hi Team,

I have configured replication using a slot, but it failed with this ERROR:

pg_basebackup: could not get transaction log end position from server: ERROR: requested WAL segment 00012C9D0085 has already been removed

This is unexpected, because I created the slot on the master first and then issued the base backup command from the slave's end. The command is:

usr/lib/postgresql/9.5/bin/pg_basebackup -U user --max-rate='150 M' --progress --verbose --write-recovery-conf --status-interval='10 s' -D data_dir -h host_ip -p 5433

These are the settings on my master:

archive_mode=on
archive_command='/bin/true'
wal_keep_segments=512
max_wal_senders=4

Series of steps I have followed:
1) Enabled passwordless authentication between master and slave
2) Created the slot on the master (assuming it will retain WAL regardless of other settings)
3) Started the base backup from the slave's end
4) Issued a checkpoint at the master's end

My concern is that the slave should recover WAL via the replication slot, so why did I get the above ERROR? Why was the requested WAL file removed despite the slot? Could you please let me know why this happened, or did I miss something?

Thanks,
Bhargav K
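For reference, a minimal sketch of step 2 as usually done on 9.4/9.5 (the slot name is a placeholder, not the one actually used here):

postgres=# SELECT pg_create_physical_replication_slot('replication_slot');
postgres=# SELECT slot_name, active, restart_lsn FROM pg_replication_slots;

Note that on 9.5 a freshly created physical slot shows a NULL restart_lsn until a streaming client actually attaches to it, so creating the slot by itself does not yet pin any WAL segments.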
Re: Replication Issues
We are using version 9.5, where the slot option -S is not available. Can I go with the -X stream option?

On Fri 28 Sep, 2018, 12:01 PM Laurenz Albe, wrote:

> bhargav kamineni wrote:
> > Hi Team,
> >
> > I have configured replication using a slot, but it failed by throwing the
> > ERROR pg_basebackup: could not get transaction log end position from server:
> > ERROR: requested WAL segment 00012C9D0085 has already been removed,
> > which is unexpected because i have created the slot on master first
> > and then issued the base backup command from slave's end
> > the command is
> > usr/lib/postgresql/9.5/bin/pg_basebackup -U user --max-rate='150 M' --progress --verbose --write-recovery-conf --status-interval='10 s' -D data_dir -h host_ip -p 5433
> > These are the settings on my master
> > archive_mode=on
> > archive_command='/bin/true'
> > wal_keep_segments=512
> > max_wal_senders=4
> > Series of steps i have followed :
> > 1) Enabled passwordless authentication between master and slave
> > 2) created slot on master (assuming it will store WAL regardless of other settings)
> > 3) started basebackup from slave's end
> > 4) Issued checkpoint at master's end
> >
> > Here my concern is, slave should recover WAL from replication slot but why i
> > got the above ERROR, why did the requested WAL file get removed despite the slot, could you please
> > let me know the reason why it happened or did i miss something ?
>
> I guess your base backup took long enough for the required WAL segments to be
> removed by the time it was done.
>
> To prevent that, create a replication slot *before* you perform pg_basebackup
> and use the options "-S -X stream" of pg_basebackup.
>
> You then use the same slot in "recovery.conf".
>
> That way you cannot lose any WAL.
>
> Yours,
> Laurenz Albe
> --
> Cybertec | https://www.cybertec-postgresql.com
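A hedged sketch of the flow Laurenz describes, adapted to 9.5 where pg_basebackup has no --slot/-S option (host, user, slot and directory names are placeholders):

# on the master: create the slot the standby will use once it starts streaming
psql -h host_ip -p 5433 -U user -c "SELECT pg_create_physical_replication_slot('replication_slot');"

# on the slave: -X stream opens a second connection that streams WAL while the
# base backup runs, so segments needed by the backup cannot be recycled underneath it
/usr/lib/postgresql/9.5/bin/pg_basebackup -U user -h host_ip -p 5433 \
    -D data_dir -X stream --write-recovery-conf --progress --verbose

# then, in data_dir/recovery.conf on the slave, reference the slot so the master
# retains WAL for the standby from that point on:
#   primary_slot_name = 'replication_slot'

On 9.6 and later, pg_basebackup itself can use the slot during the backup via -S replication_slot.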
Slot issues
Hi Team,

I am getting this error while starting my second slave server:

PANIC: too many replication slots active before shutdown
HINT: Increase max_replication_slots and try again.

max_replication_slots on my master is 2, and one of the slots is already active for another slave. Do I need to increase this parameter for the second slave to work? If so, what is the reason?

Thanks
Bhargav K
Re: [External] Slot issues
postgres=# select * from pg_replication_slots ;
    slot_name     | plugin | slot_type | datoid | database | active | active_pid | xmin | catalog_xmin |  restart_lsn
------------------+--------+-----------+--------+----------+--------+------------+------+--------------+---------------
 replication_slot |        | physical  |        |          | t      |      23185 |      |              | 7642/B07AC830
(1 row)

On Mon, 15 Oct 2018 at 01:34, Vijaykumar Jain wrote:

> from your master, can you give us the output of
>
> select * from pg_replication_slots;
>
> maybe some stray slot exists that you may not be aware of ?
>
> Regards,
> Vijay
>
> On Mon, Oct 15, 2018 at 1:08 AM Vijaykumar Jain wrote:
>
>> ok, my bad.
>>
>> i just set one of the 9.6.9 versions to have max_replication_slots = 2 and
>> still i had two slots active.
>>
>> postgres=# table pg_replication_slots;
>>  slot_name | plugin | slot_type | datoid | database | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
>> -----------+--------+-----------+--------+----------+--------+------------+------+--------------+-------------+---------------------
>>  a02       |        | physical  |        |          | t      |      13719 |      |              | 2/D3D0      |
>>  a03       |        | physical  |        |          | t      |      13720 |      |              | 2/D3D0      |
>> (2 rows)
>>
>> postgres=# show max_replication_slots;
>>  max_replication_slots
>> ------------------------
>>  2
>> (1 row)
>>
>> yep it should work with 2.
>> Regards,
>> Vijay
>>
>> On Mon, Oct 15, 2018 at 1:02 AM Vijaykumar Jain wrote:
>>
>>> I guess max_replication_slots has to be greater than the number of active replication slots.
>>> maybe you need to increase max_replication_slots to 3 to have 2 active slots.
>>>
>>> Regards,
>>> Vijay
>>>
>>> On Mon, Oct 15, 2018 at 12:54 AM bhargav kamineni <bhargavpostg...@gmail.com> wrote:
>>>
>>>> for 9.x. Default was 0. we have set max_replication_slots = 2, one is
>>>> already being used, postgres should use the second slot for the other server
>>>> but don't know why it's throwing the error
>>>>
>>>> On Mon, 15 Oct 2018 at 00:32, Vijaykumar Jain wrote:
>>>>
>>>>> I guess max_replication_slots has to be > current total slots in use.
>>>>> (and not >= )
>>>>>
>>>>> https://www.postgresql.org/docs/10/static/runtime-config-replication.html
>>>>> https://github.com/postgres/postgres/blob/d6e98ebe375e115c29028f9bd090f0f7e07e2527/src/backend/replication/slot.c#L1506
>>>>>
>>>>> from the doc, it says the default is 10; any reason why you have set it
>>>>> to 2? You need to set it to a value higher than the number of slots active.
>>>>> also, i guess changing this parameter would require a server restart.
>>>>>
>>>>> Regards,
>>>>> Vijay
>>>>>
>>>>> On Mon, Oct 15, 2018 at 12:16 AM bhargav kamineni <bhargavpostg...@gmail.com> wrote:
>>>>>
>>>>>> Hi Team,
>>>>>>
>>>>>> I am getting this ERROR while starting my second slave server
>>>>>>
>>>>>> PANIC: too many replication slots active before shutdown
>>>>>> HINT: Increase max_replication_slots and try again.
>>>>>>
>>>>>> max_replication_slots on my master is 2 and one of them is already
>>>>>> active for another slave,
>>>>>> do i need to increase this parameter for the need of working of
>>>>>> another slave ? if so whats the reason ?
>>>>>>
>>>>>> Thanks
>>>>>> Bhargav K
Re: Slot issues
I got this on the standby. Could you please explain in detail what you mean by "but that *on the standby* haven't set max_replication_slots high enough"?

Thanks,
Bhargav

On Mon, 15 Oct 2018 at 01:50, Andres Freund wrote:

> Hi,
>
> On 2018-10-15 00:15:53 +0530, bhargav kamineni wrote:
> > Hi Team,
> >
> > I am getting this ERROR while starting my second slave server
> >
> > PANIC: too many replication slots active before shutdown
> > HINT: Increase max_replication_slots and try again.
> >
> > max_replication_slots on my master is 2 and one of them is already active
> > for another slave,
> > do i need to increase this parameter for the need of working of another
> > slave ? if so whats the reason ?
>
> You're getting that error on the standby, not the primary, right? In
> all likelihood the problem is that you copied over replication slots to
> your standby server, but that *on the standby* you haven't set
> max_replication_slots high enough.
>
> Greetings,
>
> Andres Freund
Re: [External] Re: Slot issues
pg_replslot # ls -lrth
total 4.0K
drwx------ 2 postgres postgres 4.0K Oct 14 14:55 replication_slot

On Mon, 15 Oct 2018 at 02:11, Vijaykumar Jain wrote:

> ok wait, your data directory path may be different :)
> i mean wherever your pg data dir is, from there find the pg_replslot folder
> and list the contents of it.
>
> Regards,
> Vijay
>
> On Mon, Oct 15, 2018 at 2:10 AM Vijaykumar Jain wrote:
>
>> i guess he is trying to confirm if the value of max_replication_slots is
>> the same on all the master and the standby nodes.
>>
>> also,
>>
>> I am trying to shoot in the dark.
>>
>> can you give the output of (on master)
>> ls /var/lib/postgresql//main/pg_replslot/*
>>
>> also the value of max_replication_slots on the master and all the standbys.
>> also, how are you creating the replication slot?
>>
>> can you share the query?
>>
>> also pls share the content of the recovery.conf on the standby nodes
>> (both currently running and currently not running).
>> you can scribble out company secret stuff.
>>
>> Regards,
>> Vijay
>>
>> On Mon, Oct 15, 2018 at 1:58 AM bhargav kamineni <bhargavpostg...@gmail.com> wrote:
>>
>>> I got this on the standby. Could you please explain in detail what you mean by
>>> "but that *on the standby* haven't set max_replication_slots high enough"?
>>>
>>> Thanks,
>>> Bhargav
>>>
>>> On Mon, 15 Oct 2018 at 01:50, Andres Freund wrote:
>>>
>>>> Hi,
>>>>
>>>> On 2018-10-15 00:15:53 +0530, bhargav kamineni wrote:
>>>> > Hi Team,
>>>> >
>>>> > I am getting this ERROR while starting my second slave server
>>>> >
>>>> > PANIC: too many replication slots active before shutdown
>>>> > HINT: Increase max_replication_slots and try again.
>>>> >
>>>> > max_replication_slots on my master is 2 and one of them is already active
>>>> > for another slave,
>>>> > do i need to increase this parameter for the need of working of another
>>>> > slave ? if so whats the reason ?
>>>>
>>>> You're getting that error on the standby, not the primary, right? In
>>>> all likelihood the problem is that you copied over replication slots to
>>>> your standby server, but that *on the standby* you haven't set
>>>> max_replication_slots high enough.
>>>>
>>>> Greetings,
>>>>
>>>> Andres Freund
Re: Slot issues
Yeah, I have used rsync. Got it now; I will increase max_replication_slots high enough. Thank you Andres Freund :-)

On Mon, 15 Oct 2018 at 02:40, Andres Freund wrote:

> Hi,
>
> Please try to quote properly.
>
> On 2018-10-15 01:57:53 +0530, bhargav kamineni wrote:
> > I got this on the standby, could you please explain in detail about
> > "but that *on the standby* haven't set max_replication_slots high enough".
>
> What is max_replication_slots set to on the new standby?
>
> If you created the new basebackup using rsync, and didn't exclude
> pg_replication_slot, it'll have copied the slots from the primary. And
> thus needs a high enough max_replication_slots to work with them.
>
> - Andres
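Not from the thread itself, but a hedged sketch of cleaning up the copied-over slots once the standby is up, which is what Andres suggests in the next message (use whatever slot names actually show up on the standby; 'replication_slot' here is just the name seen earlier in this thread):

postgres=# SELECT slot_name, slot_type, active FROM pg_replication_slots;
postgres=# SELECT pg_drop_replication_slot('replication_slot');

Dropping is only possible while the slot is not active, and the slot the standby itself uses on the master must of course be left alone there.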
Re: Slot issues
> You probably afterwards want to drop those slots from the new standby.
> See the following section from the docs:
> https://www.postgresql.org/docs/current/static/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP-DATA

"It is often a good idea to also omit from the backup the files within the
cluster's pg_replslot/ directory, so that replication slots that exist on the
master do not become part of the backup. Otherwise, the subsequent use of the
backup to create a standby may result in indefinite retention of WAL files on
the standby, and possibly bloat on the master if hot standby feedback is
enabled, because the clients that are using those replication slots will still
be connecting to and updating the slots on the master, not the standby. Even if
the backup is only intended for use in creating a new master, copying the
replication slots isn't expected to be particularly useful, since the contents
of those slots will likely be badly out of date by the time the new master
comes on line."

Since I already synced pg_replslot to the standby, is it okay if I remove the pg_replslot directory before starting the postgresql service?

On Mon, 15 Oct 2018 at 02:54, Andres Freund wrote:

> Hi,
>
> As I just wrote:
>
> On Mon, 15 Oct 2018 at 02:40, Andres Freund wrote:
>
> > Please try to quote properly.
>
> On 2018-10-15 02:45:51 +0530, bhargav kamineni wrote:
> > Yeah i have used rsync , Got it now will increase the max_replication_slots
> > to high enough , Thank you Andres Freund :-)
>
> You probably afterwards want to drop those slots from the new standby.
> See the following section from the docs:
>
> https://www.postgresql.org/docs/current/static/continuous-archiving.html#BACKUP-LOWLEVEL-BASE-BACKUP-DATA
> "It is often a good idea to also omit from the backup the files within the
> cluster's pg_replslot/ directory, so that replication slots that exist on
> the master do not become part of the backup. Otherwise, the subsequent use
> of the backup to create a standby may result in indefinite retention of WAL
> files on the standby, and possibly bloat on the master if hot standby
> feedback is enabled, because the clients that are using those replication
> slots will still be connecting to and updating the slots on the master, not
> the standby. Even if the backup is only intended for use in creating a new
> master, copying the replication slots isn't expected to be particularly
> useful, since the contents of those slots will likely be badly out of date
> by the time the new master comes on line."
>
> Greetings,
>
> Andres Freund
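A hedged sketch of avoiding the problem at copy time instead — excluding the slot state when the base copy is taken with rsync, which is what the quoted docs recommend (paths and hostname are placeholders for wherever the data directory actually lives):

# keep the pg_replslot directory itself but leave its contents out of the copy
rsync -a --exclude='pg_replslot/*' /var/lib/postgresql/9.6/main/ standby:/var/lib/postgresql/9.6/main/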
Extensions
Hi Team,

I have a doubt regarding extensions. My db has these extensions installed:

       Name
--------------------
 adminpack
 btree_gist
 dblink
 file_fdw
 fuzzystrmatch
 hstore
 insert_username
 intarray
 pageinspect
 pg_buffercache
 pg_freespacemap
 pg_repack
 pg_stat_statements
 pg_trgm
 pgcrypto
 plpgsql
 unaccent

How can I figure out which of these extensions are actually being used? Is there any way?

Thanks,
Bhargav
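There is no single "is this extension used" flag, but a hedged starting point is to list the objects each extension owns via pg_depend and then look for application code or other objects that reference them; the query below only enumerates the extension members, so deciding "used" still needs knowledge of the workload:

postgres=# SELECT e.extname,
                  pg_describe_object(d.classid, d.objid, d.objsubid) AS member_object
           FROM pg_extension e
           JOIN pg_depend d ON d.refclassid = 'pg_extension'::regclass
                           AND d.refobjid = e.oid
                           AND d.deptype = 'e'
           ORDER BY e.extname;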
pg_terminate_backend not working
Hi Team,

We are observing long-running processes hung in the active state on one of our dbs for the last two days with usename 'xyz'. We have tried killing the respective pids with pg_terminate_backend(pid); it returns true but the queries are not actually being terminated. Based on the client address and port number from pg_stat_activity we found the connection origin and killed the connection-initiating process on that host; even after that the connections on the db from 'xyz' are not getting terminated.

These connections are not making network calls; a self (loopback) dblink connection is being established in the query.

As a workaround we disabled the connection-initiation script for the xyz user and restarted the database. Right after the restart we observed new connections from the xyz user to the database, which is not expected because we had already disabled the connection-initiation script.

select pg_postmaster_start_time();
   pg_postmaster_start_time
-------------------------------
 2019-09-24 06:44:07.41594+00

select usename, backend_start from pg_stat_activity where now()-xact_start > '3 hours' and usename ='xyz';
 usename |         backend_start
---------+-------------------------------
 xyz     | 2019-09-24 06:44:38.879047+00
 xyz     | 2019-09-24 06:44:38.880501+00
 xyz     | 2019-09-24 06:44:38.881326+00
 xyz     | 2019-09-24 06:44:38.877561+00
 xyz     | 2019-09-24 06:44:38.878336+00

What could be the possible reason for these connections and for pg_terminate_backend(pid) not working?

Thanks,
Bhargav K
Re: pg_terminate_backend not working
Any thoughts on this?

On Tue, 24 Sep 2019 at 16:44, bhargav kamineni wrote:

> Hi Team ,
>
> We are observing long running process hung in active state on one of our
> db from last two days with usename 'xyz', We have tried killing the
> respective pid with pg_terminate_backend(pid), it's returning true but the
> queries are not actually being terminated. Based on the client address and
> port number from pg_stat_activity we found the connection origin and
> killed that connection initiation process from that host , even after that
> the connections on db from 'xyz' are not getting terminated.
>
> and these connections are not making network calls, a self dblink
> connection is being established in the query.
>
> as a workaround we have disabled the connections initiation script for xyz
> user and restarted the database, right after restart we observed the new
> connections from xyz user to database which is not expected because we
> have already disabled the connections initiation script.
>
> select pg_postmaster_start_time();
>    pg_postmaster_start_time
> -------------------------------
>  2019-09-24 06:44:07.41594+00
>
> select usename, backend_start from pg_stat_activity where now()-xact_start > '3 hours' and usename ='xyz';
>  usename |         backend_start
> ---------+-------------------------------
>  xyz     | 2019-09-24 06:44:38.879047+00
>  xyz     | 2019-09-24 06:44:38.880501+00
>  xyz     | 2019-09-24 06:44:38.881326+00
>  xyz     | 2019-09-24 06:44:38.877561+00
>  xyz     | 2019-09-24 06:44:38.878336+00
>
> what could be the possible reason for these connections and
> pg_terminate_backend(pid) not working ?
>
> Thanks,
> Bhargav K
Re: pg_terminate_backend not working
> Hi Team ,
>
> We are observing long running process hung in active state on one of
> our db from last two days with usename 'xyz', We have tried killing
> the respective pid with pg_terminate_backend(pid), it's returning
> true but the queries are not actually being terminated. Based on the
> client address and port number from pg_stat_activity we found the
> connection origin and killed that connection initiation process from
> that host , even after that the connections on db from 'xyz' are
> not getting terminated.

> Your client backend is most likely sitting in a blocked sys call such as network send, etc.

These queries are not doing network calls; they have a loopback dblink connection within.

> Not sure though what is relaunching them after you say they were disabled some{where,how}.

Is there any workaround to terminate those stuck processes apart from restarting the database?

On Wed, 25 Sep 2019 at 00:05, Jerry Sievers wrote:

> bhargav kamineni writes:
>
> > Hi Team ,
> >
> > We are observing long running process hung in active state on one of
> > our db from last two days with usename 'xyz', We have tried killing
> > the respective pid with pg_terminate_backend(pid), it's returning
> > true but the queries are not actually being terminated. Based on the
> > client address and port number from pg_stat_activity we found the
> > connection origin and killed that connection initiation process from
> > that host , even after that the connections on db from 'xyz' are
> > not getting terminated.
>
> Your client backend is most likely sitting in a blocked sys call such as
> network send, etc.
>
> Not sure though what is relaunching them after you say they were
> disabled some{where,how}.
>
> HTH
>
> > and these connections are not making network calls, a self dblink
> > connection is being established in the query.
> >
> > as a workaround we have disabled the connections initiation script
> > for xyz user and restarted the database, right after restart we
> > observed the new connections from xyz user to database which is not
> > expected because we have already disabled the connections
> > initiation script.
> >
> > select pg_postmaster_start_time();
> >    pg_postmaster_start_time
> > -------------------------------
> >  2019-09-24 06:44:07.41594+00
> >
> > select usename, backend_start from pg_stat_activity where now()-xact_start > '3 hours' and usename ='xyz';
> >  usename |         backend_start
> > ---------+-------------------------------
> >  xyz     | 2019-09-24 06:44:38.879047+00
> >  xyz     | 2019-09-24 06:44:38.880501+00
> >  xyz     | 2019-09-24 06:44:38.881326+00
> >  xyz     | 2019-09-24 06:44:38.877561+00
> >  xyz     | 2019-09-24 06:44:38.878336+00
> >
> > what could be the possible reason for these connections and
> > pg_terminate_backend(pid) not working ?
> >
> > Thanks,
> > Bhargav K
>
> --
> Jerry Sievers
> Postgres DBA/Development Consulting
> e: postgres.consult...@comcast.net
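Not from the thread, but a hedged way to check whether the stuck backends really are parked in a blocked call, as Jerry suggests: on 9.6 or later pg_stat_activity exposes wait_event_type/wait_event, and comparing state_change with now() shows how long each backend has been sitting there (the role name 'xyz' is the one used above):

postgres=# SELECT pid, state, wait_event_type, wait_event,
                  now() - state_change AS stuck_for, query
           FROM pg_stat_activity
           WHERE usename = 'xyz'
           ORDER BY stuck_for DESC;

With a loopback dblink it can also help to look at both ends of the pair (the calling backend and the dblink-spawned backend), since each holds the other's connection open; that is an assumption about this setup, not something confirmed in the thread.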
PMChildFlags array
Hi,

Observed the below errors in the logfile:

2019-09-20 02:00:24.504 UTC,,,99779,,5d73303a.185c3,73,,2019-09-07 04:21:14 UTC,,0,FATAL,XX000,"no free slots in PMChildFlags array",""
2019-09-20 02:00:24.505 UTC,,,109949,,5d8432b8.1ad7d,1,,2019-09-20 02:00:24 UTC,,0,ERROR,58P01,"could not open shared memory segment ""/PostgreSQL.2520932"": No such file or directory",""
2019-09-20 02:00:24.505 UTC,,,109950,,5d8432b8.1ad7e,1,,2019-09-20 02:00:24 UTC,,0,ERROR,58P01,"could not open shared memory segment ""/PostgreSQL.2520932"": No such file or directory",""

What could be the possible reasons for this to occur, and is there any chance of database corruption after this event?

Regards,
Bhargav
Re: PMChildFlags array
Any suggestions on this?

On Thu, 3 Oct 2019 at 16:27, bhargav kamineni wrote:

> Hi,
>
> Observed below errors in logfile
>
> 2019-09-20 02:00:24.504 UTC,,,99779,,5d73303a.185c3,73,,2019-09-07 04:21:14 UTC,,0,FATAL,XX000,"no free slots in PMChildFlags array",""
> 2019-09-20 02:00:24.505 UTC,,,109949,,5d8432b8.1ad7d,1,,2019-09-20 02:00:24 UTC,,0,ERROR,58P01,"could not open shared memory segment ""/PostgreSQL.2520932"": No such file or directory",""
> 2019-09-20 02:00:24.505 UTC,,,109950,,5d8432b8.1ad7e,1,,2019-09-20 02:00:24 UTC,,0,ERROR,58P01,"could not open shared memory segment ""/PostgreSQL.2520932"": No such file or directory",""
>
> what could be the possible reasons for this to occur and is there any
> chance of database corruption after this event ?
>
> Regards,
> Bhargav
Re: PMChildFlags array
> Hi,
>
> Observed below errors in logfile
>
> 2019-09-20 02:00:24.504 UTC,,,99779,,5d73303a.185c3,73,,2019-09-07 04:21:14 UTC,,0,FATAL,XX000,"no free slots in PMChildFlags array",""
> 2019-09-20 02:00:24.505 UTC,,,109949,,5d8432b8.1ad7d,1,,2019-09-20 02:00:24 UTC,,0,ERROR,58P01,"could not open shared memory segment ""/PostgreSQL.2520932"": No such file or directory",""
> 2019-09-20 02:00:24.505 UTC,,,109950,,5d8432b8.1ad7e,1,,2019-09-20 02:00:24 UTC,,0,ERROR,58P01,"could not open shared memory segment ""/PostgreSQL.2520932"": No such file or directory",""

> Postgres version?

PostgreSQL 10.8

> OS and version?

NAME="Ubuntu"
VERSION="18.04.1 LTS (Bionic Beaver)"

> What was the database doing just before the FATAL line?

Postgres was rejecting a bunch of connections from a user who has a connection limit set. That was the FATAL error I could see in the log file:

FATAL,53300,"too many connections for role ""user_app"""

db=\du user_app
                        List of roles
 Role name |           Attributes           |     Member of
-----------+--------------------------------+--------------------
 user_app  | No inheritance                +| {application_role}
           | 100 connections               +|
           | Password valid until infinity  |

> > what could be the possible reasons for this to occur and is there any
> > chance of database corruption after this event ?
>
> The source (backend/storage/ipc/pmsignal.c) says:
>
> "/* Out of slots ... should never happen, else postmaster.c messed up */
> elog(FATAL, "no free slots in PMChildFlags array");
> "
>
> Someone else will need to comment on what 'messed up' could be.

On Thu, 3 Oct 2019 at 18:56, Adrian Klaver wrote:

> On 10/3/19 3:57 AM, bhargav kamineni wrote:
> > Hi,
> >
> > Observed below errors in logfile
> >
> > 2019-09-20 02:00:24.504 UTC,,,99779,,5d73303a.185c3,73,,2019-09-07
> > 04:21:14 UTC,,0,FATAL,XX000,"no free slots in PMChildFlags array",""
> > 2019-09-20 02:00:24.505 UTC,,,109949,,5d8432b8.1ad7d,1,,2019-09-20
> > 02:00:24 UTC,,0,ERROR,58P01,"could not open shared memory segment
> > ""/PostgreSQL.2520932"": No such file or directory",""
> > 2019-09-20 02:00:24.505 UTC,,,109950,,5d8432b8.1ad7e,1,,2019-09-20
> > 02:00:24 UTC,,0,ERROR,58P01,"could not open shared memory segment
> > ""/PostgreSQL.2520932"": No such file or directory",""
>
> Postgres version?
>
> OS and version?
>
> What was the database doing just before the FATAL line?
>
> > what could be the possible reasons for this to occur and is there any
> > chance of database corruption after this event ?
>
> The source(backend/storage/ipc/pmsignal.c ) says:
>
> "/* Out of slots ... should never happen, else postmaster.c messed up */
> elog(FATAL, "no free slots in PMChildFlags array");
> "
>
> Someone else will need to comment on what 'messed up' could be.
>
> > Regards,
> > Bhargav
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
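As an aside, a hedged way to watch how close a limited role is getting to its cap (the role name and the 100-connection limit come from the \du output above; nothing else here is from the thread):

postgres=# SELECT r.rolname, r.rolconnlimit, count(a.pid) AS current_connections
           FROM pg_roles r
           LEFT JOIN pg_stat_activity a ON a.usename = r.rolname
           WHERE r.rolname = 'user_app'
           GROUP BY r.rolname, r.rolconnlimit;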
Re: PMChildFlags array
bhargav kamineni writes:
> Postgres was rejecting a bunch of connections from a user who is having a
> connection limit set. that was the the FATAL error that i could see in log
> file.
> FATAL,53300,"too many connections for role ""user_app"""

> db=\du user_app
>                         List of roles
>  Role name |           Attributes           |     Member of
> -----------+--------------------------------+--------------------
>  user_app  | No inheritance                +| {application_role}
>            | 100 connections               +|
>            | Password valid until infinity  |

> Hm, what's the overall max_connections limit? (I'm wondering in particular if it's more or less than 100.)

It's set to 500:

show max_connections ;
 max_connections
-----------------
 500

On Thu, 3 Oct 2019 at 22:52, Tom Lane wrote:

> bhargav kamineni writes:
> > Postgres was rejecting a bunch of connections from a user who is having a
> > connection limit set. that was the the FATAL error that i could see in log
> > file.
> > FATAL,53300,"too many connections for role ""user_app"""
>
> > db=\du user_app
> >                         List of roles
> >  Role name |           Attributes           |     Member of
> > -----------+--------------------------------+--------------------
> >  user_app  | No inheritance                +| {application_role}
> >            | 100 connections               +|
> >            | Password valid until infinity  |
>
> Hm, what's the overall max_connections limit? (I'm wondering
> in particular if it's more or less than 100.)
>
> regards, tom lane
Re: PMChildFlags array
Thanks Tom Lane for detailing the issue.

> So ... how many is "a bunch"?

More than 85.

> Looking at the code, it seems like it'd be possible for a sufficiently
> aggressive spawner of incoming connections to reach the
> MaxLivePostmasterChildren limit. While the postmaster would correctly
> reject additional connection attempts after that, what it would not do
> is ensure that any child slots are left for new parallel worker processes.
> So we could hypothesize that the error you're seeing in the log is from
> failure to spawn a parallel worker process, due to being out of child
> slots.

We have enabled "max_parallel_workers_per_gather = 4". We ran into this issue 20 days ago.

> However, given that max_connections = 500, MaxLivePostmasterChildren()
> would be 1000-plus. This would mean that reaching this condition would
> require *at least* 500 concurrent connection-attempts-that-haven't-yet-
> been-rejected, maybe well more than that if you didn't have close to
> 500 legitimately open sessions. That seems like a lot, enough to suggest
> that you've got some pretty serious bug in your client-side logic.

Below errors were observed in the postgres logfile after the crash:

ERROR: xlog flush request is not satisfied -- for a couple of tables; we initiated VACUUM FULL on those tables and the error went away after that.

ERROR: right sibling's left-link doesn't match: block 273660 links to 273500 instead of expected 273661 in index -- observed while doing a VACUUM FREEZE on the database; we dropped this index and created a new one.

Observations: the VACUUM FREEZE ANALYZE job initiated through cron is getting stuck at the database end. pg_cancel_backend() and pg_terminate_backend() are not able to terminate those stuck processes; only restarting the database clears them. I am thinking this is happening due to corruption (if this is true, how can I detect it? pg_dump?). Is there any way to overcome this problem? Does migrating the database to a new instance (pg_basebackup and switching over to the new instance) solve this issue?

> Anyway, I think it's clearly a bug that canAcceptConnections() thinks the
> number of acceptable connections is identical to the number of allowed
> child processes; it needs to be less, by the number of background
> processes we want to support. But it seems like a darn hard-to-hit bug,
> so I'm not quite sure that that explains your observation.

On Fri, 4 Oct 2019 at 03:49, Tom Lane wrote:

> bhargav kamineni writes:
> >> What was the database doing just before the FATAL line?
>
> > Postgres was rejecting a bunch of connections from a user who is having a
> > connection limit set. that was the the FATAL error that i could see in log
> > file.
>
> So ... how many is "a bunch"?
>
> Looking at the code, it seems like it'd be possible for a sufficiently
> aggressive spawner of incoming connections to reach the
> MaxLivePostmasterChildren limit. While the postmaster would correctly
> reject additional connection attempts after that, what it would not do
> is ensure that any child slots are left for new parallel worker processes.
> So we could hypothesize that the error you're seeing in the log is from
> failure to spawn a parallel worker process, due to being out of child
> slots.
>
> However, given that max_connections = 500, MaxLivePostmasterChildren()
> would be 1000-plus. This would mean that reaching this condition would
> require *at least* 500 concurrent connection-attempts-that-haven't-yet-
> been-rejected, maybe well more than that if you didn't have close to
> 500 legitimately open sessions. That seems like a lot, enough to suggest
> that you've got some pretty serious bug in your client-side logic.
>
> Anyway, I think it's clearly a bug that canAcceptConnections() thinks the
> number of acceptable connections is identical to the number of allowed
> child processes; it needs to be less, by the number of background
> processes we want to support. But it seems like a darn hard-to-hit bug,
> so I'm not quite sure that that explains your observation.
>
> regards, tom lane
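Regarding "how can I detect this? pg_dump?" — not from the thread, but two commonly used checks, hedged: running pg_dump to a throwaway destination forces every heap tuple to be read and will error out on damaged table data (it says nothing about indexes), and on PostgreSQL 10 the contrib module amcheck can verify btree index structure:

-- read every table; errors here usually indicate heap-level damage
-- (from a shell: pg_dump -d mydb > /dev/null)

-- check btree indexes with amcheck (contrib module, available on PG 10; needs superuser)
CREATE EXTENSION IF NOT EXISTS amcheck;
SELECT c.relname, bt_index_check(c.oid::regclass)
FROM pg_class c
JOIN pg_am am ON am.oid = c.relam
WHERE c.relkind = 'i' AND am.amname = 'btree';

This is only a sketch; in practice the index check is usually restricted to the schemas of interest and run during a quiet period, since it takes AccessShareLock on each index.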
Re: OID out of range
Any workaround to make it work?

On Thu, 16 Jan 2020 at 00:00, Adrian Klaver wrote:

> On 1/15/20 10:18 AM, Konireddy Rajashekar wrote:
> > I am getting ERROR: OID out of range while firing below SQL , what could
> > be the reason? I am joining pg_class and one user_created table to
> > compare tables size.
> >
> > select u.relid, c.relnamespace::regnamespace::text, c.relname, now() as current_time,
> >        pg_size_pretty(pg_relation_size(c.oid)) as current_size,
> >        pg_size_pretty(u.table_size) as previous_size,
> >        pg_size_pretty(pg_relation_size(c.oid) - pg_relation_size(u.table_size)) as diff
> > from user_tables_sizes u
> > join pg_class c on u.relid::bigint = c.oid::bigint
> > where c.relkind='r'
> >   and c.relnamespace::regnamespace::text='rpx_reporting_stage'
> >   and u.captured_dt::date = current_date - interval '1 days'
> > ERROR: OID out of range
> > test-# \d user_tables_sizes
>
> Given this:
>
> https://www.postgresql.org/docs/12/datatype-oid.html
> "The oid type is currently implemented as an unsigned four-byte integer."
>
> I am pretty sure this:
>
> c.oid::bigint
>
> is the problem.
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
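For context (not from the thread), the four-byte limit Adrian quotes is easy to demonstrate: any bigint value above 4294967295 raises the same error when it ends up being cast to oid, so keeping oid-valued columns as oid/regclass and not funnelling large bigint values (such as byte counts) through oid-typed expressions is the usual way out:

postgres=# SELECT 4294967295::bigint::oid;   -- largest value that fits in an oid
postgres=# SELECT 4294967296::bigint::oid;   -- ERROR:  OID out of range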
oldest xmin is far in the past :: BUT xmin is not available in system
Hi Team,

It seems vacuum is behaving somewhat weirdly on a postgres database; I am observing the below HINTs in the vacuum logs:

WARNING: oldest xmin is far in the past
HINT: Close open transactions soon to avoid wraparound problems.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.

Below is the autovacuum status on the bloated tables:

=> SELECT schemaname, relname, n_live_tup, n_dead_tup, last_autovacuum
   FROM pg_stat_all_tables
   ORDER BY n_dead_tup / (n_live_tup * current_setting('autovacuum_vacuum_scale_factor')::float8
                          + current_setting('autovacuum_vacuum_threshold')::float8) DESC
   LIMIT 10;

 schemaname |    relname    | n_live_tup | n_dead_tup |        last_autovacuum
------------+---------------+------------+------------+-------------------------------
 pg_catalog | pg_statistic  |        136 |      37563 | 2022-04-18 04:00:21.045089+00
 public     | test1         |  209405206 |  126752908 | 2022-04-18 03:59:43.013758+00
 public     | test2         |  513770985 |   49258312 | 2022-04-18 04:00:23.24043+00
 public     | test3         |   90853150 |    4090146 | 2022-04-18 04:00:25.868147+00
 pg_catalog | pg_shdepend   |        153 |         29 | 2022-04-08 12:16:02.816631+00
 pg_catalog | pg_index      |         73 |         18 |
 pg_toast   | pg_toast_2619 |         16 |         12 | 2022-03-13 23:01:54.334003+00
 pg_catalog | pg_class      |        425 |         19 | 2022-03-01 13:15:57.534378+00
 pg_catalog | pg_proc       |       2457 |         48 |
 pg_toast   | pg_toast_2618 |        252 |         10 |

I tried to vacuum the first table, pg_statistic. Below is the log:

postgres=> VACUUM (VERBOSE) pg_statistic;
WARNING: oldest xmin is far in the past
HINT: Close open transactions soon to avoid wraparound problems.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
INFO: aggressively vacuuming "pg_catalog.pg_statistic"
INFO: "pg_statistic": found 0 removable, 37699 nonremovable row versions in 6331 out of 6351 pages
DETAIL: 37563 dead row versions cannot be removed yet, oldest xmin: 648320155
There were 3340 unused item identifiers.
Skipped 0 pages due to buffer pins, 20 frozen pages.
0 pages are entirely empty.
CPU: user: 0.01 s, system: 0.00 s, elapsed: 0.01 s.
WARNING: oldest xmin is far in the past
HINT: Close open transactions soon to avoid wraparound problems.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
INFO: aggressively vacuuming "pg_toast.pg_toast_2619"
INFO: "pg_toast_2619": found 0 removable, 16 nonremovable row versions in 3 out of 11 pages
DETAIL: 12 dead row versions cannot be removed yet, oldest xmin: 648320155
There were 11 unused item identifiers.
Skipped 0 pages due to buffer pins, 8 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM

The table is getting vacuumed, but the dead tuples cannot be removed because of *oldest xmin: 648320155*, yet that xmin is not associated with any long-running queries, stale replication slots, or prepared transactions:

*Long running:*

postgres=> SELECT now()-query_start, pid, datname, usename, state, backend_xmin
           FROM pg_stat_activity
           WHERE backend_xmin IS NOT NULL
           ORDER BY age(backend_xmin) DESC;
    ?column?     |  pid  | datname  | usename  | state  | backend_xmin
-----------------+-------+----------+----------+--------+--------------
 00:00:29.910155 |   539 | postgres |          | active |    832858371
 00:00:23.766305 |  1211 | postgres |          | active |    832858509
 00:00:00.756961 |  2151 | postgres |          | active |    832859484
 00:00:00.060784 | 30833 | postgres | root     | active |    832859508
 00:00:00.004473 | 29270 | postgres | root     | active |    832859508
 00:00:00.009809 | 29271 | postgres | root     | active |    832859508
 00:00:00.015169 | 27145 | postgres | root     | active |    832859508
 00:00:00        |  1450 | postgres | postgres | active |    832859508
 00:00:00.010672 |   544 | postgres | root     | active |    832859508
 00:00:00.034516 | 19940 | postgres | root     | active |    832859508
(10 rows)

*Stale replication slots:*

postgres=> SELECT slot_name, slot_type, database, xmin FROM pg_replication_slots ORDER BY age(xmin) DESC;
 slot_name | slot_type | database | xmin
-----------+-----------+----------+------
(0 rows)

*Prepared transactions:*

postgres=> SELECT gid, prepared, owner, database, transaction AS xmin
postgres-> FROM pg_prepared_xacts
postgres-> ORDER BY age(transaction) DESC;
 gid | prepared | owner | database | xmin
-----+----------+-------+----------+------
(0 rows)

Checked for long-running queries on the replica side, but haven't found any.

postgres=> show hot_standby_feedback ;
 hot_stan
Re: oldest xmin is far in the past :: BUT xmin is not available in system
Other details:

postgres=> select version();
                                                   version
--------------------------------------------------------------------------------------------------------------
 PostgreSQL 13.5 on aarch64-unknown-linux-gnu, compiled by aarch64-unknown-linux-gnu-gcc (GCC) 7.4.0, 64-bit
(1 row)

postgres=> select aurora_version();
 aurora_version
----------------
 13.5.1
(1 row)

postgres=> \l+
                                                          List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   |   Size    | Tablespace |                 Description
-----------+----------+----------+-------------+-------------+-----------------------+-----------+------------+---------------------------------------------
 Postgres  | root     | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =Tc/root             +| 361 GB    | pg_default |
           |          |          |             |             | root=CTc/root        +|           |            |
           |          |          |             |             | pmm=CTc/root          |           |            |
 Test      | root     | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                       | 8391 kB   | pg_default | default administrative connection database
 rdsadmin  | rdsadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | rdsadmin=CTc/rdsadmin | No Access | pg_default |
 template0 | rdsadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/rdsadmin          +| 16 MB     | pg_default | unmodifiable empty database
           |          |          |             |             | rdsadmin=CTc/rdsadmin |           |            |
 template1 | root     | UTF8     | en_US.UTF-8 | en_US.UTF-8 | root=CTc/root        +| 8215 kB   | pg_default | default template for new databases
           |          |          |             |             | =c/root               |           |            |
(5 rows)

Executing vacuum on the entire cluster is also giving the same HINTs and WARNINGs:

WARNING: oldest xmin is far in the past
HINT: Close open transactions soon to avoid wraparound problems.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
WARNING: oldest xmin is far in the past
HINT: Close open transactions soon to avoid wraparound problems.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
WARNING: oldest xmin is far in the past
HINT: Close open transactions soon to avoid wraparound problems.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
WARNING: oldest xmin is far in the past
HINT: Close open transactions soon to avoid wraparound problems.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
WARNING: oldest xmin is far in the past
HINT: Close open transactions soon to avoid wraparound problems.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
WARNING: oldest xmin is far in the past
HINT: Close open transactions soon to avoid wraparound problems.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
WARNING: oldest xmin is far in the past
HINT: Close open transactions soon to avoid wraparound problems.
You might also need to commit or roll back old prepared transactions, or drop stale replication slots.
WARNING: oldest xmin is far in the past

Regards,
BK

On Tue, Apr 19, 2022 at 11:36 AM David G. Johnston <david.g.johns...@gmail.com> wrote:

> On Monday, April 18, 2022, bhargav kamineni wrote:
>
>> Hi Team,
>>
>> It seems vacuum is behaving somewhat weird on postgres database,
>> observing below HINTS on the vacuum logs
>>
>> WARNING: oldest xmin is far in the past
>>
>> HINT: Close open transactions soon to avoid wraparound problems.
>>
>> You might also need to commit or roll back old prepared transactions, or
>> drop stale replication slots.
>
> What version?
>
> What other databases are present?
>
> Others can give better (more detailed/nuanced) guidance but if you can
> just start vacuuming every table in every database manually, you probably
> should just do that. Vacuum freeze specifically.
>
> David J.
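Not from the thread, but one more hedged check worth running: the pg_stat_activity query above only looks at backend_xmin, while the xmin-horizon calculation also considers each backend's own backend_xid, and it has to be run against every database in the cluster (David's point about other databases), not just the current one:

postgres=> SELECT pid, datname, usename, state,
                  backend_xid, backend_xmin, xact_start
           FROM pg_stat_activity
           WHERE backend_xid IS NOT NULL OR backend_xmin IS NOT NULL
           ORDER BY greatest(age(backend_xid), age(backend_xmin)) DESC;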
Tool for comparing parameters
Hi Team,

Is there any open-source or paid tool that compares PostgreSQL database parameters (NOT schema)?

Thanks
Bhargav K
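Not a dedicated tool, but a hedged do-it-yourself sketch: dump pg_settings from each server to a file and diff them (hostnames and filenames are placeholders):

psql -h server1 -At -c "SELECT name || '=' || setting FROM pg_settings ORDER BY name" > server1_params.txt
psql -h server2 -At -c "SELECT name || '=' || setting FROM pg_settings ORDER BY name" > server2_params.txt
diff server1_params.txt server2_params.txt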
order of reading the conf files
Hi,

May I know the order in which postgres reads the configuration files (postgresql.conf, postgresql.auto.conf, pg_hba.conf)? Also, how does the postmaster fork backend postgres processes, and can we see that forking in the logfile?

Thanks,
Banu.
Re: order of reading the conf files
Thanks a lot Stephen.

On Thu, Dec 6, 2018 at 8:53 PM Stephen Frost wrote:

> Greetings,
>
> * Thomas Kellerer (spam_ea...@gmx.net) wrote:
> > Stephen Frost schrieb am 06.12.2018 um 15:52:
> > > The regular postgresql.conf file is read first, then
> > > postgresql.auto.conf and then pg_hba.conf and pg_ident.conf. We can't
> > > read pg_hba.conf/pg_ident.conf before reading postgresql.conf and
> > > postgresql.auto.conf because their location is specified in
> > > postgresql.conf/postgresql.auto.conf.
> >
> > When are the .conf files read that are included from within "postgresql.conf"?
> > The manual is not clear about that.
>
> At the time we hit the 'include' line.
>
> > Are they processed before "postgresql.auto.conf" or after?
>
> postgresql.auto.conf is always last.
>
> Thanks!
>
> Stephen
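A hedged way to confirm that ordering on a running server: pg_settings records which file and line each non-default value was finally taken from, so a parameter set in both postgresql.conf and postgresql.auto.conf should show the auto.conf location here:

postgres=# SELECT name, setting, sourcefile, sourceline
           FROM pg_settings
           WHERE source = 'configuration file'
           ORDER BY sourcefile, sourceline;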
Temp tables
Hi,

What happens when we create and insert/update data in TEMP tables? Does that data really get written at the disk level or only at the buffer level, and what happens to this data after the transaction completes?

Thanks
Banu
Capacity Planning
Hi Team,

What are the checklists for doing capacity planning for a PostgreSQL server?

Regards,
BK.
Re: Capacity Planning
Thank you Allan.

On Sun, Jan 20, 2019 at 1:59 PM Ron wrote:

> IOW, the same as every other RDBMS...
>
> On 1/20/19 1:29 AM, Allan Kamau wrote:
>
> Some of the aspects you may want to consider are:
> 1) Per unit time (day or week), how much data will be persisted to the
> database, number of records and total size of the data.
> 2) How much of these data will be updated and how frequently in a given
> time unit (day or week).
> 3) Will the databases hosted on the server be for OLTP (OnLine
> Transactional Processing) or OLAP (OnLine Analytical Processing) or a
> combination of the two.
> 4) In case of hardware failure or electrical outage or outage due to other
> factors including human activity, how much loss of data (and time) can be
> tolerated, see "https://www.postgresql.org/docs/11/backup.html" and
> "https://www.postgresql.org/docs/11/high-availability.html".
> 5) The size of the budget for hardware and manpower.
>
> Allan.
>
> On Sat, Jan 19, 2019 at 1:38 PM bhargav kamineni wrote:
>
>> Hi Team,
>>
>> What are the checklists for doing capacity planning for a PostgreSQL server?
>>
>> Regards,
>> BK.
>
> --
> Angular momentum makes the world go 'round.
log_min_duration_statement
Hi Team,

I have set log_min_duration_statement=1000 in my configuration, but when I observe the log file I can see that queries which ran below 1000 ms are also getting logged, for instance:

duration: 0.089 ms
duration: 0.175 ms
duration: 0.139 ms
duration: 0.451 ms
duration: 0.136 ms
duration: 0.340 ms
duration: 0.150 ms
duration: 0.440 ms
duration: 0.338 ms
duration: 0.156 ms
duration: 0.396 ms
duration: 0.221 ms
duration: 0.161 ms

These are my log settings:

              name               |                      setting
---------------------------------+---------------------------------------------------
 log_autovacuum_min_duration     | 0
 log_checkpoints                 | on
 log_connections                 | on
 log_destination                 | syslog
 log_directory                   | log
 log_disconnections              | on
 log_duration                    | off
 log_error_verbosity             | default
 log_executor_stats              | off
 log_file_mode                   | 0600
 log_filename                    | postgresql-%Y-%m-%d_%H%M%S.log
 log_hostname                    | off
 log_line_prefix                 | appname=%a,user=%u,db=%d,pid=%p,txid=%x,vtxid=%v
 log_lock_waits                  | on
 log_min_duration_statement      | 1000
 log_min_error_statement         | error
 log_min_messages                | error
 log_parser_stats                | off
 log_planner_stats               | off
 log_replication_commands        | off
 log_rotation_age                | 1440
 log_rotation_size               | 10240
 log_statement                   | all
 log_statement_stats             | off
 log_temp_files                  | 0
 log_timezone                    | Navajo
 log_truncate_on_rotation        | off
 logging_collector               | on
 max_logical_replication_workers | 4
 syslog_facility                 | local4
 syslog_ident                    | postgres
 syslog_sequence_numbers         | on
 syslog_split_messages           | on
 wal_log_hints                   | off

Regards,
Bhargav K
Re: log_min_duration_statement
> Hi Team,,
>
> I have set log_min_duration_statement=1000 as my configuration but when
> i observe log file i could see queries that ran below 1000ms are also
> getting logged , for instance
> duration: 0.089 ms
> duration: 0.175 ms
> duration: 0.139 ms
> duration: 0.451 ms
> duration: 0.136 ms
> duration: 0.340 ms
> duration: 0.150 ms
> duration: 0.440 ms
> duration: 0.338 ms
> duration: 0.156 ms
> duration: 0.396 ms
> duration: 0.221 ms
> duration: 0.161 ms

> The above would not happen to be autovacuum statements?

Nope, those are not autovacuum-related logs; they are related to queries only.

> You have log_autovacuum_min_duration = 0 below.

On Thu, 24 Jan 2019 at 20:53, Adrian Klaver wrote:

> On 1/24/19 6:23 AM, bhargav kamineni wrote:
> > Hi Team,,
> >
> > I have set log_min_duration_statement=1000 as my configuration but when
> > i observe log file i could see queries that ran below 1000ms are also
> > getting logged , for instance
> > duration: 0.089 ms
> > duration: 0.175 ms
> > duration: 0.139 ms
> > duration: 0.451 ms
> > duration: 0.136 ms
> > duration: 0.340 ms
> > duration: 0.150 ms
> > duration: 0.440 ms
> > duration: 0.338 ms
> > duration: 0.156 ms
> > duration: 0.396 ms
> > duration: 0.221 ms
> > duration: 0.161 ms
>
> The above would not happen to be autovacuum statements?
>
> You have log_autovacuum_min_duration = 0 below.
>
> > these are my logsettings
> >
> >               name               |                      setting
> > ---------------------------------+---------------------------------------------------
> >  log_autovacuum_min_duration     | 0
> >  log_checkpoints                 | on
> >  log_connections                 | on
> >  log_destination                 | syslog
> >  log_directory                   | log
> >  log_disconnections              | on
> >  log_duration                    | off
> >  log_error_verbosity             | default
> >  log_executor_stats              | off
> >  log_file_mode                   | 0600
> >  log_filename                    | postgresql-%Y-%m-%d_%H%M%S.log
> >  log_hostname                    | off
> >  log_line_prefix                 | appname=%a,user=%u,db=%d,pid=%p,txid=%x,vtxid=%v
> >  log_lock_waits                  | on
> >  log_min_duration_statement      | 1000
> >  log_min_error_statement         | error
> >  log_min_messages                | error
> >  log_parser_stats                | off
> >  log_planner_stats               | off
> >  log_replication_commands        | off
> >  log_rotation_age                | 1440
> >  log_rotation_size               | 10240
> >  log_statement                   | all
> >  log_statement_stats             | off
> >  log_temp_files                  | 0
> >  log_timezone                    | Navajo
> >  log_truncate_on_rotation        | off
> >  logging_collector               | on
> >  max_logical_replication_workers | 4
> >  syslog_facility                 | local4
> >  syslog_ident                    | postgres
> >  syslog_sequence_numbers         | on
> >  syslog_split_messages           | on
> >  wal_log_hints                   | off
> >
> > Regards,
> > Bhargav K
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
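A hedged follow-up check, not something the thread excerpt reaches: per-database or per-role overrides made with ALTER DATABASE/ALTER ROLE ... SET can change logging behaviour for some sessions only, so it is worth confirming the effective values and their source in the very sessions whose statements are being logged:

postgres=# SELECT name, setting, source
           FROM pg_settings
           WHERE name IN ('log_min_duration_statement', 'log_duration', 'log_statement');
postgres=# SELECT * FROM pg_db_role_setting;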