pg_upgrade vs. logical replication
We maintain c.50 logical replicas. Typically the producer version is 12 or 13, and the subscriber version is 14. We intend to upgrade the subscribers to 15 using pg_upgrade. However, we ran into an unexpected problem with that approach. I couldn't find much mentioned about it on the web, so I'm sending this message for a sanity check that what we are seeing is expected and that the proposed solution makes sense.

The problem we see is that after running pg_upgrade on the subscriber, details about the subscription are lost; specifically, pg_subscription_rel is empty, and the associated replication origin has no LSN information in pg_replication_origin_status.

I found a thread on the hackers list that seems to call out this problem: https://www.postgresql.org/message-id/flat/20230217075433.u5mjly4d5cr4hcfe%40jrouhaud

It is a very long thread. TL;DR: Julien describes a similar problem to ours (although in our case it is a "pure" logical replica, with no consideration of switching between physical replication, needing to cut over, etc.), and a fix has gone in to PG17 that sorts this problem. However, we can't go to 17 yet, so need a solution for 15 and 16.

We are doing the following sequence of events:

(1) Shut down the subscriber;
(2) Start the subscriber with max_logical_replication_workers set to 0;
(3) Copy the data in pg_replication_origin_status to a table;
(4) Shut down the subscriber;
(5) Run pg_upgrade;
(6) Start the subscriber with max_logical_replication_workers set to 0;
(7) Apply the LSNs recorded in (3) to the appropriate origin using pg_replication_origin_advance;
(8) ENABLE, REFRESH PUBLICATION WITH (COPY_DATA=FALSE), and DISABLE the subscription;
(9) Shut down the subscriber;
(10) Start the subscriber normally;
(11) ENABLE the subscription.

Step 7 requires a mapping from old origin name to new origin name, as the names are derived from subscription OIDs. Steps 2 and 6 are required to ensure no activity occurs that could advance LSNs. Unlike in the mentioned thread, we know we won't have to deal with tables that are partially replicated, i.e. we know all tables will be in state 'r'.

We have done some testing with this and it seems to work. We run a repeated INSERT on the producer several times a second, and after step 11 all expected rows arrive on the subscriber without gaps. However, I thought I'd ping this list for a sanity check that what we are doing makes sense. And, if so, perhaps it'll be useful to other people in a similar situation.

Comments / suggestions welcome.

-Joe
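A minimal sketch of the SQL behind steps 3, 7 and 8, assuming a single subscription named my_sub and a scratch table saved_origin_status (both names are hypothetical), and assuming the new subscription's replication origin already exists after the upgrade (if not, create it first with pg_replication_origin_create):

  -- Step 3, before pg_upgrade: snapshot origin positions keyed by subscription
  -- name, because origin names ('pg_<subscription oid>') change across the upgrade.
  CREATE TABLE saved_origin_status AS
  SELECT s.subname, r.remote_lsn
  FROM pg_replication_origin_status r
  JOIN pg_subscription s ON r.external_id = 'pg_' || s.oid;

  -- Step 7, after pg_upgrade: advance each new origin to the saved LSN.
  SELECT pg_replication_origin_advance('pg_' || s.oid, o.remote_lsn)
  FROM pg_subscription s
  JOIN saved_origin_status o USING (subname);

  -- Step 8: re-attach to the publication without re-copying data.
  ALTER SUBSCRIPTION my_sub ENABLE;
  ALTER SUBSCRIPTION my_sub REFRESH PUBLICATION WITH (copy_data = false);
  ALTER SUBSCRIPTION my_sub DISABLE;

Joining on subname handles the old-to-new origin name mapping mentioned in step 7, since subscription names survive the upgrade even though their OIDs do not.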
Re: pg_upgrade vs. logical replication
On Mon, Dec 9, 2024 at 6:43 AM Joe Wildish wrote:

Overall, your solution seems okay, but:

> a fix has gone in to PG17 that sorts this problem.
>
> However, we can't go to 17 yet, so need a solution for 15 and 16.

Honestly, this would seem like a really, really strong reason to push for v17.

Cheers,
Greg
Re: Empty query_id in pg_stat_activity
On Fri, Dec 06, 2024 at 04:44:29PM +0100, Erik Wienhold wrote:

> Another possibility is that the session just disabled compute_query_id:
> https://postgr.es/m/472115375.225506.1683812791906%40office.mailbox.org

Or possibly this hits a path where we are not aggressive enough about reporting the query ID even though we should know it (there are a few cases where we could finish without one, like some PL contexts). If you could provide more details for reproducing the problem that do not involve benchbase, that would save time.

--
Michael
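A quick way to narrow this down (a sketch, not tied to benchbase) is to look at the session setting and the live view together; query_id is only populated when compute_query_id is on, or auto with a module such as pg_stat_statements loaded:

  -- Is query ID computation enabled for this session?
  SHOW compute_query_id;

  -- Which backends currently expose a query ID?
  SELECT pid, usename, state, query_id, left(query, 60) AS query
  FROM pg_stat_activity
  WHERE backend_type = 'client backend';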
Cancelled query due to buffer deadlock with recovery
Hello! I am running Postgres 16.6 on a primary and several standby servers. I recently observed 3 queries being cancelled with this error: ERROR: canceling statement due to conflict with recovery (SQLSTATE 40P01) There was this detail with the error: User transaction caused buffer deadlock with recovery The standbys have hot_standby_feedback=on and max_standby_streaming_delay=1min. As far as I can tell, there were no long running transactions at the time on either the primary or the standby in question. I did not see any replication delay according to monitoring. The queries that were cancelled were reading catalog tables - pg_inherits joined with pg_class. I typically expect them to complete quickly (they have a 100ms statement timeout). I understand that canceling due to conflicts should be expected when using hot standbys, but I'm struggling to understand why it happened in this case. What do you think? Thank you! Will
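One way to confirm which kind of recovery conflict is cancelling queries is the cumulative per-database counters on the standby. A sketch, to be run on the standby that reported the error:

  -- Counts of queries cancelled on this standby, per conflict type.
  SELECT datname, confl_bufferpin, confl_deadlock, confl_snapshot, confl_lock
  FROM pg_stat_database_conflicts
  WHERE datname = current_database();

A rising confl_deadlock or confl_bufferpin count points at buffer-pin conflicts rather than snapshot conflicts, and hot_standby_feedback only helps with the latter.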
RE: tds_fdw DB-Library error: DB #: 20002, DB Msg: Adaptive Server connection failed - RESOLVED
Thank you again for your help on this issue. After looking into the lack of TLS 1.2 support, we were able to figure out what was going on. I ran tsql with TDSDUMP as recommended:

TDSDUMP=stdout tsql -H mysqlservername.domain.net -p 1477 -U 'Someusername' -P 'xx'

And received useful information, especially:

tls.c:567:handshake failed: One of the involved algorithms has insufficient security level.
login.c:670:login packet rejected
Error 20002 (severity 9): Adaptive Server connection failed
util.c:363:tdserror: client library returned TDS_INT_CANCEL(2)
util.c:386:tdserror: returning TDS_INT_CANCEL(2)
mem.c:665:tds_free_all_results()
There was a problem connecting to the server

Per my sysadmin: "The root of the cause is that RedHat9 and its derivatives have permanently retired older non-secure cryptographic policies." Basically we had to set the TDS version to 7.0 for the older crypto policies and that specific server in the /etc/freetds.conf file, e.g.:

# rocky 9
[mysqlservername.domain.net]
host = mysqlservername.domain.net
port = 1477
tds version = 7.0

And obviously the better solution is to upgrade the SQL Server to a newer version.

Thank you,
Stacy

-----Original Message-----
From: Zacher, Stacy
Sent: Friday, December 6, 2024 4:11 PM
To: Adrian Klaver ; pgsql-general@lists.postgresql.org
Subject: RE: tds_fdw DB-Library error: DB #: 20002, DB Msg: Adaptive Server connection failed

Hi Adrian:

Please see my replies below (marked **).

Thank you!
Stacy

-----Original Message-----
From: Adrian Klaver
Sent: Friday, December 6, 2024 3:51 PM
To: Zacher, Stacy ; pgsql-general@lists.postgresql.org
Subject: Re: tds_fdw DB-Library error: DB #: 20002, DB Msg: Adaptive Server connection failed

On 12/6/24 13:19, Zacher, Stacy wrote:
> Hello:
>
> We recently did an OS upgrade from Rocky Linux 8 to 9 on a Postgres
> v13 server that has 2 Foreign data wrappers that use tds_fdw for
> connections to 2 different SQL Servers.
>
> After the OS upgrade, one of the FDWs stopped working for some reason
> but the other one still works.
>
> The error message when trying to run the import statement in psql is
> as follows (I changed some of the names):
>
> using psql:
>
> I tested connecting to the server/port via ncat and it returns
> successfully.
>
> When I try sqlcmd or tsql, The 1477 port is static (vs. dynamic)
>
> [postgres@rh9server bin]$ ./sqlcmd -S 141.xxx.x.xxx,1477 -C -d
> Pdata_db -U PData_Reader -P xx
>
> Sqlcmd: Error: Microsoft ODBC Driver 18 for SQL Server : TCP Provider:
> Error code 0x2746.

Some searching mentions that this 'TCP Provider: Error code 0x2746.' can be related to lack of TLS 1.2 support.

I don't see your tsql connection attempt.

** Here is the tsql attempt:

tsql -H mysqlserver.somedomain.net -p 1477 -U 'PData_Reader' -P 'xx'
locale is "en_US.UTF-8"
locale charset is "UTF-8"
using default charset "UTF-8"
Error 20002 (severity 9): Adaptive Server connection failed
Error 20002 (severity 9): Adaptive Server connection failed
There was a problem connecting to the server

> The FDW that is still working is going to a SQL Server v14.x 2017
> server and the broken one is going to a SQL Server version 13 2016 server.

Are the SQL Server instances running on the same machine?

** No, they are two different servers.

If not, what versions of Windows are they using in each case?

** The broken FDW to the SQL Server 2016 is running on Windows 2012 R2.
** The one that works is running SQL Server 2017 on Windows Server 2016 Datacenter.

> Also, I recreated the FDW that is broken on another server (Postgres
> 16 on Rocky Linux 8) and it works there.
>
> So I'm puzzled here as to why it's not working on the upgraded server,
> yet the other FDW on the same server still works.
>
> Any help on this issue would be greatly appreciated.
>
> Thank you,
>
> Stacy
>
> szac...@mcw.edu

--
Adrian Klaver
adrian.kla...@aklaver.com
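For what it's worth, tds_fdw's README also lists a tds_version foreign-server option, so the same protocol pin can be applied on the Postgres side rather than in /etc/freetds.conf. This is a sketch only; the server and database names are placeholders, and the option name should be checked against the documentation of the installed tds_fdw version:

  -- Pin the TDS protocol version on the foreign server itself.
  CREATE SERVER sqlserver_2016
    FOREIGN DATA WRAPPER tds_fdw
    OPTIONS (servername 'mysqlservername.domain.net', port '1477',
             database 'Pdata_db', tds_version '7.0');

  -- Or, for an existing server definition:
  -- ALTER SERVER sqlserver_2016 OPTIONS (ADD tds_version '7.0');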
Re: Errors when restoring backup created by pg_dumpall
On 12/7/24 11:58, David G. Johnston wrote:
> On Sat, Dec 7, 2024 at 12:25 PM PopeRigby wrote:
>> It actually looks like setting those all to have public fixed all the
>> errors, including the one with lldap. So, how can I get it to not put
>> public there automatically for next time?
>
> I assume you mean "get it to put public there" (i.e., the "not" is a typo)
>
> You cannot. The security team has decided to not permit an opt-in bypass
> of the lock-downs implemented to fix CVE-2018-1058.
>
> Your only real choice at the moment is to replace the function call in the
> generated expression with a custom function and in that custom function's
> create function command attach a "set search_path to public" clause. That
> will prevent inlining and also ensure the public schema is in the
> search_path when executing the public.ll_to_earth function call. With that
> in place the empty search_path in the dump file will no longer matter.
>
> David J.

Yeah, that was a typo. It seems weird that this behavior would be broken by default though. Is there anything that could fix it upstream?
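A minimal sketch of the wrapper David describes, assuming cube/earthdistance are installed in public and the generated expression calls ll_to_earth (the wrapper name is made up; the IMMUTABLE/PARALLEL SAFE markers are assumptions, only needed if the expression feeds an index or generated column):

  -- SET search_path pins name resolution inside the body and also prevents
  -- inlining, so the empty search_path used during restore no longer matters.
  CREATE FUNCTION public.ll_to_earth_wrapped(lat float8, lon float8)
  RETURNS public.earth
  IMMUTABLE PARALLEL SAFE
  SET search_path = public
  LANGUAGE sql
  AS $$ SELECT public.ll_to_earth(lat, lon) $$;

The generated column or index expression would then call public.ll_to_earth_wrapped(...) instead of public.ll_to_earth(...).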
Re: Errors when restoring backup created by pg_dumpall
On Mon, Dec 9, 2024 at 3:14 PM PopeRigby wrote: > On 12/7/24 11:58, David G. Johnston wrote: > > On Sat, Dec 7, 2024 at 12:25 PM PopeRigby wrote: > >> >> It actually looks like setting those all to have public fixed all the >> errors, including the one with lldap. So, how can I get it to not put >> public there automatically for next time? >> >> > I assume you mean "get it to put public there" (i.e., the "not" is a typo) > > You cannot. The security team has decided to not permit an opt-in bypass > of the lock-downs implemented to fix CVE-2018-1058. > > Your only real choice at the moment is to replace the function call in the > generated expression with a custom function and in that custom function's > create function command attach a "set search_path to public" clause. That > will prevent inlining and also ensure the public schema is in the > search_path when executing the public.ll_to_earth function call. With that > in place the empty search_path in the dump file will no longer matter. > > Yeah, that was a typo. It seems weird that this behavior would be broken > by default though, is there anything that could fix it upstream? > You saw and tried the work being done "upstream" to fix the situation. It's a big knot in the system and it isn't easy (or highly motivated) to untangle unfortunately... David J.
Re: Errors when restoring backup created by pg_dumpall
Adrian Klaver writes: > You could file an issue here: > https://www.postgresql.org/account/login/?next=/account/submitbug/ > Ask if the developers could use the mechanisms available here: > https://www.postgresql.org/docs/current/extend-extensions.html#EXTEND-EXTENSIONS-RELOCATION This wouldn't really move the needle, since (a) there is not some magic group of people that will become involved who aren't already and (b) the requested fix seems exactly the same as the work already in progress at [1]. What would help is to figure out why the proposed patch didn't seem to work for you. I continue to suspect that you didn't really install the updated extension, but it's unclear. regards, tom lane [1] https://www.postgresql.org/message-id/flat/3395418.1618352...@sss.pgh.pa.us
Re: Errors when restoring backup created by pg_dumpall
On 12/9/24 15:23, Tom Lane wrote:
> Adrian Klaver writes:
>> You could file an issue here:
>> https://www.postgresql.org/account/login/?next=/account/submitbug/
>> Ask if the developers could use the mechanisms available here:
>> https://www.postgresql.org/docs/current/extend-extensions.html#EXTEND-EXTENSIONS-RELOCATION
>
> This wouldn't really move the needle, since (a) there is not some magic
> group of people that will become involved who aren't already and (b) the
> requested fix seems exactly the same as the work already in progress at [1].
>
> What would help is to figure out why the proposed patch didn't seem to
> work for you. I continue to suspect that you didn't really install the
> updated extension, but it's unclear.
>
> regards, tom lane
>
> [1] https://www.postgresql.org/message-id/flat/3395418.1618352...@sss.pgh.pa.us

I compiled and installed PostgreSQL with the patch, and restarted it, so unless there are other steps I'm not sure what I could have done wrong.
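One way to check what the server is actually running (a sketch, not from the thread) is to compare the installed extension version with the live definition of ll_to_earth; if the definition still looks like the stock one, the patched extension script probably never reached the installation the server is using (for example, it was installed under a different prefix):

  -- Which extension versions are installed in this database?
  SELECT extname, extversion
  FROM pg_extension
  WHERE extname IN ('cube', 'earthdistance');

  -- The definition the server is actually using.
  SELECT pg_get_functiondef('public.ll_to_earth(float8, float8)'::regprocedure);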
Re: Errors when restoring backup created by pg_dumpall
On 12/9/24 20:54, Ron Johnson wrote:
> On Mon, Dec 9, 2024 at 11:24 PM David G. Johnston <david.g.johns...@gmail.com> wrote:
>> On Monday, December 9, 2024, Ron Johnson <ronljohnso...@gmail.com> wrote:
>>> On Sat, Nov 30, 2024 at 10:36 PM Adrian Klaver <adrian.kla...@aklaver.com> wrote:
>>> [snip]
>>>> In future schema qualify all references.
>>>>
>>>> For now in the dump file you could search for
>>>>
>>>> SELECT pg_catalog.set_config('search_path', '', false);
>>>>
>>>> and set to
>>>>
>>>> SELECT pg_catalog.set_config('search_path', 'public', false);
>>>
>>> What if this had been a pg_dump --format={custom,directory} backup?
>>
>> pg_restore has a mode where it can dump out SQL to a script instead of
>> directly restoring to the database.
>
> That Would Be Very, Very Bad if this were a database big enough to have
> required a multi-threaded dump.

From the original post:

https://www.postgresql.org/message-id/6a6439f1-8039-44e2-8fb9-59028f7f2014%40mailbox.org

"My HDD recently failed so I'm trying to restore my backup, but I'm running into some errors."

There really was no choice.

Not tested, but in the pg_restore case I could see at least trying:

1) pg_restore -s -f schema_definitions.sql custom_format_file
2) Then making the change to the search_path in schema_definitions.sql and loading the schema into the database using psql
3) Then pg_restore -a -j custom_format_file

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: Errors when restoring backup created by pg_dumpall
On 12/9/24 15:30, PopeRigby wrote: On 12/9/24 15:23, Tom Lane wrote: Adrian Klaver writes: You could file an issue here: https://www.postgresql.org/account/login/?next=/account/submitbug/ Ask if the developers could use the mechanisms available here: https://www.postgresql.org/docs/current/extend-extensions.html#EXTEND-EXTENSIONS-RELOCATION This wouldn't really move the needle, since (a) there is not some magic group of people that will become involved who aren't already and (b) the requested fix seems exactly the same as the work already in progress at [1]. What would help is to figure out why the proposed patch didn't seem to work for you. I continue to suspect that you didn't really install the updated extension, but it's unclear. regards, tom lane [1] https://www.postgresql.org/message-id/flat/3395418.1618352...@sss.pgh.pa.us I compiled and installed PostgreSQL with the patch, and restarted it, so unless there are other steps I'm not sure what I could have done wrong. What was the source of the patch? -- Adrian Klaver adrian.kla...@aklaver.com
Re: Errors when restoring backup created by pg_dumpall
On Sat, Nov 30, 2024 at 10:36 PM Adrian Klaver wrote: [snip] > In future schema qualify all references. > > For now in the dump file you could search for > > SELECT pg_catalog.set_config('search_path', '', false); > > and set to > > SELECT pg_catalog.set_config('search_path', 'public', false); > What if this had been a pg_dump --format={custom,directory} backup? -- Death to , and butter sauce. Don't boil me, I'm still alive. lobster!
Re: Errors when restoring backup created by pg_dumpall
On Monday, December 9, 2024, Ron Johnson wrote: > On Sat, Nov 30, 2024 at 10:36 PM Adrian Klaver > wrote: > [snip] > >> In future schema qualify all references. >> >> For now in the dump file you could search for >> >> SELECT pg_catalog.set_config('search_path', '', false); >> >> and set to >> >> SELECT pg_catalog.set_config('search_path', 'public', false); >> > > What if this had been a pg_dump --format={custom,directory} backup? > pg_restore has a mode where it can dump out SQL to a script instead of directly restoring to the database. David J.
Re: Errors when restoring backup created by pg_dumpall
On Mon, Dec 9, 2024 at 11:24 PM David G. Johnston < david.g.johns...@gmail.com> wrote: > On Monday, December 9, 2024, Ron Johnson wrote: > >> On Sat, Nov 30, 2024 at 10:36 PM Adrian Klaver >> wrote: >> [snip] >> >>> In future schema qualify all references. >>> >>> For now in the dump file you could search for >>> >>> SELECT pg_catalog.set_config('search_path', '', false); >>> >>> and set to >>> >>> SELECT pg_catalog.set_config('search_path', 'public', false); >>> >> >> What if this had been a pg_dump --format={custom,directory} backup? >> > > pg_restore has a mode where it can dump out SQL to a script instead of > directly restoring to the database. > That Would Be Very, Very Bad if this were a database big enough to have required a multi-threaded dump. -- Death to , and butter sauce. Don't boil me, I'm still alive. lobster!
Re: Errors when restoring backup created by pg_dumpall
On 12/9/24 15:23, Tom Lane wrote:
> Adrian Klaver writes:
>> You could file an issue here:
>> https://www.postgresql.org/account/login/?next=/account/submitbug/
>> Ask if the developers could use the mechanisms available here:
>> https://www.postgresql.org/docs/current/extend-extensions.html#EXTEND-EXTENSIONS-RELOCATION
>
> This wouldn't really move the needle, since (a) there is not some magic
> group of people that will become involved who aren't already and (b) the
> requested fix seems exactly the same as the work already in progress at [1].

I made the suggestion to the OP, my mistake. Somewhere I missed that work was being done on this.

> What would help is to figure out why the proposed patch didn't seem to
> work for you. I continue to suspect that you didn't really install the
> updated extension, but it's unclear.

Are you referring to this?:

https://www.postgresql.org/message-id/3316564.aeNJFYEL58@aivenlaptop

> regards, tom lane
>
> [1] https://www.postgresql.org/message-id/flat/3395418.1618352...@sss.pgh.pa.us

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: Errors when restoring backup created by pg_dumpall
On 12/9/24 14:47, Adrian Klaver wrote: On 12/9/24 14:14, PopeRigby wrote: On 12/7/24 11:58, David G. Johnston wrote: On Sat, Dec 7, 2024 at 12:25 PM PopeRigby wrote: It actually looks like setting those all to have public fixed all the errors, including the one with lldap. So, how can I get it to not put public there automatically for next time? I assume you mean "get it to put public there" (i.e., the "not" is a typo) You cannot. The security team has decided to not permit an opt-in bypass of the lock-downs implemented to fix CVE-2018-1058. Your only real choice at the moment is to replace the function call in the generated expression with a custom function and in that custom function's create function command attach a "set search_path to public" clause. That will prevent inlining and also ensure the public schema is in the search_path when executing the public.ll_to_earth function call. With that in place the empty search_path in the dump file will no longer matter. David J. Yeah, that was a typo. It seems weird that this behavior would be broken by default though, is there anything that could fix it upstream? You could file an issue here: https://www.postgresql.org/account/login/?next=/account/submitbug/ Ask if the developers could use the mechanisms available here: https://www.postgresql.org/docs/current/extend-extensions.html#EXTEND-EXTENSIONS-RELOCATION to schema qualify the objects in the extension. Not sure if that will fly or not, but it is worth a shot. Will do!
Re: Errors when restoring backup created by pg_dumpall
On 12/9/24 14:31, David G. Johnston wrote: On Mon, Dec 9, 2024 at 3:14 PM PopeRigby wrote: On 12/7/24 11:58, David G. Johnston wrote: On Sat, Dec 7, 2024 at 12:25 PM PopeRigby wrote: It actually looks like setting those all to have public fixed all the errors, including the one with lldap. So, how can I get it to not put public there automatically for next time? I assume you mean "get it to put public there" (i.e., the "not" is a typo) You cannot. The security team has decided to not permit an opt-in bypass of the lock-downs implemented to fix CVE-2018-1058. Your only real choice at the moment is to replace the function call in the generated expression with a custom function and in that custom function's create function command attach a "set search_path to public" clause. That will prevent inlining and also ensure the public schema is in the search_path when executing the public.ll_to_earth function call. With that in place the empty search_path in the dump file will no longer matter. Yeah, that was a typo. It seems weird that this behavior would be broken by default though, is there anything that could fix it upstream? You saw and tried the work being done "upstream" to fix the situation. It's a big knot in the system and it isn't easy (or highly motivated) to untangle unfortunately... David J. Understood. Well, at least it was a fairly easy fix. Thanks for the help :)
Re: Errors when restoring backup created by pg_dumpall
On 12/9/24 14:14, PopeRigby wrote: On 12/7/24 11:58, David G. Johnston wrote: On Sat, Dec 7, 2024 at 12:25 PM PopeRigby wrote: It actually looks like setting those all to have public fixed all the errors, including the one with lldap. So, how can I get it to not put public there automatically for next time? I assume you mean "get it to put public there" (i.e., the "not" is a typo) You cannot. The security team has decided to not permit an opt-in bypass of the lock-downs implemented to fix CVE-2018-1058. Your only real choice at the moment is to replace the function call in the generated expression with a custom function and in that custom function's create function command attach a "set search_path to public" clause. That will prevent inlining and also ensure the public schema is in the search_path when executing the public.ll_to_earth function call. With that in place the empty search_path in the dump file will no longer matter. David J. Yeah, that was a typo. It seems weird that this behavior would be broken by default though, is there anything that could fix it upstream? You could file an issue here: https://www.postgresql.org/account/login/?next=/account/submitbug/ Ask if the developers could use the mechanisms available here: https://www.postgresql.org/docs/current/extend-extensions.html#EXTEND-EXTENSIONS-RELOCATION to schema qualify the objects in the extension. Not sure if that will fly or not, but it is worth a shot. -- Adrian Klaver adrian.kla...@aklaver.com