pg_upgrade vs. logical replication

2024-12-09 Thread Joe Wildish
We maintain c.50 logical replicas.  Typically the producer version is 12 or 13,
and the subscriber version is 14.  We intend to upgrade the subscribers to 15
using pg_upgrade.  However, we ran into an unexpected problem with that
approach.  I couldn't find much being mentioned about it on the web, so I'm
sending this message for a sanity check that what we are seeing is expected,
and, that the proposed solution make sense.

The problem we see is that after running pg_upgrade on the subscriber, details
about the subscription are lost; specifically, pg_subscription_rel is empty, and
the associated replication origin has no LSN information in
pg_replication_origin_status.

I found a thread on the hackers list that seems to call out this problem:

https://www.postgresql.org/message-id/flat/20230217075433.u5mjly4d5cr4hcfe%40jrouhaud

It is a very long thread.  TLDR; Julien describes a similar problem to what we
have (although in our case, it is a "pure" logical replica with no consideration
about switching between physical replication, or needing to cut over, etc), and
a fix has gone in to PG17 that sorts this problem.

However, we can't go to 17 yet, so need a solution for 15 and 16. We are doing
the following sequence of events:

(1)  Shutdown subscriber;
(2)  Startup subscriber with max_logical_replication_workers set to 0;
(3)  Copy the data in pg_replication_origin_status to a table;
(4)  Shutdown subscriber;
(5)  Run pg_upgrade;
(6)  Startup subscriber with max_logical_replication_workers set to 0;
(7)  Apply the LSNs recorded in (3) to the appropriate origin using
 pg_replication_origin_advance;
(8)  ENABLE, REFRESH PUBLICATION WITH (COPY_DATA=FALSE), & DISABLE the
 subscription;
(9)  Shutdown subscriber;
(10) Startup subscriber normally;
(11) ENABLE subscription.

Step 7 requires a mapping from old origin name to new origin name as the names
are derived from subscription OIDs.  Steps 2 and 6 are required to ensure no
activity occurs that could advance LSNs.

Unlike in the mentioned thread, we know we won't have to deal with tables that
are partially replicated i.e we know all tables will be 'r'.

We have done some testing with this and it seems to work.  We run a repeated
INSERT on the producer several times a second, and notice that after step 11,
we have all expected rows arrive on the subscriber without gaps.  However, I
thought I'd ping this list for a sanity check that what we are doing makes
sense.  And, if so, perhaps it'll be useful to other people in a similar
situation.

Comments / suggestions welcome.

-Joe




Re: pg_upgrade vs. logical replication

2024-12-09 Thread Greg Sabino Mullane
On Mon, Dec 9, 2024 at 6:43 AM Joe Wildish  wrote:
Overall, your solution seems okay, but:


> a fix has gone in to PG17 that sorts this problem.
>


> However, we can't go to 17 yet, so need a solution for 15 and 16.


Honestly, this would seem like a really, really strong reason to push for
v17.

Cheers,
Greg


Re: Empty query_id in pg_stat_activity

2024-12-09 Thread Michael Paquier
On Fri, Dec 06, 2024 at 04:44:29PM +0100, Erik Wienhold wrote:
> Another possibility is that the session just disabled compute_query_id:
> https://postgr.es/m/472115375.225506.1683812791906%40office.mailbox.org

Or possibly this uses a path where we're not aggressive enough the
query ID while we should know it (there are slight cases where we
could finish without one, like some PL contexts).

If you could provide more details for the reproduction of the problem
that does not involve benchbase, that would save time.
--
Michael


signature.asc
Description: PGP signature


Cancelled query due to buffer deadlock with recovery

2024-12-09 Thread Will Storey
Hello!

I am running Postgres 16.6 on a primary and several standby servers. I
recently observed 3 queries being cancelled with this error:

ERROR: canceling statement due to conflict with recovery (SQLSTATE 40P01)

There was this detail with the error:

User transaction caused buffer deadlock with recovery

The standbys have hot_standby_feedback=on and
max_standby_streaming_delay=1min.

As far as I can tell, there were no long running transactions at the time
on either the primary or the standby in question. I did not see any
replication delay according to monitoring.

The queries that were cancelled were reading catalog tables - pg_inherits
joined with pg_class. I typically expect them to complete quickly (they
have a 100ms statement timeout).

I understand that canceling due to conflicts should be expected when using
hot standbys, but I'm struggling to understand why it happened in this
case.

What do you think?

Thank you!

Will




RE: tds_fdw DB-Library error: DB #: 20002, DB Msg: Adaptive Server connection failed - RESOLVED

2024-12-09 Thread Zacher, Stacy
Thank you again for your help on this issue. 

After looking into the lack of TLS 1.2 support, we were able to figure out what 
was going on .   
I ran  the tsql with TDSDUMP as recommended: 

TDSDUMP=stdout tsql -H mysqlservername.domain.net -p 1477 -U 'Someusername' -P 
'xx'
And received useful information, especially:

tls.c:567:handshake failed: One of the involved algorithms has insufficient 
security level.
login.c:670:login packet rejected
Error 20002 (severity 9):
Adaptive Server connection failed
util.c:363:tdserror: client library returned TDS_INT_CANCEL(2)
util.c:386:tdserror: returning TDS_INT_CANCEL(2)
mem.c:665:tds_free_all_results()
There was a problem connecting to the server

Per my sysadmin " The root of the cause is that RedHat9 and its derivatives 
have permanently retired older non-secure cryptographic policies."
Basically we had to set tds version to 7.0 for the older crypto policies and 
that specific server  -- in the /etc/freetds.conf file
e.g. 
# rocky 9.
[mysqlservername.domain.net]
host = mysqlservername.domain.net
port = 1477
tds version = 7.0

and obviously the better solution is to upgrade the SQL Server to a newer 
version. 
Thank you, 
Stacy 




-Original Message-
From: Zacher, Stacy 
Sent: Friday, December 6, 2024 4:11 PM
To: Adrian Klaver ; 
pgsql-general@lists.postgresql.org
Subject: RE: tds_fdw DB-Library error: DB #: 20002, DB Msg: Adaptive Server 
connection failed

Hi Adrian:

Please see my replies below:** 

Thank you!
Stacy
-Original Message-
From: Adrian Klaver 
Sent: Friday, December 6, 2024 3:51 PM
To: Zacher, Stacy ; pgsql-general@lists.postgresql.org
Subject: Re: tds_fdw DB-Library error: DB #: 20002, DB Msg: Adaptive Server 
connection failed

ATTENTION: This email originated from a sender outside of MCW. Use caution when 
clicking on links or opening attachments.


On 12/6/24 13:19, Zacher, Stacy wrote:
> Hello:
>
> We recently did an OS upgrade from Rocky Linux 8 to 9 on a Postgres
> v13 server that has 2 Foreign data wrappers that use tds_fdw for 
> connections to 2 different SQL Servers.
>
> After the OS upgrade, one of the FDWs stopped working for some reason 
> but the other one still works.
>
> The error message when trying to run the import statement in psql is 
> as follows (I changed some of the names):
>
> using psql:
>

> I tested connecting to the server/port via ncat and it returns 
> successfully.
>
> When I try sqlcmd or tsql, The 1477 port is static (vs. dynamic)
>
> [postgres@rh9server bin]$ ./sqlcmd -S 141.xxx.x.xxx,1477 -C -d 
> Pdata_db -U PData_Reader -P xx
>
> Sqlcmd: Error: Microsoft ODBC Driver 18 for SQL Server : TCP Provider:
> Error code 0x2746.

Some searching mentions that this 'TCP Provider: Error code 0x2746.' can be 
related to lack of TLS 1.2 support.

I don't see your tsql connection attempt.

** Here is the tsql attempt:   
tsql -H mysqlserver.somedomain.net -p 1477 -U 'PData_Reader' -P 'xx'
locale is "en_US.UTF-8"
locale charset is "UTF-8"
using default charset "UTF-8"
Error 20002 (severity 9):
Adaptive Server connection failed Error 20002 (severity 9):
Adaptive Server connection failed There was a problem connecting to the 
server

> The FDW that is still working is going to a SQL Server v14.x 2017 
> server and the broken one is going to a SQL Server version 13 2016 server.

Are the SQL Server instances running on the same machine?
** No, they are two different servers.  

If not what versions of Windows are they using in each case?
** The broken FDW to the SQL server 2016 is running on Windows 2012 R2
** The one that works is running on is running SQL Server 2017 on Windows 
Server 2016 Datacenter

>
> Also, I recreated the FDW that is broken on another server (Postgres
> 16 on Rocky Linux 8) and it works there.
>
> So I'm puzzled here as to why it's not working on the upgraded server, 
> yet the other FDW on the same server still works.
>
> Any help on this issue would be greatly appreciated.
>
> Thank you,
>
> Stacy
>
> szac...@mcw.edu
>

--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Errors when restoring backup created by pg_dumpall

2024-12-09 Thread PopeRigby

On 12/7/24 11:58, David G. Johnston wrote:

On Sat, Dec 7, 2024 at 12:25 PM PopeRigby  wrote:


It actually looks like setting those all to have public fixed all the
errors, including the one with lldap. So, how can I get it to not put
public there automatically for next time?


I assume you mean "get it to put public there" (i.e., the "not" is a typo)

You cannot. The security team has decided to not permit an opt-in 
bypass of the lock-downs implemented to fix CVE-2018-1058.


Your only real choice at the moment is to replace the function call in 
the generated expression with a custom function and in that custom 
function's create function command attach a "set search_path to 
public" clause.  That will prevent inlining and also ensure the public 
schema is in the search_path when executing the public.ll_to_earth 
function call.  With that in place the empty search_path in the dump 
file will no longer matter.


David J.

Yeah, that was a typo. It seems weird that this behavior would be broken 
by default though, is there anything that could fix it upstream?


Re: Errors when restoring backup created by pg_dumpall

2024-12-09 Thread David G. Johnston
On Mon, Dec 9, 2024 at 3:14 PM PopeRigby  wrote:

> On 12/7/24 11:58, David G. Johnston wrote:
>
> On Sat, Dec 7, 2024 at 12:25 PM PopeRigby  wrote:
>
>>
>> It actually looks like setting those all to have public fixed all the
>> errors, including the one with lldap. So, how can I get it to not put
>> public there automatically for next time?
>>
>>
> I assume you mean "get it to put public there" (i.e., the "not" is a typo)
>
> You cannot.  The security team has decided to not permit an opt-in bypass
> of the lock-downs implemented to fix CVE-2018-1058.
>
> Your only real choice at the moment is to replace the function call in the
> generated expression with a custom function and in that custom function's
> create function command attach a "set search_path to public" clause.  That
> will prevent inlining and also ensure the public schema is in the
> search_path when executing the public.ll_to_earth function call.  With that
> in place the empty search_path in the dump file will no longer matter.
>
> Yeah, that was a typo. It seems weird that this behavior would be broken
> by default though, is there anything that could fix it upstream?
>

You saw and tried the work being done "upstream" to fix the situation.
It's a big knot in the system and it isn't easy (or highly motivated) to
untangle unfortunately...

David J.


Re: Errors when restoring backup created by pg_dumpall

2024-12-09 Thread Tom Lane
Adrian Klaver  writes:
> You could file an issue here:
> https://www.postgresql.org/account/login/?next=/account/submitbug/
> Ask if the developers could use the mechanisms available here:
> https://www.postgresql.org/docs/current/extend-extensions.html#EXTEND-EXTENSIONS-RELOCATION

This wouldn't really move the needle, since (a) there is not some
magic group of people that will become involved who aren't already
and (b) the requested fix seems exactly the same as the work
already in progress at [1].

What would help is to figure out why the proposed patch didn't
seem to work for you.  I continue to suspect that you didn't
really install the updated extension, but it's unclear.

regards, tom lane

[1] https://www.postgresql.org/message-id/flat/3395418.1618352...@sss.pgh.pa.us




Re: Errors when restoring backup created by pg_dumpall

2024-12-09 Thread PopeRigby

On 12/9/24 15:23, Tom Lane wrote:

Adrian Klaver  writes:

You could file an issue here:
https://www.postgresql.org/account/login/?next=/account/submitbug/
Ask if the developers could use the mechanisms available here:
https://www.postgresql.org/docs/current/extend-extensions.html#EXTEND-EXTENSIONS-RELOCATION

This wouldn't really move the needle, since (a) there is not some
magic group of people that will become involved who aren't already
and (b) the requested fix seems exactly the same as the work
already in progress at [1].

What would help is to figure out why the proposed patch didn't
seem to work for you.  I continue to suspect that you didn't
really install the updated extension, but it's unclear.

regards, tom lane

[1] https://www.postgresql.org/message-id/flat/3395418.1618352...@sss.pgh.pa.us


I compiled and installed PostgreSQL with the patch, and restarted it, so 
unless there are other steps I'm not sure what I could have done wrong.






Re: Errors when restoring backup created by pg_dumpall

2024-12-09 Thread Adrian Klaver

On 12/9/24 20:54, Ron Johnson wrote:
On Mon, Dec 9, 2024 at 11:24 PM David G. Johnston 
mailto:david.g.johns...@gmail.com>> wrote:


On Monday, December 9, 2024, Ron Johnson mailto:ronljohnso...@gmail.com>> wrote:

On Sat, Nov 30, 2024 at 10:36 PM Adrian Klaver
mailto:adrian.kla...@aklaver.com>>
wrote:
[snip]

In future schema qualify all references.

For now in the dump file you could search for

SELECT pg_catalog.set_config('search_path', '', false);

and set to

SELECT pg_catalog.set_config('search_path', 'public', false);


What if this had been a pg_dump --format={custom,directory} backup?


pg_restore has a mode where it can dump out SQL to a script instead
of directly restoring to the database.


That Would Be Very, Very Bad if this were a database big enough to have 
required a multi-threaded dump.


From the original post:

https://www.postgresql.org/message-id/6a6439f1-8039-44e2-8fb9-59028f7f2014%40mailbox.org

"My HDD recently failed so I'm trying to restore my backup, but I'm
running into some errors."

There really was no choice.

Not tested but in the pg_restore case I could see at least trying:

1) pg_restore -s  -f schema_definitions.sql  custom_format_file

2) Then making the change in the search_path in schema_definitions.sql 
and then load the schema in the database using psql


3) Then pg_restore -a -j  custom_format_file.



--
Death to , and butter sauce.
Don't boil me, I'm still alive.
 lobster!


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Errors when restoring backup created by pg_dumpall

2024-12-09 Thread Adrian Klaver

On 12/9/24 15:30, PopeRigby wrote:

On 12/9/24 15:23, Tom Lane wrote:

Adrian Klaver  writes:

You could file an issue here:
https://www.postgresql.org/account/login/?next=/account/submitbug/
Ask if the developers could use the mechanisms available here:
https://www.postgresql.org/docs/current/extend-extensions.html#EXTEND-EXTENSIONS-RELOCATION

This wouldn't really move the needle, since (a) there is not some
magic group of people that will become involved who aren't already
and (b) the requested fix seems exactly the same as the work
already in progress at [1].

What would help is to figure out why the proposed patch didn't
seem to work for you.  I continue to suspect that you didn't
really install the updated extension, but it's unclear.

    regards, tom lane

[1] 
https://www.postgresql.org/message-id/flat/3395418.1618352...@sss.pgh.pa.us


I compiled and installed PostgreSQL with the patch, and restarted it, so 
unless there are other steps I'm not sure what I could have done wrong.





What was the source of the patch?

--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Errors when restoring backup created by pg_dumpall

2024-12-09 Thread Ron Johnson
On Sat, Nov 30, 2024 at 10:36 PM Adrian Klaver 
wrote:
[snip]

> In future schema qualify all references.
>
> For now in the dump file you could search for
>
> SELECT pg_catalog.set_config('search_path', '', false);
>
> and set to
>
> SELECT pg_catalog.set_config('search_path', 'public', false);
>

What if this had been a pg_dump --format={custom,directory} backup?

-- 
Death to , and butter sauce.
Don't boil me, I'm still alive.
 lobster!


Re: Errors when restoring backup created by pg_dumpall

2024-12-09 Thread David G. Johnston
On Monday, December 9, 2024, Ron Johnson  wrote:

> On Sat, Nov 30, 2024 at 10:36 PM Adrian Klaver 
> wrote:
> [snip]
>
>> In future schema qualify all references.
>>
>> For now in the dump file you could search for
>>
>> SELECT pg_catalog.set_config('search_path', '', false);
>>
>> and set to
>>
>> SELECT pg_catalog.set_config('search_path', 'public', false);
>>
>
> What if this had been a pg_dump --format={custom,directory} backup?
>

pg_restore has a mode where it can dump out SQL to a script instead of
directly restoring to the database.

David J.


Re: Errors when restoring backup created by pg_dumpall

2024-12-09 Thread Ron Johnson
On Mon, Dec 9, 2024 at 11:24 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Monday, December 9, 2024, Ron Johnson  wrote:
>
>> On Sat, Nov 30, 2024 at 10:36 PM Adrian Klaver 
>> wrote:
>> [snip]
>>
>>> In future schema qualify all references.
>>>
>>> For now in the dump file you could search for
>>>
>>> SELECT pg_catalog.set_config('search_path', '', false);
>>>
>>> and set to
>>>
>>> SELECT pg_catalog.set_config('search_path', 'public', false);
>>>
>>
>> What if this had been a pg_dump --format={custom,directory} backup?
>>
>
> pg_restore has a mode where it can dump out SQL to a script instead of
> directly restoring to the database.
>

That Would Be Very, Very Bad if this were a database big enough to have
required a multi-threaded dump.

-- 
Death to , and butter sauce.
Don't boil me, I'm still alive.
 lobster!


Re: Errors when restoring backup created by pg_dumpall

2024-12-09 Thread Adrian Klaver

On 12/9/24 15:23, Tom Lane wrote:

Adrian Klaver  writes:

You could file an issue here:
https://www.postgresql.org/account/login/?next=/account/submitbug/
Ask if the developers could use the mechanisms available here:
https://www.postgresql.org/docs/current/extend-extensions.html#EXTEND-EXTENSIONS-RELOCATION


This wouldn't really move the needle, since (a) there is not some
magic group of people that will become involved who aren't already
and (b) the requested fix seems exactly the same as the work
already in progress at [1].


I made the suggestion to the OP, my mistake. Somewhere I missed that 
work was being done on this.




What would help is to figure out why the proposed patch didn't
seem to work for you.  I continue to suspect that you didn't
really install the updated extension, but it's unclea


Are you referring to this?:

https://www.postgresql.org/message-id/3316564.aeNJFYEL58@aivenlaptop



regards, tom lane

[1] https://www.postgresql.org/message-id/flat/3395418.1618352...@sss.pgh.pa.us


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Errors when restoring backup created by pg_dumpall

2024-12-09 Thread PopeRigby

On 12/9/24 14:47, Adrian Klaver wrote:

On 12/9/24 14:14, PopeRigby wrote:

On 12/7/24 11:58, David G. Johnston wrote:
On Sat, Dec 7, 2024 at 12:25 PM PopeRigby  
wrote:



    It actually looks like setting those all to have public fixed 
all the
    errors, including the one with lldap. So, how can I get it to 
not put

    public there automatically for next time?


I assume you mean "get it to put public there" (i.e., the "not" is a 
typo)


You cannot. The security team has decided to not permit an opt-in 
bypass of the lock-downs implemented to fix CVE-2018-1058.


Your only real choice at the moment is to replace the function call 
in the generated expression with a custom function and in that 
custom function's create function command attach a "set search_path 
to public" clause.  That will prevent inlining and also ensure the 
public schema is in the search_path when executing the 
public.ll_to_earth function call.  With that in place the empty 
search_path in the dump file will no longer matter.


David J.

Yeah, that was a typo. It seems weird that this behavior would be 
broken by default though, is there anything that could fix it upstream?




You could file an issue here:

https://www.postgresql.org/account/login/?next=/account/submitbug/

Ask if the developers could use the mechanisms available here:

https://www.postgresql.org/docs/current/extend-extensions.html#EXTEND-EXTENSIONS-RELOCATION 



to schema qualify the objects in the extension.

Not sure if that will fly or not, but it is worth a shot.


Will do!





Re: Errors when restoring backup created by pg_dumpall

2024-12-09 Thread PopeRigby

On 12/9/24 14:31, David G. Johnston wrote:

On Mon, Dec 9, 2024 at 3:14 PM PopeRigby  wrote:

On 12/7/24 11:58, David G. Johnston wrote:

On Sat, Dec 7, 2024 at 12:25 PM PopeRigby 
wrote:


It actually looks like setting those all to have public fixed
all the
errors, including the one with lldap. So, how can I get it to
not put
public there automatically for next time?


I assume you mean "get it to put public there" (i.e., the "not"
is a typo)

You cannot.  The security team has decided to not permit an
opt-in bypass of the lock-downs implemented to fix CVE-2018-1058.

Your only real choice at the moment is to replace the function
call in the generated expression with a custom function and in
that custom function's create function command attach a "set
search_path to public" clause.  That will prevent inlining and
also ensure the public schema is in the search_path when
executing the public.ll_to_earth function call.  With that in
place the empty search_path in the dump file will no longer matter.


Yeah, that was a typo. It seems weird that this behavior would be
broken by default though, is there anything that could fix it
upstream?


You saw and tried the work being done "upstream" to fix the situation. 
It's a big knot in the system and it isn't easy (or highly motivated) 
to untangle unfortunately...


David J.


Understood. Well, at least it was a fairly easy fix. Thanks for the help :)


Re: Errors when restoring backup created by pg_dumpall

2024-12-09 Thread Adrian Klaver

On 12/9/24 14:14, PopeRigby wrote:

On 12/7/24 11:58, David G. Johnston wrote:

On Sat, Dec 7, 2024 at 12:25 PM PopeRigby  wrote:


It actually looks like setting those all to have public fixed all the
errors, including the one with lldap. So, how can I get it to not put
public there automatically for next time?


I assume you mean "get it to put public there" (i.e., the "not" is a typo)

You cannot. The security team has decided to not permit an opt-in 
bypass of the lock-downs implemented to fix CVE-2018-1058.


Your only real choice at the moment is to replace the function call in 
the generated expression with a custom function and in that custom 
function's create function command attach a "set search_path to 
public" clause.  That will prevent inlining and also ensure the public 
schema is in the search_path when executing the public.ll_to_earth 
function call.  With that in place the empty search_path in the dump 
file will no longer matter.


David J.

Yeah, that was a typo. It seems weird that this behavior would be broken 
by default though, is there anything that could fix it upstream?




You could file an issue here:

https://www.postgresql.org/account/login/?next=/account/submitbug/

Ask if the developers could use the mechanisms available here:

https://www.postgresql.org/docs/current/extend-extensions.html#EXTEND-EXTENSIONS-RELOCATION

to schema qualify the objects in the extension.

Not sure if that will fly or not, but it is worth a shot.

--
Adrian Klaver
adrian.kla...@aklaver.com