Re: search_path for PL/pgSQL functions partially cached?

2024-12-27 Thread Jan Behrens
On Fri, 27 Dec 2024 13:26:28 -0700
"David G. Johnston"  wrote:

> > Or is it documented somewhere?
> 
> https://www.postgresql.org/docs/current/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING

I can't find any notes regarding functions and schemas in that section.

> Can someone explain to me what's going on, and what is the best practice to
> > deal with it? Is there a way to avoid fully qualifying every type and
> > expression? Which parts do I have to qualify or is this something that
> > could be fixed in a future version of PostgreSQL?
> >
> 
> Add qualification or attach a “set search_path” clause to “create
> function”.  Code stored in the server should not rely on the session
> search_path.
> 
> David J.

In my (real world) case, I was unable to use "SET search_path FROM
CURRENT" because it isn't possible to use "SET" in procedures that use
transactions, due to this documented limitation:

"If a SET clause is attached to a procedure, then that procedure cannot
execute transaction control statements (for example, COMMIT and
ROLLBACK, depending on the language)."

https://www.postgresql.org/docs/17/sql-createprocedure.html

My procedure looks more or less like this:

CREATE PROCEDURE "myfunc"()
  LANGUAGE plpgsql AS
  $$
  DECLARE
"old_search_path" TEXT;
-- some more variables
  BEGIN
SELECT current_setting('search_path') INTO "old_search_path";
SET search_path TO 'myschema';
-- some code that uses COMMIT and SET TRANSACTION ISOLATION LEVEL
PERFORM set_config('search_path', "old_search_path", FALSE);
  END;
  $$;

My question is: Am I safe if I use fully-qualified types in the DECLARE
section only? Or do I need to provide full qualification also in the
code below (after SET search_path TO 'myschema')?

And bonus question: Is it documented somewhere?

Maybe not many people run into these issues because schemas and
functions aren't used as often in combination?

Kind Regards
Jan Behrens




RE: Starting logical replication at arbitrary point that's available in WAL

2024-12-27 Thread Zhijie Hou (Fujitsu)
On Friday, December 27, 2024 7:39 PM Alexander Uvizhev  
wrote:

Hi,

> I'm doing a logical replication using streaming replication protocol and
> I'm trying to start a stream from a certain arbitrary point that's
> available in WAL. However, both CREATE_REPLICATION_SLOT and
> pg_create_logical_replication_slot() create slot with nearly last LSN/XID.
> 
> Is it possible to create a replication slot with arbitrary LSN/XID values?

I think it's not supported to specify arbitrary LSN/XID values for a newly
created slot in core. However, if you have an existing slot with an older 
LSN/XID, you
can copy it using pg_copy_logical_replication_slot, and then advance the copied
slot to your desired position with pg_replication_slot_advance.

> 
> Also pg_create_logical_replication_slot() for some reason gives
> different result than CREATE_REPLICATION_SLOT: new slot's `catalog_xmin`
> is set to the smallest `catalog_xmin` among already existing slots.
> Looks like a bug.

Could you provide a script to reproduce this issue ?
That would be helpful in diagnosing the reason.

Best Regards,
Hou zj


Re: Starting logical replication at arbitrary point that's available in WAL

2024-12-27 Thread Alexander Uvizhev

On 12/27/24 13:20, Zhijie Hou (Fujitsu) wrote:


On Friday, December 27, 2024 7:39 PM Alexander Uvizhev wrote:

Hi,


I'm doing a logical replication using streaming replication protocol and
I'm trying to start a stream from a certain arbitrary point that's
available in WAL. However, both CREATE_REPLICATION_SLOT and
pg_create_logical_replication_slot() create slot with nearly last LSN/XID.

Is it possible to create a replication slot with arbitrary LSN/XID values?

I think it's not supported to specify arbitrary LSN/XID values for a newly
created slot in core. However, if you have an existing slot with an older 
LSN/XID, you
can copy it using pg_copy_logical_replication_slot, and then advance the copied
slot to your desired position with pg_replication_slot_advance.
Thanks for the idea, I've already discovered it and it requires me to 
implement some script to advance that dedicated slot, which I was hoping 
to avoid.

Also pg_create_logical_replication_slot() for some reason gives
different result than CREATE_REPLICATION_SLOT: new slot's `catalog_xmin`
is set to the smallest `catalog_xmin` among already existing slots.
Looks like a bug.

Could you provide a script to reproduce this issue ?
That would be helpful in diagnosing the reason.


Unfortunately, I have no such script. But it looks like this:

test=> select slot_name, catalog_xmin, restart_lsn, confirmed_flush_lsn 
from pg_replication_slots;

  slot_name    | catalog_xmin | restart_lsn | confirmed_flush_lsn
---+--+-+-
 inactive_slot |    10073 | 0/1101DB88  | 0/1101DB88
 active_slot |    42607 | 0/17A87410  | 0/17A87410
 physical      |  | 0/17A875A0  |

test=> select pg_create_logical_replication_slot('test_slot', 
'decoderbufs');

  slot_name    | catalog_xmin | restart_lsn | confirmed_flush_lsn
---+--+-+-
inactive_slot|    10073 | 0/1101DB88  | 0/1101DB88
active_slot |    42607 | 0/17A87410  | 0/17A87410
 physical      |  | 0/17A875D8  |
 test_slot |    10073 | 0/17A875A0  | 0/17A875D8

While using CREATE_REPLICATION_SLOT would give me the same slot with 
`catalog_xmin` = 42607.


Hope this helps.



Best Regards,
Hou zj

--
AU

Re: search_path for PL/pgSQL functions partially cached?

2024-12-27 Thread Pavel Stehule
pá 27. 12. 2024 v 22:03 odesílatel Tom Lane  napsal:

> "David G. Johnston"  writes:
> > It is what it is - and if one is not careful one can end up writing
> > hard-to-understand and possibly buggy code due to the various execution
> > environments and caches involved.
>
> Yeah, I don't see this changing.  The actual answer is that we have
> search_path-aware caching of expressions and query plans within a
> plpgsql function, which is why the call to foo() reacts to the current
> search path.  But the types of plpgsql variables are only looked up
> on the first use (within a session).  Perhaps we ought to work harder
> on that, but it seems like a lot of overhead to add for something that
> will benefit next to nobody.
>
> > I’ve never really understood why “%TYPE’ exists…
>
> Compatibility with Oracle, I imagine.  I agree it's a bizarre feature.
> But you could get the same behavior without %TYPE, just by referencing
> some other type that has different declarations in different schemas.
>

This feature is not bizarre - just the implementation in Postgres is not
fully complete (and I am not sure if it is fixable). PLpgSQL uses plan
cache, but there is nothing similar for types.
It is designed for Oracle where search_path doesn't exist, and where change
of schema invalidates code, and requires recompilation. PL/pgSQL and
Postgres are much more dynamic systems than Oracle. Maybe PL/pgSQL
functions can holds dependency on types, and when any related custom type
is changed, then the cached function can be invalidated. Unfortunately, the
frequent change of search path can kill the performance.


> > Add qualification or attach a “set search_path” clause to “create
> > function”.  Code stored in the server should not rely on the session
> > search_path.
>
> Yeah, adding "set search_path" is recommendable if you don't want to
> think hard about this stuff.
>
> regards, tom lane
>
>
>


Starting logical replication at arbitrary point that's available in WAL

2024-12-27 Thread Alexander Uvizhev

Hi,
I'm doing a logical replication using streaming replication protocol and 
I'm trying to start a stream from a certain arbitrary point that's 
available in WAL. However, both CREATE_REPLICATION_SLOT and 
pg_create_logical_replication_slot() create slot with nearly last LSN/XID.


Is it possible to create a replication slot with arbitrary LSN/XID values?

Also pg_create_logical_replication_slot() for some reason gives 
different result than CREATE_REPLICATION_SLOT: new slot's `catalog_xmin` 
is set to the smallest `catalog_xmin` among already existing slots. 
Looks like a bug.


I'm using PostgreSQL 16.6.

--
AU
||




Re: search_path for PL/pgSQL functions partially cached?

2024-12-27 Thread Pavel Stehule
Hi

pá 27. 12. 2024 v 21:26 odesílatel David G. Johnston <
david.g.johns...@gmail.com> napsal:

> On Friday, December 27, 2024, Jan Behrens  wrote:
>>
>>
>> It seems that it matters *both* how the search_path was set during the
>> *first* invocation of the function within a session *and* how it is set
>> during the actual call of the function. So even if there are just two
>> schemas involved, there are 4 possible outcomes for the "run" function's
>> result ('2.4', '2', '5', and '5.4'). To me, this behavior seems to be
>> somewhat dangerous. Maybe it is even considered a bug?
>
>
> It is what it is - and if one is not careful one can end up writing
> hard-to-understand and possibly buggy code due to the various execution
> environments and caches involved.
>

I think plan cache should be invalidated when search_path is different, but
maybe there is some bug - there are some optimizations related to faster
execution of simple expressions.


> I’ve never really understood why “%TYPE’ exists…
>

referenced types should increase readability - it ensures type
compatibility - minimally on oracle, where the change of schema requires
recompilation. In Postgres it is working on 99% - plpgsql functions don't
hold dependency on types.


>
>> Or is it documented somewhere?
>
>
>
> https://www.postgresql.org/docs/current/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING
>
> Can someone explain to me what's going on, and what is the best practice
>> to deal with it? Is there a way to avoid fully qualifying every type and
>> expression? Which parts do I have to qualify or is this something that
>> could be fixed in a future version of PostgreSQL?
>>
>
> Add qualification or attach a “set search_path” clause to “create
> function”.  Code stored in the server should not rely on the session
> search_path.
>

a lot of functionality in Postgres depends on the search path - and then
all should be consistent. Sure, writing procedures that depend on the
current search path can be a short way to hell.

I cannot to reproduce it

CREATE OR REPLACE FUNCTION s1.fx1()
 RETURNS integer
 LANGUAGE plpgsql
AS $function$
begin
  return 100;
end
$function$

CREATE OR REPLACE FUNCTION s2.fx1()
 RETURNS integer
 LANGUAGE plpgsql
AS $function$
begin
  return 200;
end
$function$

CREATE OR REPLACE FUNCTION public.foo()
 RETURNS void
 LANGUAGE plpgsql
AS $function$
declare v int;
begin v := fx1();
  raise notice '%', v;
end;
$function$

(2024-12-27 21:53:13) postgres=# set search_path to s1;
SET
(2024-12-27 21:53:34) postgres=# select public.foo();
NOTICE:  100
┌─┐
│ foo │
╞═╡
│ │
└─┘
(1 row)

(2024-12-27 21:53:44) postgres=# set search_path to s2;
SET
(2024-12-27 21:53:47) postgres=# select public.foo();
NOTICE:  200
┌─┐
│ foo │
╞═╡
│ │
└─┘
(1 row)

(2024-12-27 21:53:48) postgres=# set search_path to s1;
SET
(2024-12-27 21:53:51) postgres=# select public.foo();
NOTICE:  100
┌─┐
│ foo │
╞═╡
│ │
└─┘
(1 row)

so from my perspective is pg ok, tested on pg16 and pg18






> David J.
>
>


Re: search_path for PL/pgSQL functions partially cached?

2024-12-27 Thread Adrian Klaver

On 12/27/24 12:26, David G. Johnston wrote:
On Friday, December 27, 2024, Jan Behrens > wrote:



It seems that it matters *both* how the search_path was set during
the *first* invocation of the function within a session *and* how it
is set during the actual call of the function. So even if there are
just two schemas involved, there are 4 possible outcomes for the
"run" function's result ('2.4', '2', '5', and '5.4'). To me, this
behavior seems to be somewhat dangerous. Maybe it is even considered
a bug?


It is what it is - and if one is not careful one can end up writing 
hard-to-understand and possibly buggy code due to the various execution 
environments and caches involved.


I’ve never really understood why “%TYPE’ exists…


Per:

https://www.postgresql.org/docs/current/plpgsql-declarations.html#PLPGSQL-DECLARATION-TYPE

"By using %TYPE you don't need to know the data type of the structure 
you are referencing, and most importantly, if the data type of the 
referenced item changes in the future (for instance: you change the type 
of user_id from integer to real), you might not need to change your 
function definition.


%TYPE is particularly valuable in polymorphic functions, since the data 
types needed for internal variables can change from one call to the 
next. Appropriate variables can be created by applying %TYPE to the 
function's arguments or result placeholders."


The second case I can buy, the first I am not so sure of. It seems to me 
the first case it can be 'solved' by the second case.






Or is it documented somewhere? 



https://www.postgresql.org/docs/current/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING
 


Can someone explain to me what's going on, and what is the best
practice to deal with it? Is there a way to avoid fully qualifying
every type and expression? Which parts do I have to qualify or is
this something that could be fixed in a future version of PostgreSQL?


Add qualification or attach a “set search_path” clause to “create 
function”.  Code stored in the server should not rely on the session 
search_path.


David J.



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: search_path for PL/pgSQL functions partially cached?

2024-12-27 Thread Tom Lane
"David G. Johnston"  writes:
> It is what it is - and if one is not careful one can end up writing
> hard-to-understand and possibly buggy code due to the various execution
> environments and caches involved.

Yeah, I don't see this changing.  The actual answer is that we have
search_path-aware caching of expressions and query plans within a
plpgsql function, which is why the call to foo() reacts to the current
search path.  But the types of plpgsql variables are only looked up
on the first use (within a session).  Perhaps we ought to work harder
on that, but it seems like a lot of overhead to add for something that
will benefit next to nobody.

> I’ve never really understood why “%TYPE’ exists…

Compatibility with Oracle, I imagine.  I agree it's a bizarre feature.
But you could get the same behavior without %TYPE, just by referencing
some other type that has different declarations in different schemas.

> Add qualification or attach a “set search_path” clause to “create
> function”.  Code stored in the server should not rely on the session
> search_path.

Yeah, adding "set search_path" is recommendable if you don't want to
think hard about this stuff.

regards, tom lane




Re: search_path for PL/pgSQL functions partially cached?

2024-12-27 Thread David G. Johnston
On Friday, December 27, 2024, Jan Behrens  wrote:
>
>
> It seems that it matters *both* how the search_path was set during the
> *first* invocation of the function within a session *and* how it is set
> during the actual call of the function. So even if there are just two
> schemas involved, there are 4 possible outcomes for the "run" function's
> result ('2.4', '2', '5', and '5.4'). To me, this behavior seems to be
> somewhat dangerous. Maybe it is even considered a bug?


It is what it is - and if one is not careful one can end up writing
hard-to-understand and possibly buggy code due to the various execution
environments and caches involved.

I’ve never really understood why “%TYPE’ exists…


> Or is it documented somewhere?



https://www.postgresql.org/docs/current/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING

Can someone explain to me what's going on, and what is the best practice to
> deal with it? Is there a way to avoid fully qualifying every type and
> expression? Which parts do I have to qualify or is this something that
> could be fixed in a future version of PostgreSQL?
>

Add qualification or attach a “set search_path” clause to “create
function”.  Code stored in the server should not rely on the session
search_path.

David J.


search_path for PL/pgSQL functions partially cached?

2024-12-27 Thread Jan Behrens
Hello,

I'm experiencing some weird issues when running the following code in a psql 
session:



CREATE TABLE "tbl" ("col" NUMERIC(15, 0));

CREATE FUNCTION "foo"() RETURNS TEXT LANGUAGE plpgsql AS $$
  BEGIN
RETURN '2.4';
  END;
$$;

BEGIN;

CREATE SCHEMA "myschema";
SET LOCAL search_path TO 'myschema';

CREATE TABLE "tbl" ("col" NUMERIC);

CREATE FUNCTION "foo"() RETURNS TEXT LANGUAGE plpgsql AS $$
  BEGIN
RETURN '5.4';
  END;
$$;

CREATE FUNCTION "run"() RETURNS TEXT LANGUAGE plpgsql AS $$
  DECLARE
"variable" "tbl"."col"%TYPE;
  BEGIN
"variable" := "foo"();
RETURN "variable";
  END;
$$;

COMMIT;

SELECT "myschema"."run"(); -- returns '2.4' (when run in the same session)

-- reconnect to database here:
\c

SELECT "myschema"."run"(); -- returns '2'
SET search_path TO 'myschema';
SELECT "myschema"."run"(); -- returns '5'

-- reconnect to database again:
\c

SET search_path TO 'myschema';
SELECT "myschema"."run"(); -- returns '5.4'
SET search_path TO 'public';
SELECT "myschema"."run"(); -- returns '2.4' again



I'm using PostgreSQL verison 16.4.

Is this the expected behavior? If yes, where is this documented? If no, what 
would be the expected behavior?

Of course, I could fix this by fully qualifying the table name "tbl" in the 
function. Nonetheless, I'm not really sure what's going on here.

It seems that it matters *both* how the search_path was set during the *first* 
invocation of the function within a session *and* how it is set during the 
actual call of the function. So even if there are just two schemas involved, 
there are 4 possible outcomes for the "run" function's result ('2.4', '2', '5', 
and '5.4'). To me, this behavior seems to be somewhat dangerous. Maybe it is 
even considered a bug? Or is it documented somewhere? I remember running into 
some problems like that in the past already, but unfortunately, I don't 
remember details.

I suppose this is because there is some caching mechanism in place. But 
apparently it only caches the "tbl"."col"%TYPE and not the "foo"() function 
call expression. Can someone explain to me what's going on, and what is the 
best practice to deal with it? Is there a way to avoid fully qualifying every 
type and expression? Which parts do I have to qualify or is this something that 
could be fixed in a future version of PostgreSQL?

Many thanks and kind regards,
Jan Behrens




Re: search_path for PL/pgSQL functions partially cached?

2024-12-27 Thread Pavel Stehule
Hi


> Maybe not many people run into these issues because schemas and
> functions aren't used as often in combination?
>

I think schema and functions are common combinations. But when people have
objects with the same name, then they are careful to be sure, so objects
have really identical structure.
Using different types in these objects is very rare. And because Postgres
doesn't support it well, experienced developers don't use it. Similar
issues can do some issues after an stored procedures update, because can
require session reset. Or when you need it, you can use a much more dynamic
type like record.


> Kind Regards
> Jan Behrens
>
>
>