Re: Unexpected results from a query with UNION ALL

2025-01-04 Thread Tom Lane
Andrey  writes:
> Recently I got unexpected results from a query that seems to be legit.
> ...
> ... Once I commit the concurrent query and
> release the lock, I get this:

> -- result 2
>child_id
> --
>  5cb82ceb-c5ef-4c59-a02e-f7b610470f8c
>  5cb82ceb-c5ef-4c59-a02e-f7b610470f8c
> (2 rows)

> but I would expect to get the same result as previously. Is it a bug
> or am I doing something wrong here?

It's a surprising result for sure, but I believe it's explained by
the algorithm for READ COMMITTED [1], specifically the bit about

The search condition of the command (the WHERE clause) is
re-evaluated to see if the updated version of the row still matches the
search condition.  If so, the second updater proceeds with its operation
using the updated version of the row.  

Once the tuple lock is released, the join query fetches the new
version of the parents row (with the new revision value, though that's
not actually relevant to the result).  It then effectively re-executes
the join against the UNION construct, and that means it'll always find
the first matching row in "children".  The "updated version of the
row" is taken to mean the entire join row, so it doesn't blink at the
fact that it got a different child output than it had started with.

Another way to look at this is that locking only "p" underspecifies
the query result: there's more than one child row that could join
to the "p" row, and the system doesn't promise that you get a result
from any particular one of them.

If you try to fix it by also locking the UNION result, or by adding
FOR UPDATE to the UNION arm that selects from "children", you get

ERROR:  FOR UPDATE is not allowed with UNION/INTERSECT/EXCEPT

This example makes me feel that we've missed some cases where we
probably ought to throw that error.  Or else work harder on making
the combination be supported --- but it looks tricky to produce
consistent results, and there have been few complaints about this
omission so far.

In the meantime, the most recommendable answer for you is probably
to switch over to using SERIALIZABLE mode.  That'd require adding
application logic to retry after a serialization failure, but it
would produce consistent results even for complex queries.

regards, tom lane

[1] 
https://www.postgresql.org/docs/current/transaction-iso.html#XACT-READ-COMMITTED




Re: name difference of rpm packages of postgresql-private-devel-16.6.z

2025-01-04 Thread Laurenz Albe
On Mon, 2024-12-30 at 16:53 +0530, Ravi Dave wrote:
> Out of curiosity, I wanted to know the meaning of "private" in 
> postgresql-private-devel-16.6.
> rpm package name. Ideally, I saw postgresql-devel-* rpm packages, so I want 
> to know if there
> is any specific reason for he word private in the name of rpm.

The "private" package was not downloaded from the PostgreSQL download page,
so we don't know anything about it.  Probably somebody packaged it for -
well - private use.

Yours,
Laurenz Albe




Re: search_path for PL/pgSQL functions partially cached?

2025-01-04 Thread David G. Johnston
On Saturday, January 4, 2025, Jan Behrens  wrote:
>
>
> Even if
>
> DECLARE "variable" "tbl"."col"%TYPE;
>
> follows *after* the schema is set to "myschema" in the example above, I
> still get differing results, depending on how the search_path was set
> when the function was first called.
>
> I think this has to do with the fact that the overall structure and
> probably types(?) are parsed first?
>

I concur that this dynamic doesn’t seem to be discussed.  Namely that in
the presence of nested blocks the parse phase resolves placeholders for all
declared variables without executing any expressions in the body of the
function; therefore all types will be resolved seeing the same search_path,
namely that of the calling session or established using SET.  Changing the
search_path within an outer function body block will not affect
declarations within an inner block. (I am not sure whether the for-loop
cases are exceptional in this.)

David J.


name difference of rpm packages of postgresql-private-devel-16.6.z

2025-01-04 Thread Ravi Dave
Hello,

Out of curiosity, I wanted to know the meaning of "private"
in postgresql-private-devel-16.6. rpm package name. Ideally, I saw
postgresql-devel-* rpm packages, so I want to know if there is any specific
reason for he word private in the name of rpm.

--
Regards,
Ravi Dave


Re: To uninstall or not to uninstall that is...

2025-01-04 Thread Adrian Klaver

On 1/4/25 03:08, Arbol One wrote:

Hello.
In my Debian 12 machine, I have PostgreSQL-15 installed; I'd like to 
upgrade to PostgreSQL-16. Should I first uninstalled PostgreSQL-15 and 
then install PostgreSQL-16 or is there a way to just do an upgrade?


Do you want to run both versions concurrently or do you want to move 
from 15 --> 16 and then drop the 15 instance?




Thanks.

--
*/ArbolOne ™/*


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: search_path for PL/pgSQL functions partially cached?

2025-01-04 Thread David G. Johnston
On Saturday, January 4, 2025, Jan Behrens  wrote:

>
> CREATE FUNCTION "foo_impl"("query_p" TEXT, "search_path_p" TEXT)
>   RETURNS "some_type"
>   LANGUAGE plpgsql SET search_path FROM CURRENT AS $$
> DECLARE
>   "old_search_path" TEXT;
>   "result" "some_type";
> BEGIN
>   "old_search_path" = current_setting('search_path');
>   PERFORM set_config('search_path', "search_path_p", TRUE);
>   EXECUTE "query_p" INTO "result";
>   PERFORM set_config('search_path', "old_search_path", TRUE);
>   RETURN "result";
> END;
>   $$;
>

You might consider adding a polymorphic argument for the result type.  Then
if you call the function with two different typed inputs it will be cached
once for each.

“ Likewise, functions having polymorphic argument types have a separate
statement cache for each combination of actual argument types they have
been invoked for, so that data type differences do not cause unexpected
failures.”

David J.


Re: search_path for PL/pgSQL functions partially cached?

2025-01-04 Thread David G. Johnston
On Saturday, January 4, 2025, Jan Behrens  wrote:

>
> I re-read section 41.11.2. on Plan Caching:
>
> "The PL/pgSQL interpreter parses the function's source text and
> produces an internal binary instruction tree the first time the
> function is called (within each session). The instruction tree fully
> translates the PL/pgSQL statement structure,
>

The type of a plpgsql variable is by definition its structure; established
in a statement, so this is actually covered by that paragraph.  But I would
be for adding a bit more specific terminology here.

David J.


Re: To uninstall or not to uninstall that is...

2025-01-04 Thread sivapostg...@yahoo.com
 I've installed PostgreSQL 12 and PostgreSQL 15 side-by-side in my Ubuntu 
Linux.  Both works fine till date.  
On Saturday 4 January, 2025 at 04:39:30 pm IST, Arbol One 
 wrote:  
 
   
Hello.
 In my Debian 12 machine, I have PostgreSQL-15 installed; I'd like to upgrade 
to PostgreSQL-16. Should I first uninstalled PostgreSQL-15 and then install 
PostgreSQL-16 or is there a way to just do an upgrade?
 
 Thanks.
 
 -- 
 ArbolOne ™ 
 Using Fire Fox and Thunderbird. 
 ArbolOne is composed of students and volunteers dedicated to providing free 
services to charitable organizations. 
 ArbolOne's development on Java, PostgreSQL, HTML and Jakarta EE is in progress 
[ í ]

Re: To uninstall or not to uninstall that is...

2025-01-04 Thread Peter J. Holzer
On 2025-01-04 06:08:58 -0500, Arbol One wrote:
> In my Debian 12 machine, I have PostgreSQL-15 installed; I'd like to upgrade 
> to
> PostgreSQL-16. Should I first uninstalled PostgreSQL-15 and then install
> PostgreSQL-16 or is there a way to just do an upgrade?

How did you install it? Package from the Debian repo, package from the
PGDG repo or something else?

The packages from the repos are designed to be installed in parallel and
upgraded via pg_upgrade. There is a utility script "pg_upgradecluster"
to aid in this.

So the typical procedure is 

1) Install new version
2) Drop the new (empty) database
3) Invoke pg_upgradecluster (see man-page for details)
4) Check that everything is ok
5) Drop old database and uninstall old version.

hp

-- 
   _  | Peter J. Holzer| Story must make more sense than reality.
|_|_) ||
| |   | h...@hjp.at |-- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |   challenge!"


signature.asc
Description: PGP signature


Re: search_path for PL/pgSQL functions partially cached?

2025-01-04 Thread Jan Behrens
On Fri, 3 Jan 2025 18:36:13 -0500
Isaac Morland  wrote:

> On Fri, 3 Jan 2025 at 18:22, Jan Behrens  wrote:
> 
> 
> > Instead, I plan to expect the function to receive a query string that
> > will get the data that is being processed by the function.
> >
> > That query string should be allowed to refer to tables in the
> > search_path at the caller's side.
> >
> > Therefore, I cannot use the "SET search_path FROM CURRENT" in my
> > "CREATE FUNCTION" statement, because it would overwrite the current
> > search_path on each call of the function.
> >
> 
>  I wonder if it would help if EXECUTE took an optional search_path to use
> while executing the query.

That wouldn't solve my problem, because the function that includes the
EXECUTE still needs to know the search_path set on the caller side.

This only works if I omit the "SET search_path FROM CURRENT" option in
the function's definition OR if I pass a search_path as an argument. I
guess I could write a wrapper:



BEGIN;

CREATE SCHEMA "some_schema";
SET LOCAL search_path TO "some_schema";

CREATE TYPE "some_type" AS ("name" TEXT, "height" INT8);

CREATE FUNCTION "foo_impl"("query_p" TEXT, "search_path_p" TEXT)
  RETURNS "some_type"
  LANGUAGE plpgsql SET search_path FROM CURRENT AS $$
DECLARE
  "old_search_path" TEXT;
  "result" "some_type";
BEGIN
  "old_search_path" = current_setting('search_path');
  PERFORM set_config('search_path', "search_path_p", TRUE);
  EXECUTE "query_p" INTO "result";
  PERFORM set_config('search_path', "old_search_path", TRUE);
  RETURN "result";
END;
  $$;

CREATE FUNCTION "foo"("query_p" TEXT)
  RETURNS "some_type"
  RETURN "foo_impl"("query_p", current_setting('search_path'));

COMMIT;

CREATE TABLE "tbl" ("id" SERIAL8, "name" TEXT, "height" INT8);
INSERT INTO "tbl" ("name", "height") VALUES ('Alice', 200);

SELECT * FROM "some_schema"."foo"('SELECT "name" FROM "tbl"');



Not sure which variant (this or my previous attempt) is better and if
either is safe/correct.

Regards,
Jan Behrens




To uninstall or not to uninstall that is...

2025-01-04 Thread Arbol One

Hello.
In my Debian 12 machine, I have PostgreSQL-15 installed; I'd like to 
upgrade to PostgreSQL-16. Should I first uninstalled PostgreSQL-15 and 
then install PostgreSQL-16 or is there a way to just do an upgrade?


Thanks.

--
*/ArbolOne ™/*
Using Fire Fox and Thunderbird.
ArbolOne is composed of students and volunteers dedicated to providing 
free services to charitable organizations.
ArbolOne's development on Java, PostgreSQL, HTML and Jakarta EE is in 
progress [ í ]

Unexpected results from a query with UNION ALL

2025-01-04 Thread Andrey
Hi everyone,

Recently I got unexpected results from a query that seems to be legit.
The setup is like this:

-- setup
CREATE TABLE parents
(
id   uuidnot null primary key,
revision integer not null
);
CREATE TABLE children
(
iduuidnot null primary key,
parent_id uuidnot null references parents
);
INSERT INTO parents (id, revision)
VALUES ('ec422e09-55bb-4465-a990-31f59859959d', 1);
INSERT INTO children (id, parent_id)
VALUES ('5cb82ceb-c5ef-4c59-a02e-f7b610470f8c',
'ec422e09-55bb-4465-a990-31f59859959d');
INSERT INTO children (id, parent_id)
VALUES ('ce5b22b0-c6c4-4c09-826c-7086c53ee9ec',
'ec422e09-55bb-4465-a990-31f59859959d');

The query is:

-- query
SELECT children_union.id AS child_id
FROM parents p
 JOIN (SELECT id, parent_id
FROM children
UNION ALL
SELECT null::uuid, null::uuid
WHERE false) children_union ON
children_union.parent_id = p.id
WHERE p.id = 'ec422e09-55bb-4465-a990-31f59859959d'
FOR UPDATE OF p;

It looks weird, but it's just a simplification of a much bigger query.
The 'SELECT null::uuid, null::uuid WHERE false' part was actually more
meaningful but I substituted it with a query that returns 0 rows after
finding out that it's irrelevant.

If I just run this query I get something that I would expect to get:

-- result 1
   child_id
--
 5cb82ceb-c5ef-4c59-a02e-f7b610470f8c
 ce5b22b0-c6c4-4c09-826c-7086c53ee9ec
(2 rows)

But if I lock the single row in the parents table:

-- concurrent query
BEGIN;
UPDATE parents
SET revision = revision + 1
WHERE id = 'ec422e09-55bb-4465-a990-31f59859959d';

and then run my query again in a separate session, then it's waiting
for the lock to be released. Once I commit the concurrent query and
release the lock, I get this:

-- result 2
   child_id
--
 5cb82ceb-c5ef-4c59-a02e-f7b610470f8c
 5cb82ceb-c5ef-4c59-a02e-f7b610470f8c
(2 rows)

but I would expect to get the same result as previously. Is it a bug
or am I doing something wrong here?

Thank you,
Andrii


Re: search_path for PL/pgSQL functions partially cached?

2025-01-04 Thread Adrian Klaver

On 1/3/25 15:22, Jan Behrens wrote:

On Fri, 3 Jan 2025 13:56:02 -0800
Adrian Klaver  wrote:


At this point I am lost as to what the overall goal of this is.

Can you provide a 1 ft view if what it is you are trying to achieve?


Sure! I would like to create a component (e.g. a PostgreSQL extension)
that provides a function which processes some complex data, without
making any requirements regarding where the data is stored. To pass
this data to the function, I could use arrays of composite types, but
that seems to be very bulky. Another option would be to use cursors,
but that didn't turn out to work very smooth either.

Instead, I plan to expect the function to receive a query string that
will get the data that is being processed by the function.

That query string should be allowed to refer to tables in the
search_path at the caller's side.

Therefore, I cannot use the "SET search_path FROM CURRENT" in my
"CREATE FUNCTION" statement, because it would overwrite the current
search_path on each call of the function.

Thus my idea is to do this (simplified):

CREATE FUNCTION "some_function" ("query_p" TEXT) RETURNS "some_type"

   
   -- I cannot use SET search_path FROM CURRENT here, because "query_p"
   -- shall refer to tables in the search_path of the caller.
   

   LANGUAGE plpgsql AS $$
 DECLARE
   "old_search_path" TEXT;

   
   -- I have to fully qualify types in the DECLARE section.
   

   "some_variable" "some_schema"."some_type";
 BEGIN
   SELECT current_setting('search_path') INTO "old_search_path";
   PERFORM set_config(
 'search_path',
 'some_schema, pg_temp, ' || "old_search_path",
 TRUE
   );

   
   -- Do I have to fully qualify types and operators from
   -- "myschema" here? Or is it safe to not fully qualify them?
   
 END;
   $$;

That is my overall idea.


Is 'some_schema' a known item when installing?

Once you have the search_path defined and assuming all the objects you 
want are in that path, then yes you can drop the schema qualification.




My problem is that I'm confused about WHEN EXACTLY I have to qualify
tables/types, etc. It is very hard to understand from reading (just) the
documentation.


If you are doing this as an extension then I suspect you want the 
processes shown here:


https://www.postgresql.org/docs/17/extend-extensions.html#EXTEND-EXTENSIONS-RELOCATION




Kind Regards,
Jan Behrens


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: search_path for PL/pgSQL functions partially cached?

2025-01-04 Thread Jan Behrens
On Sat, 4 Jan 2025 09:37:14 -0800
Adrian Klaver  wrote:

> On 1/3/25 15:22, Jan Behrens wrote:
> > On Fri, 3 Jan 2025 13:56:02 -0800
> > Adrian Klaver  wrote:
> > 
> >> At this point I am lost as to what the overall goal of this is.
> >>
> >> Can you provide a 1 ft view if what it is you are trying to achieve?
> > 
> > Sure! I would like to create a component (e.g. a PostgreSQL extension)
> > that provides a function which processes some complex data, without
> > making any requirements regarding where the data is stored. To pass
> > this data to the function, I could use arrays of composite types, but
> > that seems to be very bulky. Another option would be to use cursors,
> > but that didn't turn out to work very smooth either.
> > 
> > Instead, I plan to expect the function to receive a query string that
> > will get the data that is being processed by the function.
> > 
> > That query string should be allowed to refer to tables in the
> > search_path at the caller's side.
> > 
> > Therefore, I cannot use the "SET search_path FROM CURRENT" in my
> > "CREATE FUNCTION" statement, because it would overwrite the current
> > search_path on each call of the function.
> > 
> > Thus my idea is to do this (simplified):
> > 
> > CREATE FUNCTION "some_function" ("query_p" TEXT) RETURNS "some_type"
> > 
> >
> >-- I cannot use SET search_path FROM CURRENT here, because "query_p"
> >-- shall refer to tables in the search_path of the caller.
> >
> > 
> >LANGUAGE plpgsql AS $$
> >  DECLARE
> >"old_search_path" TEXT;
> > 
> >
> >-- I have to fully qualify types in the DECLARE section.
> >
> > 
> >"some_variable" "some_schema"."some_type";
> >  BEGIN
> >SELECT current_setting('search_path') INTO "old_search_path";
> >PERFORM set_config(
> >  'search_path',
> >  'some_schema, pg_temp, ' || "old_search_path",
> >  TRUE
> >);
> > 
> >
> >-- Do I have to fully qualify types and operators from
> >-- "myschema" here? Or is it safe to not fully qualify them?
 (correction: "some_schema")
> >
> >  END;
> >$$;
> > 
> > That is my overall idea.
> 
> Is 'some_schema' a known item when installing?

Yes, fortunately "some_schema" is a fixed name.

> 
> Once you have the search_path defined and assuming all the objects you 
> want are in that path, then yes you can drop the schema qualification.

That would be nice, but it doesn't seem to be the case. At least not
always. I constructed the following new example:



CREATE TABLE "tbl" ("col" NUMERIC(15, 0));

CREATE FUNCTION "foo"() RETURNS TEXT LANGUAGE plpgsql AS $$
  BEGIN
RETURN '2.4';
  END;
$$;

BEGIN;

CREATE SCHEMA "myschema";
SET LOCAL search_path TO 'myschema';

CREATE TABLE "tbl" ("col" NUMERIC);

CREATE FUNCTION "foo"() RETURNS TEXT LANGUAGE plpgsql AS $$
  BEGIN
RETURN '5.4';
  END;
$$;

CREATE FUNCTION "run"() RETURNS TEXT LANGUAGE plpgsql AS $$
  DECLARE
"old_search_path" TEXT;
  BEGIN
"old_search_path" := current_setting('search_path');
SET LOCAL search_path TO "myschema";
-- At this point, search_path is always set to 'myschema'!
DECLARE
  "variable" "tbl"."col"%TYPE;
BEGIN
  "variable" := "foo"();
  RETURN "variable";
END;
PERFORM set_config('search_path', "old_search_path", TRUE);
  END;
$$;

COMMIT;

SELECT "myschema"."run"(); -- returns '5.4' (when run in the same session)

-- reconnect to database here:
\c

SELECT "myschema"."run"(); -- returns '5'
SET search_path TO 'myschema';
SELECT "myschema"."run"(); -- returns '5'

-- reconnect to database again:
\c

SET search_path TO 'myschema';
SELECT "myschema"."run"(); -- returns '5.4'
SET search_path TO 'public';
SELECT "myschema"."run"(); -- returns '5.4'



Even if

DECLARE "variable" "tbl"."col"%TYPE;

follows *after* the schema is set to "myschema" in the example above, I
still get differing results, depending on how the search_path was set
when the function was first called.

I think this has to do with the fact that the overall structure and
probably types(?) are parsed first?

As Tom Lane wrote on Fri, 27 Dec 2024 16:03:17 -0500, "the types of
plpgsql variables are only looked up on the first use (within a
session)."

Does this apply to *all* types (e.g. types used in type-casts in
statements after BEGIN)? Or does it only apply to types in the DECLARE
section?

Maybe my most recent example is somewhat "crafted", but it makes me
feel insecure about what I can rely on. Could someone explain to me
wh