Re: Logical replication fails when adding multiple replicas

2023-03-23 Thread Kyotaro Horiguchi
At Wed, 22 Mar 2023 09:25:37 +, Will Roper wrote in 
> Thanks for the response Hou,
> 
> I've had a look and when the tablesync workers are spinning up there are
> some errors of the form:
> 
> "2023-03-17 18:37:06.900 UTC [4071] LOG:  logical replication table
> synchronization worker for subscription
> ""polling_stations_0561a02f66363d911"", table ""uk_geo_utils_onspd"" has
> started"
> "2023-03-17 18:37:06.976 UTC [4071] ERROR:  could not create replication
> slot ""pg_37986_sync_37922_7210774007126708177"": ERROR:  replication slot
> ""pg_37986_sync_37922_7210774007126708177"" already exists"

The slot name format is "pg_<suboid>_sync_<relid>_<sysid>". It's no
surprise this happens if the subscribers come from the same
backup.
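
A quick way to check that (just a sketch, not part of the report):
pg_control_system() exposes the value GetSystemIdentifier() returns, so
running this on each subscriber and comparing should show whether they
share an identifier:

SELECT system_identifier FROM pg_control_system();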

If that's true, the simplest workaround would be to recreate the
subscription multiple times, using a different number of repetitions
for each subscriber so that the subscribers have subscriptions with
different OIDs.
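
For instance (a rough sketch; the subscription names and options here are
only illustrative), one could burn an extra OID on the second subscriber
before creating the real subscription:

CREATE SUBSCRIPTION sub_tmp CONNECTION '' PUBLICATION pub
    WITH (connect = false);
ALTER SUBSCRIPTION sub_tmp SET (slot_name = NONE);
DROP SUBSCRIPTION sub_tmp;
-- the real CREATE SUBSCRIPTION that follows then gets a different OID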



I believe it's not prohibited for subscribers to have the same system
identifier, but the slot name generation logic for tablesync doesn't
account for cases like this.  We might need some server-wide value
that's unique among subscribers and stable while table sync is
running.  I can't think of a better place than pg_subscription, but I
don't like that because it's not really needed for most of the
subscription's life.

Do you think using the postmaster's startup time would work for this
purpose?  I'm assuming that the slot name doesn't need to persist
across server restarts, but I'm not sure that's really true.


diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 07eea504ba..a5b4f7cf7c 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -1214,7 +1214,7 @@ ReplicationSlotNameForTablesync(Oid suboid, Oid relid,
                                 char *syncslotname, Size szslot)
 {
     snprintf(syncslotname, szslot, "pg_%u_sync_%u_" UINT64_FORMAT, suboid,
-             relid, GetSystemIdentifier());
+             relid, PgStartTime);
 }
 
 /*


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Convert pg_constraint.conkey array to same-order array of column names

2023-03-23 Thread Dominique Devienne
We have a query that returns 1 row per constraint column; until recently
we didn't realize it wasn't preserving the order of the columns.

A colleague fixed that, with something like below:

SELECT ...
FROM pg_catalog.pg_constraint cnstr
...
CROSS JOIN LATERAL UNNEST(cnstr.conkey) WITH ORDINALITY AS cols(value, rank)
 ORDER BY cols.rank

But I'm wondering about getting 1 row per constraint instead,
and fetching an array of column names.

So is there a way to "convert" int2[] conkey array into a text[] of those
column names?


Postgresql professional certification

2023-03-23 Thread MS A
Hi Team,

My name is Sabjar Ali and I work for one of the MNC companies. I am planning
to do a PostgreSQL professional certification, but I can't find any link
for one on the postgresql.org site.

Please help me with how I can approach getting my PostgreSQL professional
certification.

Thanks
Sabjar Ali


Re: Postgresql professional certification

2023-03-23 Thread Magnus Hagander
On Thu, Mar 23, 2023 at 1:46 PM MS A  wrote:

> Hi Team,
>
> My name is Sabjar Ali working for one of the MNC companies, I am planning
> to do Postgresql professional certification. Regarding the same i don't
> find any link in postgresql.org site.
>
> Please help me how i can approach to do my Postgresql professional
> certification.
>
>
Hello!

There is no official PostgreSQL certification.

Some PostgreSQL related companies may have their own, but there is no
standard one.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: Convert pg_constraint.conkey array to same-order array of column names

2023-03-23 Thread Adrian Klaver

On 3/23/23 04:12, Dominique Devienne wrote:

We have a query that returns 1 row per constraint column; until recently
we didn't realize it wasn't preserving the order of the columns.


A colleague fixed that, with something like below:

SELECT ...
FROM pg_catalog.pg_constraint cnstr
...
CROSS JOIN LATERAL UNNEST(cnstr.conkey) WITH ORDINALITY AS cols(value, rank)
  ORDER BY cols.rank

But I'm wondering about getting 1 row per constraint instead,
and fetching an array of column names.

So is there a way to "convert" int2[] conkey array into a text[] of 
those column names?


A before coffee solution:

WITH ck AS (
    SELECT
        conrelid,
        unnest(conkey) AS ky
    FROM
        pg_constraint
    WHERE
        conrelid = 'cell_per'::regclass
)
SELECT
    array_agg(attname)
FROM
    ck
    JOIN pg_attribute
        ON attnum = ck.ky AND ck.conrelid = pg_attribute.attrelid;

          array_agg
-----------------------------
 {category,line_id,category}
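
To also preserve the conkey order the original question asked about, a
sketch that folds the WITH ORDINALITY trick into an ordered array_agg,
giving one row per constraint ('cell_per' as above):

SELECT
    cnstr.conname,
    array_agg(att.attname ORDER BY cols.rank) AS colnames
FROM
    pg_catalog.pg_constraint cnstr
    CROSS JOIN LATERAL unnest(cnstr.conkey) WITH ORDINALITY AS cols(attnum, rank)
    JOIN pg_catalog.pg_attribute att
        ON att.attrelid = cnstr.conrelid AND att.attnum = cols.attnum
WHERE
    cnstr.conrelid = 'cell_per'::regclass
GROUP BY
    cnstr.oid, cnstr.conname;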


--
Adrian Klaver
adrian.kla...@aklaver.com





How to install vacuumlo on a client?

2023-03-23 Thread Zwettler Markus (OIZ)
I want to install vacuumlo on a client.

I would install the client + contrib + libs packages:

yum install postgresql15 postgresql15-contrib postgresql15-libs

Is this enough or are there some other rpm / libs needed?



Re: How to install vacuumlo on a client?

2023-03-23 Thread Adrian Klaver

On 3/23/23 09:54, Zwettler Markus (OIZ) wrote:

I want to install vacuumlo on a client.

I would install client + contrib + libs package:

yum install postgresql15 postgresql15-contrib postgresql15-libs

Is this enough or are there some other rpm / libs needed?



That should cover it.

To test at OS command line run:

vacuumlo -V
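
And a hypothetical first real run (host, user, and database names made up
here); with -n it only reports what it would remove:

vacuumlo -n -v -h dbhost -U dbuser mydb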

--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Logical replication fails when adding multiple replicas

2023-03-23 Thread Will Roper
OK, that makes sense. I think something that is unique to subscribers is
sensible; the postmaster startup time sounds reasonable!
Thanks for looking at it.

On Thu, Mar 23, 2023 at 8:17 AM Kyotaro Horiguchi wrote:

> At Wed, 22 Mar 2023 09:25:37 +, Will Roper <
> will.ro...@democracyclub.org.uk> wrote in
> > Thanks for the response Hou,
> >
> > I've had a look and when the tablesync workers are spinning up there are
> > some errors of the form:
> >
> > "2023-03-17 18:37:06.900 UTC [4071] LOG:  logical replication table
> > synchronization worker for subscription
> > ""polling_stations_0561a02f66363d911"", table ""uk_geo_utils_onspd"" has
> > started"
> > "2023-03-17 18:37:06.976 UTC [4071] ERROR:  could not create replication
> > slot ""pg_37986_sync_37922_7210774007126708177"": ERROR:  replication
> slot
> > ""pg_37986_sync_37922_7210774007126708177"" already exists"
>
> The slot name format is "pg_<suboid>_sync_<relid>_<sysid>". It's no
> surprise this happens if the subscribers come from the same
> backup.
>
> If that's true, the simplest workaround would be to recreate the
> subscription multiple times, using a different number of repetitions
> for each subscriber so that the subscribers have subscriptions with
> different OIDs.
>
>
>
> I believe it's not prohibited for subscribers to have the same system
> identifier, but the slot name generation logic for tablesync doesn't
> account for cases like this.  We might need some server-wide value
> that's unique among subscribers and stable while table sync is
> running.  I can't think of a better place than pg_subscription, but I
> don't like that because it's not really needed for most of the
> subscription's life.
>
> Do you think using the postmaster's startup time would work for this
> purpose?  I'm assuming that the slot name doesn't need to persist
> across server restarts, but I'm not sure that's really true.
>
>
> diff --git a/src/backend/replication/logical/tablesync.c
> b/src/backend/replication/logical/tablesync.c
> index 07eea504ba..a5b4f7cf7c 100644
> --- a/src/backend/replication/logical/tablesync.c
> +++ b/src/backend/replication/logical/tablesync.c
> @@ -1214,7 +1214,7 @@ ReplicationSlotNameForTablesync(Oid suboid, Oid
> relid,
> char
> *syncslotname, Size szslot)
>  {
> snprintf(syncslotname, szslot, "pg_%u_sync_%u_" UINT64_FORMAT,
> suboid,
> -relid, GetSystemIdentifier());
> +relid, PgStartTime);
>  }
>
>  /*
>
>
> regards.
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
>


Re: Is the PL/pgSQL refcursor useful in a modern three-tier app?

2023-03-23 Thread Bryn Llewellyn
> laurenz.a...@cybertec.at wrote:
> 
>> b...@yugabyte.com wrote:
>> 
>>> laurenz.a...@cybertec.at wrote:
>>> 
>>> I recently used cursor scrollability, so I can show you a use case:
>>> 
>>> github.com/cybertec-postgresql/db_migrator/blob/master/db_migrator--1.0.0.sql#L49
>> 
>> However, source code famously reveals only what it does, not what the
>> author's intention and overall design philosophy are. I looked at the README
>> accounts for these two functions here:
>> 
>> github.com/cybertec-postgresql/db_migrator#materialize_foreign_table
>> github.com/cybertec-postgresql/db_migrator#db_migrate_refresh
>> 
>> But these accounts say nothing about the design of their implementations.
> 
> The documentation describes the purpose of the function and how to use it, 
> not how I went about implementing it. You are interpreting too much. I just 
> wanted to show you a case where scrollable cursors can be useful.
> 

>> The upshot, therefore, is that I'm afraid that I can only guess at why you 
>> use "open, fetch, close" on a refcursor rather than an ordinary cursor for 
>> loop. After all, you use the technique only to traverse metadata tables 
>> about partitions, subpartitions, and columns. I'd be astonished if such 
>> tables have humongous numbers of rows (like hundreds of millions).
> 
> No, I don't expect large result sets, and concurrency won't be a problem. I 
> explained why I used scrollable cursors: to save myself an extra query for 
> the total result set count.

It's only natural to want to understand the design philosophy and things like 
data volumes and possible concurrency concerns when it's recommended that I 
look at the 200 line source code for a function. I wanted only to assure you 
that I'd done my due diligence to get some answers. Thanks for confirming my 
guess about the data volumes and concurrency concerns. Right, I do see that the 
approach that you used very naturally expresses what needs to be done.

— Create a txn-duration "pg_cursors" row for the outer "select" of interest.
— Traverse it to get the row count, using "move" so's not to see the rows.
— Fetch the first row and do something special that needs the row count as well 
as the row.
— Loop over the remaining rows and do more per-row actions.
— Similar logic in an inner loop for a different "select".
— Close the cursors explicitly.
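
In plain SQL, the shape of that pattern (only a sketch; the cursor and table
names are mine) would be roughly:

begin;
declare cur scroll cursor for select k, v from t order by k;
move forward all in cur;  -- the command tag ("MOVE n") yields the row count
move absolute 0 in cur;   -- rewind to before the first row
fetch next from cur;      -- first row, now that the count is known
-- ...fetch the remaining rows in a loop...
close cur;
commit;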

It seems to me (given what you promised me about data volumes and concurrency) 
that using explicit "select count(*)" with ordinary loops that have the 
"select" statements in the "for" header (and that would use the cursor 
mechanisms under the hood) would work correctly and without harming 
performance. But I'll agree that such code might seem to be less expressive of 
its purpose than your approach.

> Here is another account of how cursors can be useful:
> www.cybertec-postgresql.com/en/pagination-problem-total-result-count/

Thanks! It's a very interesting and helpful read.

You wrote « a WITH HOLD cursor... fetches the complete result set at the end of 
the transaction and materializes it on the server. »

I checked the current doc for the 'declare' statement. It says:

«
In the current implementation, the rows represented by a held cursor are copied 
into a temporary file or memory area so that they remain available for 
subsequent transactions.
»

I tested this with "table t(k int primary key, v int not null)" with 100 
million rows. (I'm using psql's default mode for "AUTOCOMMIT" to ask for the 
server's behavior "as is", so no explicit "start transaction" unless I need 
it. And I'm using a PostgreSQL server in an Ubuntu VM on my MacBook with a 
psql client in the native macOS.) First I timed a "with hold" cursor creation.

-- Took ~30 sec.
declare "With Hold" scroll cursor with hold for select k, v from t order by k;

Then I timed a "without hold" cursor creation for the same "select":

start transaction;
  -- Took less than ~20 ms
  declare "Without Hold" scroll cursor without hold for select k, v from t;

This certainly bears out what you and the doc said about "with hold".
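
As a sanity check (a sketch), still in autocommit mode the held cursor
remains usable after the declaring statement's own transaction has ended,
and "pg_cursors" reports it as holdable:

fetch 10 from "With Hold";
select name, is_holdable from pg_cursors;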

W.r.t. this thread's "Subject", how do you fetch successive batches from a 
session duration cursor in successive requests from a browser when these go 
through a connection pool and the general philosophy is "check out connection, 
use it, release connection" so that successive browser requests will each use a 
different session? You refer to this with your « transaction level connection 
pooling won’t work with WITH HOLD cursors, since they are bound to the database 
connection ». (In this context, your point about showing an approximate result 
is well-taken.) Does anybody use interactive client-server apps these days? Or 
are you thinking of non-interactive report generation running on a client 
machine dedicated to that purpose?

W.r.t. your « With WITH HOLD cursor pagination, the whole result set is 
calculated in the beginning, so the result set count comes for free. », I timed 
this:

-- Took ~