Re: Long running query causing XID limit breach

2024-06-05 Thread Simon Elbaz
Hi,

I am following this very interesting thread.

From the documentation
https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-IDLE-IN-TRANSACTION-SESSION-TIMEOUT,
the 0 value will disable the timeout (not -1).



On Wed, Jun 5, 2024 at 8:25 AM sud  wrote:

> Hello Laurenz,
>
> Thank you so much. This information was really helpful for our
> understanding of how these parameters work.
>
> One follow-up question I have: as we are setting one of the
> standbys/replicas with idle_in_transaction_session_timeout = -1, the WALs
> can become heavily backlogged when a query runs for a very long time on
> that instance. In that case, is there a chance of an instance restart,
> and can that be avoided somehow?
>
> And the plan is to set these system parameters to different values on the
> writer and the read replica. In that case, if we run the ALTER SYSTEM
> command on the primary, won't the WAL apply those same commands on the
> reader instance, forcing its configuration to match the writer's (when we
> want the reader replica's configuration to differ from the writer's)?
>
> Appreciate your guidance.
>
> On Wed, May 29, 2024 at 1:38 PM Laurenz Albe 
> wrote:
>
>> On Wed, 2024-05-29 at 01:34 +0530, yudhi s wrote:
>> > > The only way you can have no delay in replication AND no canceled
>> > > queries is if you use two different standby servers with different
>> > > settings for "max_standby_streaming_delay".  One of the servers is
>> > > for HA, the other for your long-running queries.
>> >
>> > When you suggest having different "max_standby_streaming_delay" values
>> > for the first replica (say 10 sec, for high availability) and the second
>> > replica (say -1, for long-running queries), do you also suggest keeping
>> > "hot_standby_feedback" as "off" for all three instances, i.e. the master
>> > and both replicas?
>>
>> The parameter is ignored on the master.
>> It needs to be off on the standby that is running long queries.
>> For the other standby it probably doesn't matter if you are not running
>> any
>> queries on it.  I would leave "hot_standby_feedback = off" there as well.
>>
>> Actually, I would set "hot_standby = off" on the standby that is only used
>> for HA.
>>
>>
>> - I would leave "hot_standby_feedback" off everywhere.
>> - "max_standby_streaming_delay" should be -1 on the reporting standby and
>> very
>>   low or 0 on the HA standby. It doesn't matter on the primary.
>> - "statement_timeout" should be way lower on the first two nodes.
>> - "idle_in_transaction_session_timeout" is good.
>> - I would leave "autovacuum_freeze_max_age" at the default setting but
>> 100 million
>>   is ok too.
>>
>> Yours,
>> Laurenz Albe
>>
>
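For concreteness, the recommendations above might be applied per node roughly like this (an illustrative sketch only; note that ALTER SYSTEM writes the local postgresql.auto.conf, which is not WAL-logged, so such changes do not propagate from a primary to its standbys):

```sql
-- On the reporting standby (run locally on that instance):
ALTER SYSTEM SET max_standby_streaming_delay = -1;
ALTER SYSTEM SET hot_standby_feedback = off;
SELECT pg_reload_conf();

-- On the HA standby (run locally on that instance):
ALTER SYSTEM SET max_standby_streaming_delay = '10s';
ALTER SYSTEM SET hot_standby_feedback = off;
SELECT pg_reload_conf();
```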


Re: Long running query causing XID limit breach

2024-06-05 Thread sud
On Wed, 5 Jun, 2024, 12:39 pm Simon Elbaz,  wrote:

> Hi,
>
> I am following this very interesting thread.
>
> From the documentation
> https://www.postgresql.org/docs/current/runtime-config-client.html#GUC-IDLE-IN-TRANSACTION-SESSION-TIMEOUT,
> the 0 value will disable the timeout (not -1).
>
>
>
> On Wed, Jun 5, 2024 at 8:25 AM sud  wrote:
>
>> Hello Laurenz,
>>
>> Thank you so much. This information was really helpful for our
>> understanding of how these parameters work.
>>
>> One follow-up question I have: as we are setting one of the
>> standbys/replicas with idle_in_transaction_session_timeout = -1, the WALs
>> can become heavily backlogged when a query runs for a very long time on
>> that instance. In that case, is there a chance of an instance restart,
>> and can that be avoided somehow?
>>
>> The plan is to set these system parameters to different values on the
>> writer and the read replica. In that case, if we run the ALTER SYSTEM
>> command on the primary, won't the WAL apply those same commands on the
>> reader instance, forcing its configuration to match the writer's (when we
>> want the reader replica's configuration to differ from the writer's)?
>>
>> Appreciate your guidance.
>>
>>
>
My apologies, I meant to say setting "max_standby_streaming_delay"
to -1, which means unlimited lag.
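For reference, a quick way to confirm what each instance is actually using is to query pg_settings on each node (a sketch):

```sql
SELECT name, setting, source
FROM pg_settings
WHERE name IN ('max_standby_streaming_delay',
               'idle_in_transaction_session_timeout',
               'hot_standby_feedback');
```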

>


Re: Postgresql 16.3 Out Of Memory

2024-06-05 Thread Radu Radutiu
It looks like I did not copy the list.

I did run VACUUM ANALYZE after the upgrade. I've even run it now
> before getting a new explain plan that is very similar (just the costs
> differ):
>  Gather Merge  (cost=12336145.92..16111570.23 rows=31531596 width=66)
>    Workers Planned: 4
>    ->  Sort  (cost=12335145.86..12354853.11 rows=7882899 width=66)
>          Sort Key: t.msg_status DESC, t.input_sequence
>          ->  Parallel Hash Right Join  (cost=9376528.66..11242773.26 rows=7882899 width=66)
>                Hash Cond: (snd_tro.reply_input_sequence = t.input_sequence)
>                ->  Parallel Seq Scan on tbl_outputrequest snd_tro  (cost=0.00..1431919.45 rows=20057645 width=16)
>                ->  Parallel Hash  (cost=9254599.76..9254599.76 rows=5245992 width=58)
>                      ->  Parallel Hash Right Join  (cost=7458636.44..9254599.76 rows=5245992 width=58)
>                            Hash Cond: (rec_tro.input_sequence = t.input_sequence)
>                            ->  Parallel Seq Scan on tbl_outputrequest rec_tro  (cost=0.00..1431919.45 rows=20057645 width=16)
>                            ->  Parallel Hash  (cost=7380902.99..7380902.99 rows=3491156 width=50)
>                                  ->  Parallel Hash Right Join  (cost=5592677.17..7380902.99 rows=3491156 width=50)
>                                        Hash Cond: (rpl_snd_tro.reply_input_sequence = r.input_sequence)
>                                        ->  Parallel Seq Scan on tbl_outputrequest rpl_snd_tro  (cost=0.00..1431919.45 rows=20057645 width=16)
>                                        ->  Parallel Hash  (cost=5518353.72..5518353.72 rows=3491156 width=42)
>                                              ->  Parallel Hash Right Join  (cost=3729209.40..5518353.72 rows=3491156 width=42)
>                                                    Hash Cond: (rpl_rec_tro.input_sequence = r.input_sequence)
>                                                    ->  Parallel Seq Scan on tbl_outputrequest rpl_rec_tro  (cost=0.00..1431919.45 rows=20057645 width=16)
>                                                    ->  Parallel Hash  (cost=3658294.95..3658294.95 rows=3491156 width=34)
>                                                          ->  Parallel Hash Right Join  (cost=1883503.35..3658294.95 rows=3491156 width=34)
>                                                                Hash Cond: (r.originalrequest_id = t.input_sequence)
>                                                                ->  Parallel Seq Scan on tbl_inputrequest r  (cost=0.00..1739752.66 rows=13348166 width=16)
>                                                                ->  Parallel Hash  (cost=1839863.91..1839863.91 rows=3491156 width=26)
>                                                                      ->  Parallel Seq Scan on tbl_inputrequest t  (cost=0.00..1839863.91 rows=3491156 width=26)
>                                                                            Filter: ((receive_time < '2024-05-17 00:00:00'::timestamp without time zone) AND (input_sequence < '202406020168279904'::bigint) AND ((msg_status)::text = ANY ('{COMPLETED,REJECTED}'::text[])))
>
> The query cost is high and it returns a significant number of rows.
> However it should not consume 64+ GB RAM with the
> default enable_parallel_hash = 'on' when my shared_buffers is 8GB. The
> temporary fix I've implemented to get the system working is a change in the
> application configuration so that the timestamp filter selects fewer rows
> (3x less), together with setting enable_parallel_hash = 'off'. PostgreSQL
> service memory usage grows and stays over 20GB even with this setting. I'd
> like to find out exactly what causes the high memory usage as we have other
> projects using PostgreSQL and they are scheduled for upgrade from v12.
>
> My test.sql looks like this (the application uses prepared statements; the
> two SET operations revert, for the current session, the fixes already
> implemented):
>
> prepare my_query (timestamp,bigint) as SELECT  t.input_sequence,
> rec_tro.output_sequence, r.input_sequence, rpl_rec_tro.output_sequence,
> rpl_snd_tro.output_sequence, snd_tro.output_sequence, t.msg_type  FROM
> inputrequest t  LEFT JOIN outputrequest rec_tro ON rec_tro.input_sequence =
> t.input_sequence LEFT JOIN inputrequest r ON r.originalRequest_id =
> t.input_sequence   LEFT JOIN outputrequest rpl_rec_tro ON
> rpl_rec_tro.input_sequence = r.input_sequence  LEFT JOIN outputrequest
> rpl_snd_tro ON rpl_snd_tro.reply_input_sequence = r.input_sequence  LEFT
> JOIN outputrequest snd_tro ON snd_tro.reply_input_sequence =
> t.input_sequence  WHERE t.receive_time < $1 AND t.input_sequence < $2  AND
> t.msg_status IN ('COMPLETED', 'REJECTED')  ORDER BY t.msg_status DESC,
> t.input_sequence ;
>
> EXPLAIN  EXECUTE my_query('2024-05-17 00:00:00', 202406020168279904);
>
> I have an explanation for what I suspected was a memory leak. It seems
that systemd reports cached memory, the relevant par
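As a side note, the temporary workaround described earlier can be applied per session rather than globally. A sketch (assuming my_query was PREPAREd in the current session as in test.sql above):

```sql
BEGIN;
SET LOCAL enable_parallel_hash = off;          -- avoid the parallel hash memory blow-up
SET LOCAL max_parallel_workers_per_gather = 0; -- optionally disable parallelism entirely
EXECUTE my_query('2024-05-17 00:00:00', 202406020168279904);
COMMIT;
```

SET LOCAL limits the change to the enclosing transaction, so other sessions keep the server defaults.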

Poor performance after restoring database from snapshot on AWS RDS

2024-06-05 Thread Sam Kidman
We keep the staging environment of our application up to date with
respect to production data by creating a new RDS instance for the
staging environment and restoring the most recent production snapshot
into it.

We get very poor performance in the staging environment after this
restore takes place - after some usage it seems to get better perhaps
because of caching.

The staging RDS instance is a smaller size than production (it has
32GB ram and 8 vCPU vs production's 128GB ram and 32 vCPU) but the
performance seems to be much worse than this decrease in resources would
account for.

I have seen some advice that vacuum analyze should be run after the
snapshot restore but I thought this was supposed to happen
automatically. If we did run it manually how would that help?

Are there any other tools in postgres we can use to figure out why it
might be so much slower?

Best

--

Sam Kidman

Web Developer

Melbourne




Logical replication type- WAL recovery fails and changes the size of wal segment in archivedir

2024-06-05 Thread Meera Nair
Hi team,

With wal_level = 'logical', a backup was taken using the non-exclusive backup method.
We are following the restore and recovery procedure from the documentation:
"26.3. Continuous Archiving and Point-in-Time Recovery (PITR)".

While starting the PostgreSQL server, below issue is seen:

2024-06-05 11:41:32.369 IST [54369] LOG:  restored log file 
"00050001006A" from archive
2024-06-05 11:41:33.112 IST [54369] LOG:  restored log file 
"00050001006B" from archive
cp: cannot stat '/home/pgsql/wmaster/00050001006C': No such file or 
directory
2024-06-05 11:41:33.167 IST [54369] LOG:  redo done at 1/6B000100
2024-06-05 11:41:33.172 IST [54369] FATAL:  archive file 
"00050001006B" has wrong size: 0 instead of 16777216
2024-06-05 11:41:33.173 IST [54367] LOG:  startup process (PID 54369) exited 
with exit code 1
2024-06-05 11:41:33.173 IST [54367] LOG:  terminating any other active server 
processes
2024-06-05 11:41:33.174 IST [54375] FATAL:  archive command was terminated by 
signal 3: Quit
2024-06-05 11:41:33.174 IST [54375] DETAIL:  The failed archive command was: cp 
pg_wal/00050001006B /home/pgsql/wmaster/00050001006B
2024-06-05 11:41:33.175 IST [54367] LOG:  archiver process (PID 54375) exited 
with exit code 1
2024-06-05 11:41:33.177 IST [54367] LOG:  database system is shut down

Here '/home/pgsql/wmaster' is my archivedir (the folder where WAL segments are 
restored from)

Before attempting the start, the size of the 00050001006B file was 16 MB.
After failing to find 00050001006C, there is a FATAL error reporting
the wrong size for 00050001006B.
Now the size of 00050001006B is observed to be 2 MB. The size of all other
WAL segments remains 16 MB.

-rw--- 1 postgres postgres  2359296 Jun  5 11:34 00050001006B

Why is it changing the size of WAL segment in archive log directory?
All necessary WAL segments are present and 00050001006C was never 
archived.

bash-4.2$ cat /home/pgsql/dmaster/backup_label.old
START WAL LOCATION: 1/6928 (file 000500010069)
CHECKPOINT LOCATION: 1/6960
BACKUP METHOD: streamed
BACKUP FROM: master
START TIME: 2024-05-31 17:39:43 IST
LABEL: pgida_backup_4321_315606_1717157383
START TIMELINE: 5

bash-4.2$ cat /home/pgsql/wmaster/00050001006B.0028.backup
START WAL LOCATION: 1/6B28 (file 00050001006B)
STOP WAL LOCATION: 1/6B000100 (file 00050001006B)
CHECKPOINT LOCATION: 1/6B60
BACKUP METHOD: streamed
BACKUP FROM: master
START TIME: 2024-05-31 17:40:28 IST
LABEL: pgida_backup_4321_315606_1717157427
START TIMELINE: 5
STOP TIME: 2024-05-31 17:40:28 IST
STOP TIMELINE: 5

bash-4.2$ cat /home/pgsql/wmaster/0005.history
1   0/3E00  before 2000-01-01 05:30:00+05:30
2   0/6300  before 2000-01-01 05:30:00+05:30
3   0/E800  no recovery target specified
4   1/6800  before 2000-01-01 05:30:00+05:30


Despite our efforts to troubleshoot, the problem persists. Please help to find 
a solution.

Regards,
Meera



Re: Poor performance after restoring database from snapshot on AWS RDS

2024-06-05 Thread Shammat


Sam Kidman wrote on 03.06.2024 at 10:06:
> We get very poor performance in the staging environment after this
> restore takes place - after some usage it seems to get better perhaps
> because of caching.
>
> The staging RDS instance is a smaller size than production (it has
> 32GB ram and 8 vCPU vs production's 128GB ram and 32 vCPU) but the
> performance seems to be much worse than this decrease in resources would
> account for.
>
> I have seen some advice that vacuum analyze should be run after the
> snapshot restore but I thought this was supposed to happen
> automatically. If we did run it manually how would that help?

autovacuum will kick in eventually - but only after some time (which
is what you are seeing).

In general, after a bulk load (e.g. restoring a backup or importing
data in any other way), running VACUUM ANALYZE to update statistics is
highly recommended.
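A minimal sketch of that step (the table name is only an example):

```sql
-- Refresh planner statistics right after the restore instead of
-- waiting for autovacuum (ANALYZE is usually the important part):
VACUUM (ANALYZE, VERBOSE);

-- Or, for a single hot table:
ANALYZE my_schema.my_table;  -- hypothetical table name
```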




Variant (Untyped) parameter for function/procedure

2024-06-05 Thread Durumdara
Dear Members!

In my experience, functions/procedures depend strictly on their parameter
types (strongly typed).
So if I have to restructure the input parameters, I can't modify the
function without recreating its dependents too.
For example:
I have a type. If I pass this type to a function, I can't change the
structure of the type without dropping and recreating the function.

create type blahtype as (a int, b bool);
create function blahcheck (input blahtype) ...


If I have many dependent functions, this means I have to drop (and
recreate) everything just for an extra parameter. And sometimes this
extra parameter doesn't change 10 functions, only one. But I have to
recreate them all (without changing the body).

Is there any way to use a Variant parameter?
Like this:

create procedure test(IN Input Record, OUT Output Record)
...
Time = Input.Time::timestamp;
...

Output = SomeHowMakeItTyped;

Output.Result = 1;

...


Caller:

...
for r_in as select  id, name from blah into
...
 test(r_in, r_out);
 if r_out.result <> 0 then ...

Or:

create procedure test(IN Inputs Records, OUT Output Record)
...
for Input in Inputs:
Time = Input.Time::timestamp;
...
Output.Result = 1;
...




Or is it impossible because of PostgreSQL's strongly typed philosophy?

If it is not possible, then I can think of one way:
a JSON data type for inputs and outputs.

Is that right?

Thanks for any help or info!

Best regards
dd


Re: Unable to connect to any data source for foreign server

2024-06-05 Thread Russell Mercer

Hi,

Getting back to this after a bit.  I am still having no luck.  I checked
the settings for the postgres Windows user between the new and old
servers, and ensured they match.  The user has full control over the
directory containing the file I want to connect to.


Here are the results from running ogr_fdw_info:

CREATE SERVER myserver
  FOREIGN DATA WRAPPER ogr_fdw
  OPTIONS (
    datasource '\\coib-gis\City_GIS\Data\Assessor\Current\CityofIB.xlsx',
    format 'XLSX');

CREATE FOREIGN TABLE cityofib (
  fid bigint,
  pn bigint,
  ckdigit varchar,
  tra integer,
  contrctcd integer,
  frctint integer,
  marstat varchar,
  ownstat varchar,
  ownname varchar,
  mailaddr varchar,
  zip integer,
  mlchgdt integer,
  stsno varchar,
  stsstreet varchar,
  prpcd integer,
  prpdesc varchar,
  cvnet integer,
  cvland integer,
  imps integer,
  pprop integer,
  applyr varchar,
  xcd1 integer,
  xamt1 integer,
  xcd2 varchar,
  xamt2 integer,
  xcd3 varchar,
  xamt3 integer,
  doctype integer,
  docno integer,
  docdat integer,
  mapno varchar,
  origcutno integer,
  origcutdt integer,
  tracutno integer,
  tracutdt integer,
  redrftno integer,
  redrftdt integer,
  oldtra integer,
  oldpn bigint,
  acre double precision,
  units integer,
  zone integer,
  usecode integer,
  srcd varchar,
  sryr integer,
  taxstst varchar,
  assmtyr integer,
  transdt integer,
  yreffct integer,
  tlvgarea integer,
  bedrms integer,
  baths integer,
  gstalls integer,
  pool varchar,
  view varchar,
  usqft integer,
  asqft integer
) SERVER myserver
OPTIONS (layer 'CityofIB');

When I run the command given above to create the server, I receive the 
following error:


ERROR:  unable to connect to data source 
"\\coib-gis\City_GIS\Data\Assessor\Current\CityofIB.xlsx"


SQL state: HV00N

This same error occurs whether I try to connect to an XLSX or even just 
a CSV.  Any ideas would be greatly appreciated.


Thanks,

Russell

On 5/29/2024 3:15 AM, Muhammad Salahuddin Manzoor wrote:

Greetings,

While configuring OGR_FDW,
I hope you are setting the path with the server name and the absolute
path to the file on your network share, and that the network share has
appropriate permissions.


'\\remote_server\path\to\file.xlsx' on a network share

Create a user mapping if required.

Check your config with:
ogr_fdw_info -s "remote_server\\path\\to\\file.xlsx" -l

To diagnose the issue, check permissions, network accessibility, the
path format (it should include the server name and the absolute file
path), and any compatibility issues.


Regards,
Salahuddin.

On Wed, 29 May 2024, 14:42 Russell Mercer,  wrote:

Hi,

I'm running into a problem connecting to a foreign server with my
Postgres database.  I am using the OGR_FDW trying to connect to an
Excel
"XLSX".  I had this database on another server, and the foreign
server
was set up and working properly there.  I am trying to change the
path
to the file which it is referencing, and I am now receiving an error.
In addition, I tried to just create a new foreign server to that same
"XLSX" file and it also gave the error of "Failed to connect to data
source".

I used the OGR_FDW_INfo tool to make sure I had the foreign server
reference structured correctly, but it matches what I had before, and
didn't make a difference.

I'm a bit lost as to where to look for possible solutions. One idea I
have is that perhaps it is a permissions issue with accessing other
files on the server where Postgres is installed, with the
"postgres" user.

Some basic info:

Server:  Windows Server 2022
PostgreSQL 16.2 - Installed using Stack Builder

Any information or ideas as to a solution, would be very much
appreciated.

Thanks,
Russell




Re: Long running query causing XID limit breach

2024-06-05 Thread Laurenz Albe
On Wed, 2024-06-05 at 13:09 +0530, sud wrote:
> > > One follow-up question I have: as we are setting one of the
> > > standbys/replicas with idle_in_transaction_session_timeout = -1, the
> > > WALs can become heavily backlogged when a query runs for a very long
> > > time on that instance. In that case, is there a chance of an instance
> > > restart, and can that be avoided somehow?
> 
> My apologies, I meant to say setting "max_standby_streaming_delay" to -1,
> which means unlimited lag.

There should never be a restart unless you perform one or the standby crashes.
If you mean that you want to avoid a crash caused by a full disk on the standby,
the answer is probably "no".  Make sure that you have enough disk space and
use monitoring.
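One possible monitoring sketch, run on the primary, to watch how far a standby has fallen behind (uses only the standard pg_stat_replication view):

```sql
SELECT application_name,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(),
                                      replay_lsn)) AS replay_lag
FROM pg_stat_replication;
```

Alerting on this lag, together with free-disk-space monitoring, is the practical way to catch the problem before the disk fills.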

Yours,
Laurenz Albe




Re: Logical replication type- WAL recovery fails and changes the size of wal segment in archivedir

2024-06-05 Thread Laurenz Albe
On Wed, 2024-06-05 at 06:36 +, Meera Nair wrote:
> 2024-06-05 11:41:32.369 IST [54369] LOG:  restored log file 
> "00050001006A" from archive
> 2024-06-05 11:41:33.112 IST [54369] LOG:  restored log file 
> "00050001006B" from archive
> cp: cannot stat ‘/home/pgsql/wmaster/00050001006C’: No such file 
> or directory
> 2024-06-05 11:41:33.167 IST [54369] LOG:  redo done at 1/6B000100   
> 2024-06-05 11:41:33.172 IST [54369] FATAL:  archive file 
> "00050001006B" has wrong size: 0 instead of 16777216
> 2024-06-05 11:41:33.173 IST [54367] LOG:  startup process (PID 54369) exited 
> with exit code 1
> 2024-06-05 11:41:33.173 IST [54367] LOG:  terminating any other active server 
> processes
> 2024-06-05 11:41:33.174 IST [54375] FATAL:  archive command was terminated by 
> signal 3: Quit
> 2024-06-05 11:41:33.174 IST [54375] DETAIL:  The failed archive command was: 
> cp pg_wal/00050001006B 
> /home/pgsql/wmaster/00050001006B
> 2024-06-05 11:41:33.175 IST [54367] LOG:  archiver process (PID 54375) exited 
> with exit code 1
> 2024-06-05 11:41:33.177 IST [54367] LOG:  database system is shut down
>  
> Here ‘/home/pgsql/wmaster’ is my archivedir (the folder where WAL segments 
> are restored from)
>  
> Before attempting start, size of 
> 00050001006B file was 16 MB.
> After failing to detect 00050001006C, there is a FATAL error 
> saying wrong size for 00050001006B
> Now the size of 00050001006B is observed as 2 MB. Size of all 
> other WAL segments remain 16 MB.
>  
> -rw--- 1 postgres postgres  2359296 Jun  5 11:34 00050001006B

That looks like you have "archive_mode = always", and "archive_command" writes
back to the archive.  Don't do that.
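One way to make the archive command refuse to overwrite (and thereby truncate) an already-archived segment is the guard from the documentation's example; a sketch with the paths from this thread:

```sql
ALTER SYSTEM SET archive_command =
  'test ! -f /home/pgsql/wmaster/%f && cp %p /home/pgsql/wmaster/%f';
SELECT pg_reload_conf();
```

The `test ! -f` check makes the command fail instead of clobbering a file that already exists in the archive directory.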

Yours,
Laurenz Albe




Re: Poor performance after restoring database from snapshot on AWS RDS

2024-06-05 Thread Jeremy Smith
On Wed, Jun 5, 2024 at 4:23 AM Sam Kidman  wrote:

> We get very poor performance in the staging environment after this
> restore takes place - after some usage it seems to get better perhaps
> because of caching.
>

This is due to the way that RDS restores snapshots.

From the docs
(https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_RestoreFromSnapshot.html):

You can use the restored DB instance as soon as its status is
available. The DB instance continues to load data in the background.
This is known as lazy loading.

If you access data that hasn't been loaded yet, the DB instance
immediately downloads the requested data from Amazon S3, and then
continues loading the rest of the data in the background.
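If the slowness is mostly this lazy loading, one common mitigation (assuming the pg_prewarm extension is available on the RDS instance, and with purely hypothetical relation names) is to touch the hot relations up front so their blocks are pulled from S3 immediately:

```sql
CREATE EXTENSION IF NOT EXISTS pg_prewarm;

-- Read the whole relation into cache, forcing the blocks to be fetched:
SELECT pg_prewarm('my_big_table');       -- hypothetical table name
SELECT pg_prewarm('my_big_table_pkey');  -- indexes can be prewarmed too
```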



  -Jeremy




Re: Purpose of pg_dump tar archive format?

2024-06-05 Thread Gavin Roy
On Tue, Jun 4, 2024 at 7:36 PM Ron Johnson  wrote:

> On Tue, Jun 4, 2024 at 3:47 PM Gavin Roy  wrote:
>
>>
>> On Tue, Jun 4, 2024 at 3:15 PM Ron Johnson 
>> wrote:
>>
>>>
>>> But why tar instead of custom? That was part of my original question.
>>>
>>
>> I've found it pretty useful for programmatically accessing data in a dump
>> for large databases outside of the normal pg_dump/pg_restore workflow. You
>> don't have to seek through one large binary file to get to the data section
>> to get at the data.
>>
>
> Interesting.  Please explain, though, since a big tarball _is_ "one large
> binary file" that you have to sequentially scan.  (I don't know the
> internal structure of custom format files, and whether they have file
> pointers to each table.)
>

Not if you untar it first.


> Is it because you need individual .dat "COPY" files for something other
> than loading into PG tables (since pg_restore --table= does that, too),
> and directory format archives can be inconvenient?
>

In the past I've used it for data analysis outside of Postgres.
-- 
*Gavin M. Roy*
CTO
AWeber


Re: Variant (Untyped) parameter for function/procedure

2024-06-05 Thread Adrian Klaver

On 6/5/24 01:34, Durumdara wrote:

Dear Members!

As I experienced, the functions/procedures extremely depend on 
parameters (very typed).
So if I have to restructure the input parameters, I can't modify the 
function, because I have to recreate the dependents too.

For example:
I have a type. If I pass this type to a function, I can't change the 
structure of the type without dropping and recreating the function.


create type blahtype as (a int, b bool);
create function blahcheck (input blahtype) ...


If I have many dependent functions this causes that I have to drop 
(recreate) everything - just for an extra parameter. And sometimes this 
extra parameter doesn't change 10 functions, only one. But I have to 
recreate them all (without changing the body).


Is there any way to use a Variant parameter?


You mean like:

https://www.postgresql.org/docs/current/extend-type-system.html#EXTEND-TYPES-POLYMORPHIC

Examples here:

https://www.postgresql.org/docs/current/plpgsql-declarations.html#PLPGSQL-DECLARATION-PARAMETERS

Starting at:

"... When the return type of a PL/pgSQL function is declared as a 
polymorphic type ..."


Or there is VARIADIC:

https://www.postgresql.org/docs/current/xfunc-sql.html#XFUNC-SQL-VARIADIC-FUNCTIONS
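As a sketch of the jsonb route the original poster mentioned (all names here are made up), a procedure can accept loosely structured input and cast fields only where needed:

```sql
CREATE PROCEDURE test(IN input jsonb, INOUT output jsonb)
LANGUAGE plpgsql AS $$
DECLARE
    t timestamp := (input->>'time')::timestamp;  -- cast on access
BEGIN
    output := jsonb_build_object('result', 1, 'time', t);
END;
$$;

CALL test('{"time": "2024-06-05 10:00:00"}'::jsonb, NULL);
```

The trade-off is that the structure is no longer checked by the type system: a missing or malformed "time" key only fails at runtime.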




Like this:

create procedure test(IN Input Record, OUT Output Record)
...
     Time = Input.Time::timestamp;
...

     Output = SomeHowMakeItTyped;

     Output.Result = 1;

...


Caller:

...
for r_in as select  id, name from blah into
...
      test(r_in, r_out);
      if r_out.result <> 0 then ...

Or:

create procedure test(IN Inputs Records, OUT Output Record)
...
         for Input in Inputs:
         Time = Input.Time::timestamp;
...
     Output.Result = 1;
...




Or is it impossible, because of the PGSQL's philosophy (very typed)?

If it is not possible then I have one way I think.
It is a JSON data type for inputs and outputs.

Is that right?

Thanks for your any help, info!

Best regards
dd







--
Adrian Klaver
adrian.kla...@aklaver.com





Length returns NULL ?

2024-06-05 Thread Marcos Pegoraro
There are some functions called ...length, but only array_length returns
NULL on an empty array. Why?

select array_length('{}'::text[],1), -->NULL
   jsonb_array_length('[]'), -->0
   bit_length(''), -->0
   octet_length(''), -->0
   length(''), -->0
   char_length(''), -->0
   length(B''); -->0

I know it is documented, but the question is: why does it work
differently?

array_length ( anyarray, integer ) → integer
Returns the length of the requested array dimension. (Produces NULL instead
of 0 for empty or missing array dimensions.)

array_length(array[1,2,3], 1) → 3
array_length(array[]::int[], 1) → NULL
array_length(array['text'], 2) → NULL
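If the goal is simply a zero for empty arrays, two common workarounds are cardinality (total element count across all dimensions) or wrapping array_length in coalesce:

```sql
SELECT cardinality('{}'::text[]);                  -- 0
SELECT coalesce(array_length('{}'::text[], 1), 0); -- 0
```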


Re: Length returns NULL ?

2024-06-05 Thread Erik Wienhold
On 2024-06-05 19:50 +0200, Marcos Pegoraro wrote:
> There are some functions called ...length, but only array_length returns
> NULL on empty array, why ?

Because the empty array has zero dimensions[1].  But now I'm wondering
why array_ndims returns NULL instead of zero for empty arrays.

[1] 
https://www.postgresql.org/message-id/603c8f070811061951u16034c3fk5dfaa493a6739a24%40mail.gmail.com

-- 
Erik




Re: Long running query causing XID limit breach

2024-06-05 Thread yudhi s
On Wed, Jun 5, 2024 at 3:52 PM Laurenz Albe 
wrote:

>
> There should never be a restart unless you perform one or the standby
> crashes.
> If you mean that you want to avoid a crash caused by a full disk on the
> standby,
> the answer is probably "no".  Make sure that you have enough disk space and
> use monitoring.
>
> Yours,
> Laurenz Albe
>

Is this because the OP initially mentioned it is RDS Postgres? In that case
there is a storage space restriction of 64 TB (128 TB in the case of Aurora
Postgres). I believe this storage combines data + WAL, so as you mentioned,
appropriate monitoring needs to be put in place.
Otherwise, in the worst-case scenario, if the storage consumption hits that
hard limit, will there be an instance restart or crash?


how to tell if a pg version supports a linux distribution

2024-06-05 Thread bruno vieira da silva
Hello, if a pg version has been tested on the buildfarm but the pg yum
repository doesn't have packages for a Linux distribution, does that mean
the distribution isn't supported by pg? How can I find out which Linux
distributions a pg version has regression tests executed against?

e.g. : postgresql 16 doesn't have packages on yum for centos 7 but I can
find tests on the buildfarm for it.

https://buildfarm.postgresql.org/index.html

Thanks

-- 
Bruno Vieira da Silva


Re: how to tell if a pg version supports a linux distribution

2024-06-05 Thread Adrian Klaver




On 6/5/24 1:24 PM, bruno vieira da silva wrote:
Hello, if a pg version has been tested on the buildfarm but the pg yum 
repository doesn't have packages for a linux distribution that means 
that distribution isn't supported by pg? how can I find if linux 
distributions for a pg version have regression tests executed against.


e.g. : postgresql 16 doesn't have packages on yum for centos 7 but I can 
find tests on the buildfarm for it.


https://buildfarm.postgresql.org/index.html 




https://yum.postgresql.org/news/rhel7-postgresql-rpms-end-of-life/

" PostgreSQL RPM repo stopped adding new packages to the RHEL 7 repo as 
of Aug 2023, including PostgreSQL 16.


We will maintain older major releases until each major release is EOLed 
by PostgreSQL project. Please visit here for latest release dates for 
each major release.


If you have any questions, please either email to 
pgsql-pkg-...@lists.postgresql.org, or create a ticket at our redmine. "




Thanks

--
Bruno Vieira da Silva


--
Adrian Klaver
adrian.kla...@aklaver.com




Re: how to tell if a pg version supports a linux distribution

2024-06-05 Thread Adrian Klaver




On 6/5/24 1:24 PM, bruno vieira da silva wrote:
Hello, if a pg version has been tested on the buildfarm but the pg yum 
repository doesn't have packages for a linux distribution that means 
that distribution isn't supported by pg? how can I find if linux 
distributions for a pg version have regression tests executed against.


e.g. : postgresql 16 doesn't have packages on yum for centos 7 but I can 
find tests on the buildfarm for it.


https://buildfarm.postgresql.org/index.html 



Should have added to previous post:

https://yum.postgresql.org/

Available PostgreSQL Releases

Click on Postgres version number to see what distro versions are supported.



Thanks

--
Bruno Vieira da Silva


--
Adrian Klaver
adrian.kla...@aklaver.com




Re: Questions on logical replication

2024-06-05 Thread Koen De Groote
>
> https://www.postgresql.org/docs/current/wal-configuration.html
>
> "Checkpoints are points in the sequence of transactions at which it is
> guaranteed that the heap and index data files have been updated with all
> information written before that checkpoint. At checkpoint time, all
> dirty data pages are flushed to disk and a special checkpoint record is
> written to the WAL file. (The change records were previously flushed to
> the WAL files.) In the event of a crash, the crash recovery procedure
> looks at the latest checkpoint record to determine the point in the WAL
> (known as the redo record) from which it should start the REDO
> operation. Any changes made to data files before that point are
> guaranteed to be already on disk. Hence, after a checkpoint, WAL
> segments preceding the one containing the redo record are no longer
> needed and can be recycled or removed. (When WAL archiving is being
> done, the WAL segments must be archived before being recycled or removed.)"
>

And this is the same for logical replication and physical replication, I
take it.

Thus, if a leader has a standby of the same version, and meanwhile logical
replication is being done to a newer version, both those replications are
taken into account, is that correct?


> When you set up logical replication you are 'asking' via the replication
> slot that WAL records be kept on the publisher until the subscriber
> retrieves them.

And if it cannot sync them, due to connectivity loss for instance, the WAL
records will not be removed, then?
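(For reference, a sketch for checking on the publisher whether a disconnected subscriber's slot is pinning WAL; the wal_status and safe_wal_size columns exist in PostgreSQL 13 and later:)

```sql
SELECT slot_name, active, wal_status, safe_wal_size,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(),
                                      restart_lsn)) AS retained_wal
FROM pg_replication_slots;
```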

Regards,
Koen De Groote


On Wed, Jun 5, 2024 at 1:05 AM Adrian Klaver 
wrote:

> On 6/4/24 15:55, Koen De Groote wrote:
> > I recently read the entire documentation on logical replication, but am
> > left with a question on the buildup of WAL
> >
> > On this page:
> >
> https://www.postgresql.org/docs/current/logical-replication-subscription.html#LOGICAL-REPLICATION-SUBSCRIPTION-SLOT
> <
> https://www.postgresql.org/docs/current/logical-replication-subscription.html#LOGICAL-REPLICATION-SUBSCRIPTION-SLOT
> >
> >
> > It is written: " When dropping a subscription, the remote host is not
> > reachable. In that case, disassociate the slot from the subscription
> > using |ALTER SUBSCRIPTION| before attempting to drop the subscription.
> > If the remote database instance no longer exists, no further action is
> > then necessary. If, however, the remote database instance is just
> > unreachable, the replication slot (and any still remaining table
> > synchronization slots) should then be dropped manually; otherwise
> > it/they would continue to reserve WAL and might eventually cause the
> > disk to fill up. Such cases should be carefully investigated."
> >
> >
> > Assuming a situation where I add tables 1 at a time to the publisher,
> > and refresh the subscription every time.
> >
> > What happens if I shut down the subscriber database for a while? The
> > subscription isn't dropped, so am I reading it right that the disk on
> > the publisher will slowly be filling up with WAL? Isn't that always the
> > case if WAL is enabled?
>
> https://www.postgresql.org/docs/current/wal-configuration.html
>
> "Checkpoints are points in the sequence of transactions at which it is
> guaranteed that the heap and index data files have been updated with all
> information written before that checkpoint. At checkpoint time, all
> dirty data pages are flushed to disk and a special checkpoint record is
> written to the WAL file. (The change records were previously flushed to
> the WAL files.) In the event of a crash, the crash recovery procedure
> looks at the latest checkpoint record to determine the point in the WAL
> (known as the redo record) from which it should start the REDO
> operation. Any changes made to data files before that point are
> guaranteed to be already on disk. Hence, after a checkpoint, WAL
> segments preceding the one containing the redo record are no longer
> needed and can be recycled or removed. (When WAL archiving is being
> done, the WAL segments must be archived before being recycled or removed.)"
>
> >
> > This "cause disk to fill up" warning is quite concerning, and I'd like
> > to understand what could cause it and how likely it is? I thought
> > logical replication uses WAL by default, so doesn't that mean there has
> > to be a log of changes kept anyhow? Even if the WAL isn't written to
> > disk by an "archive_command"?
>
>
> https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION-SLOTS
>
> "Replication slots provide an automated way to ensure that the primary
> does not remove WAL segments until they have been received by all
> standbys, and that the primary does not remove rows which could cause a
> recovery conflict even when the standby is disconnected."
>
> When you set up logical replication you are 'asking' via the replication
> slot that WAL records be kept on the publisher until the subscriber
> retrieves them.
>
> >
> > Regards,
> > Koen De Groote
>
>

Re: Questions on logical replication

2024-06-05 Thread Adrian Klaver

On 6/5/24 14:54, Koen De Groote wrote:

https://www.postgresql.org/docs/current/wal-configuration.html


"Checkpoints are points in the sequence of transactions at which it is
guaranteed that the heap and index data files have been updated with
all
information written before that checkpoint. At checkpoint time, all
dirty data pages are flushed to disk and a special checkpoint record is
written to the WAL file. (The change records were previously flushed to
the WAL files.) In the event of a crash, the crash recovery procedure
looks at the latest checkpoint record to determine the point in the WAL
(known as the redo record) from which it should start the REDO
operation. Any changes made to data files before that point are
guaranteed to be already on disk. Hence, after a checkpoint, WAL
segments preceding the one containing the redo record are no longer
needed and can be recycled or removed. (When WAL archiving is being
done, the WAL segments must be archived before being recycled or
removed.)"


And this is the same for logical replication and physical replication, I 
take it.


High level explanation, both physical and logical replication use the 
WAL files as the starting point. When the recycling is done is dependent 
on various factors. My suggestion would be to read through the below to 
get a better idea of what is going on. There is a lot to cover, but if you 
really want to understand it you will need to go through it.


Physical replication

https://www.postgresql.org/docs/current/high-availability.html

27.2.5. Streaming Replication
27.2.6. Replication Slots

Logical replication

https://www.postgresql.org/docs/current/logical-replication.html

WAL

https://www.postgresql.org/docs/current/wal.html





Thus, if a leader has a standby of the same version, and meanwhile 
logical replication is being done to a newer version, both those 
replications are taken into account, is that correct?


Yes, see links above.
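(As a concrete illustration, not from the thread: both kinds of slots appear in the `pg_replication_slots` catalog view, so a single query on the leader shows what each consumer, physical standby or logical subscriber, is holding back. A minimal sketch:)

```sql
-- Sketch: inspect which replication slots are retaining WAL on the
-- primary/publisher. Physical and logical slots both appear here.
SELECT slot_name,
       slot_type,          -- 'physical' (standby) or 'logical' (subscriber)
       active,             -- false if the consumer is currently disconnected
       pg_size_pretty(
           pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
       ) AS retained_wal   -- roughly how much WAL the slot is holding back
FROM pg_replication_slots;
```

An inactive slot with steadily growing `retained_wal` is the "disk filling up" scenario the documentation warns about.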


And if it cannot sync them, due to connectivity loss for instance, the 
WAL records will not be removed, then?


Depends on the type of replication being done. It is possible for 
physical replication to have WAL records removed that are still needed 
downstream.


From

https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION

"If you use streaming replication without file-based continuous 
archiving, the server might recycle old WAL segments before the standby 
has received them. If this occurs, the standby will need to be 
reinitialized from a new base backup. You can avoid this by setting 
wal_keep_size to a value large enough to ensure that WAL segments are 
not recycled too early, or by configuring a replication slot for the 
standby. If you set up a WAL archive that's accessible from the standby, 
these solutions are not required, since the standby can always use the 
archive to catch up provided it retains enough segments."


This is why it is good idea to go through the links I posted above.
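(A hedged sketch of the safeguards the quoted documentation mentions, assuming PostgreSQL 13 or later, where `max_slot_wal_keep_size` and the `wal_status`/`safe_wal_size` columns exist:)

```sql
-- Keep a floor of WAL around even when no slot requires it:
ALTER SYSTEM SET wal_keep_size = '1GB';

-- Cap how much WAL any one slot may retain; a slot that exceeds the cap
-- becomes 'lost' and its consumer must be re-initialized from a new
-- base backup (physical) or re-synced (logical):
ALTER SYSTEM SET max_slot_wal_keep_size = '10GB';
SELECT pg_reload_conf();

-- Check slot health: 'lost' means the slot can no longer serve its consumer.
SELECT slot_name, wal_status, safe_wal_size
FROM pg_replication_slots;
```

The values above are illustrative only; the right limits depend on disk size and how long consumers may stay disconnected.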



Regards,
Koen De Groote




--
Adrian Klaver
adrian.kla...@aklaver.com