RE: Table : Bloat grow high

2022-11-13 Thread Alexis Zapata
Hi Laurenz ,
I found that xmin does not change when running the vacuum.




De: Laurenz Albe 
Enviado: sábado, 12 de noviembre de 2022 9:05 a. m.
Para: Alexis Zapata ; pgsql-general@lists.postgresql.org 

Asunto: Re: Table : Bloat grow high

On Fri, 2022-11-11 at 17:09 +, Alexis Zapata wrote:
> In postgresql 13.5 I have a table (size 3.1 GB) and in this table occurs near 
> to
> 200 updates per second, after 2 days the size table is 7 GB and bloat grow to 
> 45% and
>  the query operations are degraded. vacuum runs every 5 seconds over this. 
> but the
> bloat growth continues, to solve the
> problem quickly, we have made a replica of the table with a trigger, then a 
> copy
> of the data and in a
> transaction we rename the table, but it would not be the best solution.
> Some suggestion about stop this size increase or parameter to setting up?

You'd be most happy with HOT updates.  Make sure that there is no index on any 
of
the columns you update, and change the table to have a "fillfactor" less than
100.  Then you can get HOT updates which don't require VACUUM for cleaning up.

https://www.cybertec-postgresql.com/en/hot-updates-in-postgresql-for-better-performance/

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com


How to check stream replication latest history status

2022-11-13 Thread 徐志宇徐
Hi All

  I set up Postgres 11 stream replication env.Use keepalived to control
VIP and a script to check Postgres status.
  Usually. I could check the stream replication status by this view
"pg_stat_replcation" on primary server.
  Check this view "pg_stat_wal_receiver" on standby server.

  Sometime the primary server abnormal or the network interrupt by
unexpected issue.

  Both of them no record "pg_stat_replcation", "pg_stat_wal_receiver"
on that time.

  I want to know the stream replication gap between primary and standby
before standby promote.
  If the primary and standby have long replay_lag. I don't want to
standby server promote.

  Is there any view or function which store the latest stream
replication gap(replay_lag) between primary and standby ?
  Or it there any workaround to know this stream replication
gap(replay_lag) infomation ?  Thanks for your help.


Jack.Xu


Re: Upgrading to v12

2022-11-13 Thread Adrian Klaver

On 11/12/22 18:18, Brad White wrote:


>  How where the restored copies made on the original cluster?
I guess I'm not understanding the confusion here. They were restored
with the same script but to a different DB name and with the 9.4
executables.
In fact, that was why the script was originally written, so we could
restore and test the backups.


The confusion came from this:

"I only need the primary.
Not the half dozen restored copies."

I initially assumed, correctly as it turns out, that they all existed on 
the production cluster and where duplicates.


Then you posted:

"Over time, we've restored multiple copies for testing and reproducing 
various issues.


I'm only trying to set up replication one one of those copies. "

and showed a process where they being restored to other clusters.

At that point I was lost as to what copies meant and where they came from.

Not sure that this is actually pertinent to the problem at hand, I was 
just trying to nail down the moving pieces.






I've since hijacked it and used it to restore to other versions.



--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Upgrading to v12

2022-11-13 Thread Adrian Klaver

On 11/12/22 22:07, Tom Lane wrote:

Ron  writes:

On 11/11/22 23:09, Adrian Klaver wrote:

2) For your explanation above, pg_dump from 9.4(5432) to pg_restore
12(5433) the issue would be ...\9.4\bin\pg_dump.exe of 9.4 and pg_restore
of said dump file to version 12. When moving up in version you need to use
the newer version of pg_dump(...\12\bin\pg_dump.exe) to dump the 9.4
instance and then the version 12 pg_restore to the 12 instance. Both
programs are backwards compatible, not forwards compatible.



Unless there's some bug (you're running a /really/ old version of 9.4), you
might be able to get away with using the 9.4 binary.


Yeah.  The recommendation to use the later version's pg_dump for a
migration is in the nature of "this is best practice", not "this is
the only way that will work".  The argument for it is that the older
pg_dump might have bugs that are fixed in the newer version.  But
such bugs are rare, so usually it'll work fine to use the older one.
We do endeavor to make sure that older dump output will load into
newer versions, because in disaster-recovery scenarios an older
dump might be all you have.


I stand corrected.

I should have read the Notes here:

https://www.postgresql.org/docs/current/app-pgdump.html



regards, tom lane




--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Table : Bloat grow high

2022-11-13 Thread Laurenz Albe
On Sun, 2022-11-13 at 14:50 +, Alexis Zapata wrote:
> I found that xmin does not change when running the vacuum.

Which xmin?

Yours,
Laurenz Albe




ON CONFLICT and WHERE

2022-11-13 Thread Adrian Klaver

In process of answering an SO question I ran across the below.

The original question example:

CREATE TABLE books (
id int4 NOT NULL,
version int8 NOT NULL,
updated timestamp NULL,
CONSTRAINT books_pkey PRIMARY KEY (id)
);

INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
WHERE version IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP;

select *, CURRENT_TIMESTAMP, updated + INTERVAL '2min' < 
CURRENT_TIMESTAMP from books where id = 12;


INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
WHERE updated IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP;

select *, CURRENT_TIMESTAMP, updated + INTERVAL '2min' < 
CURRENT_TIMESTAMP from books where id = 12;


With select results as:

 id | version |  updated   |   current_timestamp 
| ?column?

+-+++--
 12 |   0 | 11/13/2022 12:21:38.032578 | 11/13/2022 12:21:38.057545 
PST | f



 id | version |  updated   |   current_timestamp 
| ?column?

+-+++--
 12 |   1 | 11/13/2022 12:21:38.058673 | 11/13/2022 12:21:40.686231 
PST | f



I have not used WHERE with ON CONFLICT myself so it took longer then I 
care to admit to correct the above to:


DROP TABLE IF EXISTS books;

CREATE TABLE books (
id int4 NOT NULL,
version int8 NOT NULL,
updated timestamp NULL,
CONSTRAINT books_pkey PRIMARY KEY (id)
);

INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP
WHERE books.version IS NULL OR books.updated + INTERVAL '2min' < 
CURRENT_TIMESTAMP;


select *, CURRENT_TIMESTAMP, updated + INTERVAL '2min' < 
CURRENT_TIMESTAMP from books where id = 12;


INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP
WHERE books.version IS NULL OR books.updated + INTERVAL '2min' < 
CURRENT_TIMESTAMP;


select *, CURRENT_TIMESTAMP, updated + INTERVAL '2min' < 
CURRENT_TIMESTAMP from books where id = 12



With select results as:

 id | version |  updated   |   current_timestamp 
| ?column?

+-+++--
 12 |   0 | 11/13/2022 12:32:01.427769 | 11/13/2022 12:32:01.463705 
PST | f



id | version |  updated   |   current_timestamp 
  | ?column?

+-+++--
 12 |   0 | 11/13/2022 12:32:01.427769 | 11/13/2022 12:32:01.476484 
PST | f




I ran this on both version 14 and 15 with same results.

The question is why did the first case just ignore the WHERE instead of 
throwing a syntax error?



--
Adrian Klaver
adrian.kla...@aklaver.com




Q: fixing collation version mismatches

2022-11-13 Thread Karsten Hilbert
Dear all,

just to confirm my understanding:

Is it correct to say that the following sequence will "fix"
all current collation version issues in a given database ?

REINDEX DATABASE db_in_question;
ALTER DATABASE db_in_question REFRESH COLLATION VERSION;
ALTER COLLATION every_collation_from_pg_collation REFRESH VERSION;

Note that I am currently _not_ concerned with minimizing
work by running this on objects only that really need a
reindex/refresh.

Thanks,
Karsten
--
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B




Re: Q: fixing collation version mismatches

2022-11-13 Thread Christophe Pettus



> On Nov 13, 2022, at 12:45, Karsten Hilbert  wrote:
>   REINDEX DATABASE db_in_question;
>   ALTER DATABASE db_in_question REFRESH COLLATION VERSION;
>   ALTER COLLATION every_collation_from_pg_collation REFRESH VERSION;

I may be totally off-base here, but shouldn't the REINDEX be the last step?




Re: Setting up replication on Windows, v9.4

2022-11-13 Thread Ron
Note that WAL replication replicates *the whole instance* not just specific 
databases.  You need logical replication for that.


Also, I just learned that 9.4 *does* have WAL replication slots (which makes 
replication *much* easier).  v9.6 is where replication is "like rolling off 
a log" simple.


On 11/4/22 17:59, Brad White wrote:
I'm setting up a backup for our primary postgres server using the archived 
WAL files.

Then I'll try to upgrade it to Streaming Replication.
Then I'll upgrade the system to v.latest.
For now, we are on v.9.4.

I do a base backup from the primary to a directory on the NAS.

      "C:\Program Files\PostgreSQL\9.4\bin\pg_basebackup.exe" -D 
\\diskstation\AccessData\Dev\Backup -P -X s -v -h 192.168.1.118 -p 5432 -U 
postgres


That appears to go fine.
Then I delete data\*.* and copy everything except the config files from 
the backup into data.


Copy in recovery.conf
--
    standby_mode          = 'on'
    primary_conninfo      = 'host=192.168.1.118 port=5432 user=replication 
password=**'

    restore_command = 'copy "DISKSTATION\\AccessData\\WALfiles\\%f" "%p"'
--
Copy in postgresql.conf, with settings
--
listen_addresses = '127.0.0.1,192.168.1.118'
wal_level = archive
hot_standby = on
--
Interestingly, the recovery file says
# Note that recovery.conf must be in $PGDATA directory.
# It should NOT be located in the same directory as postgresql.conf
Those seem contradictory.
And if I remove the postgresql, it just refuses to start.

With all this in place, I start the service, it runs for a bit, then shuts 
down.

No errors in the event log.
5 postgres processes are left running along with a pid file.

The log file says
--
LOG:  database system was interrupted while in recovery at log time 
2022-11-04 13:17:28 PDT
HINT:  If this has occurred more than once some data might be corrupted 
and you might need to choose an earlier recovery target.

FATAL:  the database system is starting up
FATAL:  the database system is starting up
FATAL:  the database system is starting up
LOG:  entering standby mode
LOG:  consistent recovery state reached at 6A/3590
LOG:  record with zero length at 6A/3590
LOG:  started streaming WAL from primary at 6A/3500 on timeline 1
LOG:  redo starts at 6A/3590
FATAL:  the database system is starting up
FATAL:  the database system is starting up
FATAL:  the database system is starting up

FATAL:  the database system is starting up
FATAL:  the database system is starting up

So it seems that the backup didn't work as well as first appeared.


--
Angular momentum makes the world go 'round.

Re: ON CONFLICT and WHERE

2022-11-13 Thread Tom Lane
Adrian Klaver  writes:
> INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
> ON CONFLICT (id)
> WHERE updated IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP
> DO UPDATE
> SET version = books.version + 1, updated = CURRENT_TIMESTAMP;

> I have not used WHERE with ON CONFLICT myself so it took longer then I 
> care to admit to correct the above to:

> INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
> ON CONFLICT (id)
> DO UPDATE
> SET version = books.version + 1, updated = CURRENT_TIMESTAMP
> WHERE books.version IS NULL OR books.updated + INTERVAL '2min' < 
> CURRENT_TIMESTAMP;

> The question is why did the first case just ignore the WHERE instead of 
> throwing a syntax error?

A WHERE placed there is an index_predicate attachment to the ON CONFLICT
clause.  It doesn't have any run-time effect other than to allow partial
indexes to be chosen as arbiter indexes.  TFM explains

index_predicate

Used to allow inference of partial unique indexes. Any indexes
that satisfy the predicate (which need not actually be partial
indexes) can be inferred.

This strikes me as a bit of a foot-gun.  I wonder if we should make
it safer by insisting that the resolved index be partial when there's
a WHERE clause here.  (This documentation text is about as clear as
mud, too.  What does "inferred" mean here?  I think it means "chosen as
arbiter index", but maybe I misunderstand.)

regards, tom lane




Re: ON CONFLICT and WHERE

2022-11-13 Thread Adrian Klaver

On 11/13/22 13:07, Tom Lane wrote:

Adrian Klaver  writes:

INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
WHERE updated IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP;



I have not used WHERE with ON CONFLICT myself so it took longer then I
care to admit to correct the above to:



INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP
WHERE books.version IS NULL OR books.updated + INTERVAL '2min' <
CURRENT_TIMESTAMP;



The question is why did the first case just ignore the WHERE instead of
throwing a syntax error?


A WHERE placed there is an index_predicate attachment to the ON CONFLICT
clause.  It doesn't have any run-time effect other than to allow partial
indexes to be chosen as arbiter indexes.  TFM explains

 index_predicate

 Used to allow inference of partial unique indexes. Any indexes
 that satisfy the predicate (which need not actually be partial
 indexes) can be inferred.

This strikes me as a bit of a foot-gun.  I wonder if we should make
it safer by insisting that the resolved index be partial when there's
a WHERE clause here.  (This documentation text is about as clear as
mud, too.  What does "inferred" mean here?  I think it means "chosen as
arbiter index", but maybe I misunderstand.)


Alright I see how another use of WHERE comes into play.

I do agree with the clarity of the description, especially after looking 
at the example:


"
Insert new distributor if possible; otherwise DO NOTHING. Example 
assumes a unique index has been defined that constrains values appearing 
in the did column on a subset of rows where the is_active Boolean column 
evaluates to true:


-- This statement could infer a partial unique index on "did"
-- with a predicate of "WHERE is_active", but it could also
-- just use a regular unique constraint on "did"
INSERT INTO distributors (did, dname) VALUES (10, 'Conrad International')
ON CONFLICT (did) WHERE is_active DO NOTHING;
"

I honestly cannot figure out what that is saying.



regards, tom lane


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: ON CONFLICT and WHERE

2022-11-13 Thread Peter Geoghegan
On Sun, Nov 13, 2022 at 1:07 PM Tom Lane  wrote:
> A WHERE placed there is an index_predicate attachment to the ON CONFLICT
> clause.  It doesn't have any run-time effect other than to allow partial
> indexes to be chosen as arbiter indexes.  TFM explains
>
> index_predicate
>
> Used to allow inference of partial unique indexes. Any indexes
> that satisfy the predicate (which need not actually be partial
> indexes) can be inferred.
>
> This strikes me as a bit of a foot-gun.  I wonder if we should make
> it safer by insisting that the resolved index be partial when there's
> a WHERE clause here.

I don't think that it would be safer.

Adrian has asked why it's possible to attach an arbitrary
index_predicate type WHERE clause to an ON CONFLICT query, without
that really changing the behavior of the statement. That *is* a little
odd, so it's certainly a fair question (I can recall perhaps as many
as 5 similar questions over the years). But it's not the end of the
world, either -- there are far worse things.

I think that it would be a lot worse (just for example) to have your
ON CONFLICT query suddenly start throwing an ERROR in production, just
because you replaced a partial unique index with a unique constraint.
If we have a suitable unique index or constraint, why wouldn't we use
it in ON CONFLICT? Maybe it won't work out that way (maybe there won't
be any suitable unique index or constraint), but why not do our utmost
to insulate the user from what might be a serious production issue?
That was the guiding principle.

Overall I'm quite happy with the amount of foot-guns ON CONFLICT has,
especially compared to other comparable features in other DB systems
(which had plenty). There are one or two ostensibly odd things about
the syntax that are downstream consequences of trying to make the
constraint/unique index inference process maximally forgiving. I'm
pretty happy with that trade-off.

> (This documentation text is about as clear as
> mud, too.  What does "inferred" mean here?  I think it means "chosen as
> arbiter index", but maybe I misunderstand.)

Unique index/constraint inference is the process by which we choose an
arbiter index. See the second paragraph of the "ON CONFLICT Clause"
section of the INSERT docs.

-- 
Peter Geoghegan




Re: Q: fixing collation version mismatches

2022-11-13 Thread Karsten Hilbert
Am Sun, Nov 13, 2022 at 12:46:53PM -0800 schrieb Christophe Pettus:

> > On Nov 13, 2022, at 12:45, Karsten Hilbert  wrote:
> > REINDEX DATABASE db_in_question;
> > ALTER DATABASE db_in_question REFRESH COLLATION VERSION;
> > ALTER COLLATION every_collation_from_pg_collation REFRESH VERSION;
>
> I may be totally off-base here, but shouldn't the REINDEX be the last step?

To my understanding, the REFRESH statements "merely" update
the version information stored in the related objects. They
do not change anything else; and the REINDEX does not
reference them in any way.

I suppose the REINDEX goes first as it does the actual fixing
of now-invalid objects by rebuilding them. After that one is
back to a usable database state, even if left with pesky
(albeit harmless) warnings on version mismatches -- which to
get rid of one runs the REFRESH statements.

Or so my understanding...

Which is why my question still stands: does the above
three-strikes operation safely take care of any collation
issues that may currently exist in a database ?

Karsten
--
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B




Re: Q: fixing collation version mismatches

2022-11-13 Thread Julien Rouhaud
Le lun. 14 nov. 2022 à 05:58, Karsten Hilbert  a
écrit :

> Am Sun, Nov 13, 2022 at 12:46:53PM -0800 schrieb Christophe Pettus:
>
> > > On Nov 13, 2022, at 12:45, Karsten Hilbert 
> wrote:
> > > REINDEX DATABASE db_in_question;
> > > ALTER DATABASE db_in_question REFRESH COLLATION VERSION;
> > > ALTER COLLATION every_collation_from_pg_collation REFRESH VERSION;
> >
> > I may be totally off-base here, but shouldn't the REINDEX be the last
> step?
>
> To my understanding, the REFRESH statements "merely" update
> the version information stored in the related objects. They
> do not change anything else; and the REINDEX does not
> reference them in any way.
>
> I suppose the REINDEX goes first as it does the actual fixing
> of now-invalid objects by rebuilding them. After that one is
> back to a usable database state, even if left with pesky
> (albeit harmless) warnings on version mismatches -- which to
> get rid of one runs the REFRESH statements.
>
> Or so my understanding...
>

yes exactly. but it's likely that people will have some form of automation
to run the reindex if there's any discrepancy between the recorded
collation version and recorded version, so if you first fix the versions
metada and then encounter any error during the reindex, you won't know if
you need to reindex or not and might end up with corrupt indexes.

>
> Which is why my question still stands: does the above
> three-strikes operation safely take care of any collation
> issues that may currently exist in a database ?
>

yes

>


Re: Q: fixing collation version mismatches

2022-11-13 Thread Julien Rouhaud
Le lun. 14 nov. 2022 à 13:10, Julien Rouhaud  a écrit :

> yes exactly. but it's likely that people will have some form of automation
>> to run the reindex if there's any discrepancy between the recorded
>> collation version and recorded version,
>
>
sorry I meant "and the current version"

>


Re: ON CONFLICT and WHERE

2022-11-13 Thread jian he
On Mon, Nov 14, 2022 at 2:55 AM Adrian Klaver 
wrote:

> On 11/13/22 13:07, Tom Lane wrote:
> > Adrian Klaver  writes:
> >> INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
> >> ON CONFLICT (id)
> >> WHERE updated IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP
> >> DO UPDATE
> >> SET version = books.version + 1, updated = CURRENT_TIMESTAMP;
> >
> >> I have not used WHERE with ON CONFLICT myself so it took longer then I
> >> care to admit to correct the above to:
> >
> >> INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
> >> ON CONFLICT (id)
> >> DO UPDATE
> >> SET version = books.version + 1, updated = CURRENT_TIMESTAMP
> >> WHERE books.version IS NULL OR books.updated + INTERVAL '2min' <
> >> CURRENT_TIMESTAMP;
> >
> >> The question is why did the first case just ignore the WHERE instead of
> >> throwing a syntax error?
> >
> > A WHERE placed there is an index_predicate attachment to the ON CONFLICT
> > clause.  It doesn't have any run-time effect other than to allow partial
> > indexes to be chosen as arbiter indexes.  TFM explains
> >
> >  index_predicate
> >
> >  Used to allow inference of partial unique indexes. Any indexes
> >  that satisfy the predicate (which need not actually be partial
> >  indexes) can be inferred.
> >
> > This strikes me as a bit of a foot-gun.  I wonder if we should make
> > it safer by insisting that the resolved index be partial when there's
> > a WHERE clause here.  (This documentation text is about as clear as
> > mud, too.  What does "inferred" mean here?  I think it means "chosen as
> > arbiter index", but maybe I misunderstand.)
>
> Alright I see how another use of WHERE comes into play.
>
> I do agree with the clarity of the description, especially after looking
> at the example:
>
> "
> Insert new distributor if possible; otherwise DO NOTHING. Example
> assumes a unique index has been defined that constrains values appearing
> in the did column on a subset of rows where the is_active Boolean column
> evaluates to true:
>
> -- This statement could infer a partial unique index on "did"
> -- with a predicate of "WHERE is_active", but it could also
> -- just use a regular unique constraint on "did"
> INSERT INTO distributors (did, dname) VALUES (10, 'Conrad International')
>  ON CONFLICT (did) WHERE is_active DO NOTHING;
> "
>
> I honestly cannot figure out what that is saying.
>
> >
> >   regards, tom lane
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>
>
>
INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
WHERE version IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP;

Since id is already the primary key, it skipped the WHERE part. it resolves
to the DO UPDATE part.

from test code.

> create table insertconflicttest(key int4, fruit text);
> create unique index partial_key_index on insertconflicttest(key) where
> fruit like '%berry';
>

In this case, the  on conflict clause should be exactly like *on conflict
(key) where fruit like '%berry'*

-- fails
insert into insertconflicttest values (23, 'Blackberry') on conflict (key)
do update set fruit = excluded.fruit;
ERROR:  there is no unique or exclusion constraint matching the ON CONFLICT
specification
insert into insertconflicttest values (23, 'Blackberry') on conflict (key)
where fruit like '%berry' or fruit = 'consequential' do nothing;
ERROR:  there is no unique or exclusion constraint matching the ON CONFLICT
specification
insert into insertconflicttest values (23, 'Blackberry') on conflict
(fruit) where fruit like '%berry' do update set fruit = excluded.fruit;
ERROR:  there is no unique or exclusion constraint matching the ON CONFLICT
specification


-- 
 I recommend David Deutsch's <>

  Jian