RE: Table : Bloat grow high
Hi Laurenz , I found that xmin does not change when running the vacuum. De: Laurenz Albe Enviado: sábado, 12 de noviembre de 2022 9:05 a. m. Para: Alexis Zapata ; pgsql-general@lists.postgresql.org Asunto: Re: Table : Bloat grow high On Fri, 2022-11-11 at 17:09 +, Alexis Zapata wrote: > In postgresql 13.5 I have a table (size 3.1 GB) and in this table occurs near > to > 200 updates per second, after 2 days the size table is 7 GB and bloat grow to > 45% and > the query operations are degraded. vacuum runs every 5 seconds over this. > but the > bloat growth continues, to solve the > problem quickly, we have made a replica of the table with a trigger, then a > copy > of the data and in a > transaction we rename the table, but it would not be the best solution. > Some suggestion about stop this size increase or parameter to setting up? You'd be most happy with HOT updates. Make sure that there is no index on any of the columns you update, and change the table to have a "fillfactor" less than 100. Then you can get HOT updates which don't require VACUUM for cleaning up. https://www.cybertec-postgresql.com/en/hot-updates-in-postgresql-for-better-performance/ Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
How to check stream replication latest history status
Hi All I set up Postgres 11 stream replication env.Use keepalived to control VIP and a script to check Postgres status. Usually. I could check the stream replication status by this view "pg_stat_replcation" on primary server. Check this view "pg_stat_wal_receiver" on standby server. Sometime the primary server abnormal or the network interrupt by unexpected issue. Both of them no record "pg_stat_replcation", "pg_stat_wal_receiver" on that time. I want to know the stream replication gap between primary and standby before standby promote. If the primary and standby have long replay_lag. I don't want to standby server promote. Is there any view or function which store the latest stream replication gap(replay_lag) between primary and standby ? Or it there any workaround to know this stream replication gap(replay_lag) infomation ? Thanks for your help. Jack.Xu
Re: Upgrading to v12
On 11/12/22 18:18, Brad White wrote: > How where the restored copies made on the original cluster? I guess I'm not understanding the confusion here. They were restored with the same script but to a different DB name and with the 9.4 executables. In fact, that was why the script was originally written, so we could restore and test the backups. The confusion came from this: "I only need the primary. Not the half dozen restored copies." I initially assumed, correctly as it turns out, that they all existed on the production cluster and where duplicates. Then you posted: "Over time, we've restored multiple copies for testing and reproducing various issues. I'm only trying to set up replication one one of those copies. " and showed a process where they being restored to other clusters. At that point I was lost as to what copies meant and where they came from. Not sure that this is actually pertinent to the problem at hand, I was just trying to nail down the moving pieces. I've since hijacked it and used it to restore to other versions. -- Adrian Klaver adrian.kla...@aklaver.com
Re: Upgrading to v12
On 11/12/22 22:07, Tom Lane wrote: Ron writes: On 11/11/22 23:09, Adrian Klaver wrote: 2) For your explanation above, pg_dump from 9.4(5432) to pg_restore 12(5433) the issue would be ...\9.4\bin\pg_dump.exe of 9.4 and pg_restore of said dump file to version 12. When moving up in version you need to use the newer version of pg_dump(...\12\bin\pg_dump.exe) to dump the 9.4 instance and then the version 12 pg_restore to the 12 instance. Both programs are backwards compatible, not forwards compatible. Unless there's some bug (you're running a /really/ old version of 9.4), you might be able to get away with using the 9.4 binary. Yeah. The recommendation to use the later version's pg_dump for a migration is in the nature of "this is best practice", not "this is the only way that will work". The argument for it is that the older pg_dump might have bugs that are fixed in the newer version. But such bugs are rare, so usually it'll work fine to use the older one. We do endeavor to make sure that older dump output will load into newer versions, because in disaster-recovery scenarios an older dump might be all you have. I stand corrected. I should have read the Notes here: https://www.postgresql.org/docs/current/app-pgdump.html regards, tom lane -- Adrian Klaver adrian.kla...@aklaver.com
Re: Table : Bloat grow high
On Sun, 2022-11-13 at 14:50 +, Alexis Zapata wrote: > I found that xmin does not change when running the vacuum. Which xmin? Yours, Laurenz Albe
ON CONFLICT and WHERE
In process of answering an SO question I ran across the below. The original question example: CREATE TABLE books ( id int4 NOT NULL, version int8 NOT NULL, updated timestamp NULL, CONSTRAINT books_pkey PRIMARY KEY (id) ); INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP) ON CONFLICT (id) WHERE version IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP DO UPDATE SET version = books.version + 1, updated = CURRENT_TIMESTAMP; select *, CURRENT_TIMESTAMP, updated + INTERVAL '2min' < CURRENT_TIMESTAMP from books where id = 12; INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP) ON CONFLICT (id) WHERE updated IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP DO UPDATE SET version = books.version + 1, updated = CURRENT_TIMESTAMP; select *, CURRENT_TIMESTAMP, updated + INTERVAL '2min' < CURRENT_TIMESTAMP from books where id = 12; With select results as: id | version | updated | current_timestamp | ?column? +-+++-- 12 | 0 | 11/13/2022 12:21:38.032578 | 11/13/2022 12:21:38.057545 PST | f id | version | updated | current_timestamp | ?column? +-+++-- 12 | 1 | 11/13/2022 12:21:38.058673 | 11/13/2022 12:21:40.686231 PST | f I have not used WHERE with ON CONFLICT myself so it took longer then I care to admit to correct the above to: DROP TABLE IF EXISTS books; CREATE TABLE books ( id int4 NOT NULL, version int8 NOT NULL, updated timestamp NULL, CONSTRAINT books_pkey PRIMARY KEY (id) ); INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP) ON CONFLICT (id) DO UPDATE SET version = books.version + 1, updated = CURRENT_TIMESTAMP WHERE books.version IS NULL OR books.updated + INTERVAL '2min' < CURRENT_TIMESTAMP; select *, CURRENT_TIMESTAMP, updated + INTERVAL '2min' < CURRENT_TIMESTAMP from books where id = 12; INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP) ON CONFLICT (id) DO UPDATE SET version = books.version + 1, updated = CURRENT_TIMESTAMP WHERE books.version IS NULL OR books.updated + INTERVAL '2min' < CURRENT_TIMESTAMP; select *, CURRENT_TIMESTAMP, updated + INTERVAL '2min' < CURRENT_TIMESTAMP from books where id = 12 With select results as: id | version | updated | current_timestamp | ?column? +-+++-- 12 | 0 | 11/13/2022 12:32:01.427769 | 11/13/2022 12:32:01.463705 PST | f id | version | updated | current_timestamp | ?column? +-+++-- 12 | 0 | 11/13/2022 12:32:01.427769 | 11/13/2022 12:32:01.476484 PST | f I ran this on both version 14 and 15 with same results. The question is why did the first case just ignore the WHERE instead of throwing a syntax error? -- Adrian Klaver adrian.kla...@aklaver.com
Q: fixing collation version mismatches
Dear all, just to confirm my understanding: Is it correct to say that the following sequence will "fix" all current collation version issues in a given database ? REINDEX DATABASE db_in_question; ALTER DATABASE db_in_question REFRESH COLLATION VERSION; ALTER COLLATION every_collation_from_pg_collation REFRESH VERSION; Note that I am currently _not_ concerned with minimizing work by running this on objects only that really need a reindex/refresh. Thanks, Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Re: Q: fixing collation version mismatches
> On Nov 13, 2022, at 12:45, Karsten Hilbert wrote: > REINDEX DATABASE db_in_question; > ALTER DATABASE db_in_question REFRESH COLLATION VERSION; > ALTER COLLATION every_collation_from_pg_collation REFRESH VERSION; I may be totally off-base here, but shouldn't the REINDEX be the last step?
Re: Setting up replication on Windows, v9.4
Note that WAL replication replicates *the whole instance* not just specific databases. You need logical replication for that. Also, I just learned that 9.4 *does* have WAL replication slots (which makes replication *much* easier). v9.6 is where replication is "like rolling off a log" simple. On 11/4/22 17:59, Brad White wrote: I'm setting up a backup for our primary postgres server using the archived WAL files. Then I'll try to upgrade it to Streaming Replication. Then I'll upgrade the system to v.latest. For now, we are on v.9.4. I do a base backup from the primary to a directory on the NAS. "C:\Program Files\PostgreSQL\9.4\bin\pg_basebackup.exe" -D \\diskstation\AccessData\Dev\Backup -P -X s -v -h 192.168.1.118 -p 5432 -U postgres That appears to go fine. Then I delete data\*.* and copy everything except the config files from the backup into data. Copy in recovery.conf -- standby_mode = 'on' primary_conninfo = 'host=192.168.1.118 port=5432 user=replication password=**' restore_command = 'copy "DISKSTATION\\AccessData\\WALfiles\\%f" "%p"' -- Copy in postgresql.conf, with settings -- listen_addresses = '127.0.0.1,192.168.1.118' wal_level = archive hot_standby = on -- Interestingly, the recovery file says # Note that recovery.conf must be in $PGDATA directory. # It should NOT be located in the same directory as postgresql.conf Those seem contradictory. And if I remove the postgresql, it just refuses to start. With all this in place, I start the service, it runs for a bit, then shuts down. No errors in the event log. 5 postgres processes are left running along with a pid file. The log file says -- LOG: database system was interrupted while in recovery at log time 2022-11-04 13:17:28 PDT HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target. FATAL: the database system is starting up FATAL: the database system is starting up FATAL: the database system is starting up LOG: entering standby mode LOG: consistent recovery state reached at 6A/3590 LOG: record with zero length at 6A/3590 LOG: started streaming WAL from primary at 6A/3500 on timeline 1 LOG: redo starts at 6A/3590 FATAL: the database system is starting up FATAL: the database system is starting up FATAL: the database system is starting up FATAL: the database system is starting up FATAL: the database system is starting up So it seems that the backup didn't work as well as first appeared. -- Angular momentum makes the world go 'round.
Re: ON CONFLICT and WHERE
Adrian Klaver writes: > INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP) > ON CONFLICT (id) > WHERE updated IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP > DO UPDATE > SET version = books.version + 1, updated = CURRENT_TIMESTAMP; > I have not used WHERE with ON CONFLICT myself so it took longer then I > care to admit to correct the above to: > INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP) > ON CONFLICT (id) > DO UPDATE > SET version = books.version + 1, updated = CURRENT_TIMESTAMP > WHERE books.version IS NULL OR books.updated + INTERVAL '2min' < > CURRENT_TIMESTAMP; > The question is why did the first case just ignore the WHERE instead of > throwing a syntax error? A WHERE placed there is an index_predicate attachment to the ON CONFLICT clause. It doesn't have any run-time effect other than to allow partial indexes to be chosen as arbiter indexes. TFM explains index_predicate Used to allow inference of partial unique indexes. Any indexes that satisfy the predicate (which need not actually be partial indexes) can be inferred. This strikes me as a bit of a foot-gun. I wonder if we should make it safer by insisting that the resolved index be partial when there's a WHERE clause here. (This documentation text is about as clear as mud, too. What does "inferred" mean here? I think it means "chosen as arbiter index", but maybe I misunderstand.) regards, tom lane
Re: ON CONFLICT and WHERE
On 11/13/22 13:07, Tom Lane wrote: Adrian Klaver writes: INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP) ON CONFLICT (id) WHERE updated IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP DO UPDATE SET version = books.version + 1, updated = CURRENT_TIMESTAMP; I have not used WHERE with ON CONFLICT myself so it took longer then I care to admit to correct the above to: INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP) ON CONFLICT (id) DO UPDATE SET version = books.version + 1, updated = CURRENT_TIMESTAMP WHERE books.version IS NULL OR books.updated + INTERVAL '2min' < CURRENT_TIMESTAMP; The question is why did the first case just ignore the WHERE instead of throwing a syntax error? A WHERE placed there is an index_predicate attachment to the ON CONFLICT clause. It doesn't have any run-time effect other than to allow partial indexes to be chosen as arbiter indexes. TFM explains index_predicate Used to allow inference of partial unique indexes. Any indexes that satisfy the predicate (which need not actually be partial indexes) can be inferred. This strikes me as a bit of a foot-gun. I wonder if we should make it safer by insisting that the resolved index be partial when there's a WHERE clause here. (This documentation text is about as clear as mud, too. What does "inferred" mean here? I think it means "chosen as arbiter index", but maybe I misunderstand.) Alright I see how another use of WHERE comes into play. I do agree with the clarity of the description, especially after looking at the example: " Insert new distributor if possible; otherwise DO NOTHING. Example assumes a unique index has been defined that constrains values appearing in the did column on a subset of rows where the is_active Boolean column evaluates to true: -- This statement could infer a partial unique index on "did" -- with a predicate of "WHERE is_active", but it could also -- just use a regular unique constraint on "did" INSERT INTO distributors (did, dname) VALUES (10, 'Conrad International') ON CONFLICT (did) WHERE is_active DO NOTHING; " I honestly cannot figure out what that is saying. regards, tom lane -- Adrian Klaver adrian.kla...@aklaver.com
Re: ON CONFLICT and WHERE
On Sun, Nov 13, 2022 at 1:07 PM Tom Lane wrote: > A WHERE placed there is an index_predicate attachment to the ON CONFLICT > clause. It doesn't have any run-time effect other than to allow partial > indexes to be chosen as arbiter indexes. TFM explains > > index_predicate > > Used to allow inference of partial unique indexes. Any indexes > that satisfy the predicate (which need not actually be partial > indexes) can be inferred. > > This strikes me as a bit of a foot-gun. I wonder if we should make > it safer by insisting that the resolved index be partial when there's > a WHERE clause here. I don't think that it would be safer. Adrian has asked why it's possible to attach an arbitrary index_predicate type WHERE clause to an ON CONFLICT query, without that really changing the behavior of the statement. That *is* a little odd, so it's certainly a fair question (I can recall perhaps as many as 5 similar questions over the years). But it's not the end of the world, either -- there are far worse things. I think that it would be a lot worse (just for example) to have your ON CONFLICT query suddenly start throwing an ERROR in production, just because you replaced a partial unique index with a unique constraint. If we have a suitable unique index or constraint, why wouldn't we use it in ON CONFLICT? Maybe it won't work out that way (maybe there won't be any suitable unique index or constraint), but why not do our utmost to insulate the user from what might be a serious production issue? That was the guiding principle. Overall I'm quite happy with the amount of foot-guns ON CONFLICT has, especially compared to other comparable features in other DB systems (which had plenty). There are one or two ostensibly odd things about the syntax that are downstream consequences of trying to make the constraint/unique index inference process maximally forgiving. I'm pretty happy with that trade-off. > (This documentation text is about as clear as > mud, too. What does "inferred" mean here? I think it means "chosen as > arbiter index", but maybe I misunderstand.) Unique index/constraint inference is the process by which we choose an arbiter index. See the second paragraph of the "ON CONFLICT Clause" section of the INSERT docs. -- Peter Geoghegan
Re: Q: fixing collation version mismatches
Am Sun, Nov 13, 2022 at 12:46:53PM -0800 schrieb Christophe Pettus: > > On Nov 13, 2022, at 12:45, Karsten Hilbert wrote: > > REINDEX DATABASE db_in_question; > > ALTER DATABASE db_in_question REFRESH COLLATION VERSION; > > ALTER COLLATION every_collation_from_pg_collation REFRESH VERSION; > > I may be totally off-base here, but shouldn't the REINDEX be the last step? To my understanding, the REFRESH statements "merely" update the version information stored in the related objects. They do not change anything else; and the REINDEX does not reference them in any way. I suppose the REINDEX goes first as it does the actual fixing of now-invalid objects by rebuilding them. After that one is back to a usable database state, even if left with pesky (albeit harmless) warnings on version mismatches -- which to get rid of one runs the REFRESH statements. Or so my understanding... Which is why my question still stands: does the above three-strikes operation safely take care of any collation issues that may currently exist in a database ? Karsten -- GPG 40BE 5B0E C98E 1713 AFA6 5BC0 3BEA AC80 7D4F C89B
Re: Q: fixing collation version mismatches
Le lun. 14 nov. 2022 à 05:58, Karsten Hilbert a écrit : > Am Sun, Nov 13, 2022 at 12:46:53PM -0800 schrieb Christophe Pettus: > > > > On Nov 13, 2022, at 12:45, Karsten Hilbert > wrote: > > > REINDEX DATABASE db_in_question; > > > ALTER DATABASE db_in_question REFRESH COLLATION VERSION; > > > ALTER COLLATION every_collation_from_pg_collation REFRESH VERSION; > > > > I may be totally off-base here, but shouldn't the REINDEX be the last > step? > > To my understanding, the REFRESH statements "merely" update > the version information stored in the related objects. They > do not change anything else; and the REINDEX does not > reference them in any way. > > I suppose the REINDEX goes first as it does the actual fixing > of now-invalid objects by rebuilding them. After that one is > back to a usable database state, even if left with pesky > (albeit harmless) warnings on version mismatches -- which to > get rid of one runs the REFRESH statements. > > Or so my understanding... > yes exactly. but it's likely that people will have some form of automation to run the reindex if there's any discrepancy between the recorded collation version and recorded version, so if you first fix the versions metada and then encounter any error during the reindex, you won't know if you need to reindex or not and might end up with corrupt indexes. > > Which is why my question still stands: does the above > three-strikes operation safely take care of any collation > issues that may currently exist in a database ? > yes >
Re: Q: fixing collation version mismatches
Le lun. 14 nov. 2022 à 13:10, Julien Rouhaud a écrit : > yes exactly. but it's likely that people will have some form of automation >> to run the reindex if there's any discrepancy between the recorded >> collation version and recorded version, > > sorry I meant "and the current version" >
Re: ON CONFLICT and WHERE
On Mon, Nov 14, 2022 at 2:55 AM Adrian Klaver wrote: > On 11/13/22 13:07, Tom Lane wrote: > > Adrian Klaver writes: > >> INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP) > >> ON CONFLICT (id) > >> WHERE updated IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP > >> DO UPDATE > >> SET version = books.version + 1, updated = CURRENT_TIMESTAMP; > > > >> I have not used WHERE with ON CONFLICT myself so it took longer then I > >> care to admit to correct the above to: > > > >> INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP) > >> ON CONFLICT (id) > >> DO UPDATE > >> SET version = books.version + 1, updated = CURRENT_TIMESTAMP > >> WHERE books.version IS NULL OR books.updated + INTERVAL '2min' < > >> CURRENT_TIMESTAMP; > > > >> The question is why did the first case just ignore the WHERE instead of > >> throwing a syntax error? > > > > A WHERE placed there is an index_predicate attachment to the ON CONFLICT > > clause. It doesn't have any run-time effect other than to allow partial > > indexes to be chosen as arbiter indexes. TFM explains > > > > index_predicate > > > > Used to allow inference of partial unique indexes. Any indexes > > that satisfy the predicate (which need not actually be partial > > indexes) can be inferred. > > > > This strikes me as a bit of a foot-gun. I wonder if we should make > > it safer by insisting that the resolved index be partial when there's > > a WHERE clause here. (This documentation text is about as clear as > > mud, too. What does "inferred" mean here? I think it means "chosen as > > arbiter index", but maybe I misunderstand.) > > Alright I see how another use of WHERE comes into play. > > I do agree with the clarity of the description, especially after looking > at the example: > > " > Insert new distributor if possible; otherwise DO NOTHING. Example > assumes a unique index has been defined that constrains values appearing > in the did column on a subset of rows where the is_active Boolean column > evaluates to true: > > -- This statement could infer a partial unique index on "did" > -- with a predicate of "WHERE is_active", but it could also > -- just use a regular unique constraint on "did" > INSERT INTO distributors (did, dname) VALUES (10, 'Conrad International') > ON CONFLICT (did) WHERE is_active DO NOTHING; > " > > I honestly cannot figure out what that is saying. > > > > > regards, tom lane > > -- > Adrian Klaver > adrian.kla...@aklaver.com > > > > INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP) ON CONFLICT (id) WHERE version IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP DO UPDATE SET version = books.version + 1, updated = CURRENT_TIMESTAMP; Since id is already the primary key, it skipped the WHERE part. it resolves to the DO UPDATE part. from test code. > create table insertconflicttest(key int4, fruit text); > create unique index partial_key_index on insertconflicttest(key) where > fruit like '%berry'; > In this case, the on conflict clause should be exactly like *on conflict (key) where fruit like '%berry'* -- fails insert into insertconflicttest values (23, 'Blackberry') on conflict (key) do update set fruit = excluded.fruit; ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification insert into insertconflicttest values (23, 'Blackberry') on conflict (key) where fruit like '%berry' or fruit = 'consequential' do nothing; ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification insert into insertconflicttest values (23, 'Blackberry') on conflict (fruit) where fruit like '%berry' do update set fruit = excluded.fruit; ERROR: there is no unique or exclusion constraint matching the ON CONFLICT specification -- I recommend David Deutsch's <> Jian