TOAST table size in bytes growing despite working autovacuum
Dear all,

I have a table which contains a "json" column and it gets heavily updated. Before introducing toast.autovacuum_vacuum_scale_factor=0.05 and toast.autovacuum_vacuum_cost_limit=1000, this table bloated to nearly 1 TB in a short while. Now the n_dead_tup value is nicely under control, but the table is still slowly growing in size, though not in rows. The odd thing is that the value of n_live_tup in the TOAST table is twice that of the main table. I know it is a statistical value, but this does not feel right.

Why is that? What can I do to make it stop growing?

select n_live_tup, n_dead_tup, last_autovacuum, autovacuum_count
from pg_stat_all_tables where relname = 'player_data_states';
─[ RECORD 1 ]────┬──────────────────────────────
n_live_tup       │ 84730
n_dead_tup       │ 8336
last_autovacuum  │ 2020-06-15 08:23:58.88791+00
autovacuum_count │ 11306

select n_live_tup, n_dead_tup, last_autovacuum, autovacuum_count
from pg_stat_all_tables where relname = 'pg_toast_293406';
─[ RECORD 1 ]────┬──────────────────────────────
n_live_tup       │ 168486
n_dead_tup       │ 9835
last_autovacuum  │ 2020-06-15 08:33:22.566087+00
autovacuum_count │ 41021

The PG server is 11.7 (Debian 11.7-2.pgdg90+1) and the table is:

     Column     │           Type           │ Nullable │ Storage
────────────────┼──────────────────────────┼──────────┼──────────
 id             │ bigint                   │ not null │ plain
 cage_player_id │ bigint                   │ not null │ plain
 cage_code      │ integer                  │ not null │ plain
 player_data    │ json                     │          │ extended
 update_time    │ timestamp with time zone │ not null │ plain
Indexes:
    "player_data_states_pk" PRIMARY KEY, btree (id)
    "player_data_states_uk1" UNIQUE CONSTRAINT, btree (cage_player_id, cage_code)
Referenced by:
    TABLE "awards.player_data_state_changes" CONSTRAINT "player_data_state_changes_fk1" FOREIGN KEY (player_data_state_id) REFERENCES awards.player_data_states(id)
Publications:
    "awards"
Options: fillfactor=90, toast.autovacuum_vacuum_scale_factor=0.05, toast.autovacuum_vacuum_cost_limit=1000

Best regards
--
Kristjan Mustkivi
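For reference, the storage parameters listed under Options above are per-table settings that can be applied with ALTER TABLE. A minimal sketch, assuming the table lives in the awards schema as implied by the foreign key definition:

    -- set the heap fillfactor and the TOAST autovacuum thresholds in one statement;
    -- the toast.* options apply to the table's associated TOAST relation
    ALTER TABLE awards.player_data_states
        SET (fillfactor = 90,
             toast.autovacuum_vacuum_scale_factor = 0.05,
             toast.autovacuum_vacuum_cost_limit = 1000);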
Re: TOAST table size in bytes growing despite working autovacuum
On Mon, Jun 15, 2020 at 12:17 PM Laurenz Albe wrote:
>
> On Mon, 2020-06-15 at 11:51 +0300, Kristjan Mustkivi wrote:
> > I have a table which contains a "json" column and it gets heavily
> > updated. Before introducing toast.autovacuum_vacuum_scale_factor=0.05
> > and toast.autovacuum_vacuum_cost_limit=1000, this table bloated to
> > nearly 1 TB in a short while. Now the n_dead_tup value is nicely under
> > control, but the table is still slowly growing in size, though not in
> > rows. The odd thing is that the value of n_live_tup in the TOAST table
> > is twice that of the main table. I know it is a statistical value, but
> > this does not feel right.
> >
> > Why is that? What can I do to make it stop growing?
>
> It is not surprising if there are more entries in the TOAST table than
> in the base table: a big value will be split into several chunks,
> each of which is an entry in the TOAST table.
>
> To see if the TOAST table is bloated, use pgstattuple:
>
> SELECT * FROM pgstattuple('pg_toast.pg_toast_293406');
>
> Vacuum does not remove existing bloat, it just prevents increased bloat.

Thank you Laurenz,

So the TOAST table having more entries than the base table is because each toasted value is split into chunks and every chunk counts as a separate entry - good to know!

Still, pgstattuple reveals that the table size is 715 MB while the live tuple length is just 39 MB, and about 94% of the table is vacant. I do not have much experience interpreting this, but it would seem the table is still getting bloated. Should the autovacuum be made even more aggressive, e.g. toast.autovacuum_vacuum_scale_factor=0.01 instead of 0.05, and tweaked further when necessary until the size stabilizes (more precisely, until pgstattuple shows the bloat to be under control)?

SELECT * FROM pgstattuple('pg_toast.pg_toast_293406');
─[ RECORD 1 ]──────┬───────────
table_len          │ 715776000
tuple_count        │ 25545
tuple_len          │ 39241366
tuple_percent      │ 5.48
dead_tuple_count   │ 1116
dead_tuple_len     │ 1930508
dead_tuple_percent │ 0.27
free_space         │ 669701052
free_percent       │ 93.56

With all the best,
--
Kristjan Mustkivi
Re: TOAST table size in bytes growing despite working autovacuum
On Mon, Jun 15, 2020 at 4:37 PM Laurenz Albe wrote:
>
> On Mon, 2020-06-15 at 13:47 +0300, Kristjan Mustkivi wrote:
> > Still, pgstattuple reveals that the table size is 715 MB while the live
> > tuple length is just 39 MB, and about 94% of the table is vacant. I do
> > not have much experience interpreting this, but it would seem the table
> > is still getting bloated. Should the autovacuum be made even more
> > aggressive, e.g. toast.autovacuum_vacuum_scale_factor=0.01 instead of
> > 0.05, and tweaked further when necessary until the size stabilizes
> > (more precisely, until pgstattuple shows the bloat to be under
> > control)?
> >
> > SELECT * FROM pgstattuple('pg_toast.pg_toast_293406');
> > ─[ RECORD 1 ]──────┬───────────
> > table_len          │ 715776000
> > tuple_count        │ 25545
> > tuple_len          │ 39241366
> > tuple_percent      │ 5.48
> > dead_tuple_count   │ 1116
> > dead_tuple_len     │ 1930508
> > dead_tuple_percent │ 0.27
> > free_space         │ 669701052
> > free_percent       │ 93.56
>
> Indeed, the table is almost entirely air.
>
> You should schedule down time and run a VACUUM (FULL) on that table.
> That will rewrite the table and get rid of the bloat.

Hello!

But in order to avoid the situation happening again (as it will with the current settings), I should likely also make the autovacuuming on the TOAST table even more aggressive by lowering toast.autovacuum_vacuum_scale_factor, right?

Sorry to pester with this, and thank you for the feedback - it is much appreciated!

--
Kristjan Mustkivi
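Putting the two suggestions together, a minimal sketch, assuming the same awards.player_data_states table as before. Note that VACUUM (FULL) takes an ACCESS EXCLUSIVE lock and rewrites the table together with its TOAST table, so it needs a maintenance window:

    -- one-off: rewrite the table and its TOAST table to reclaim the ~94% free space
    VACUUM (FULL, VERBOSE) awards.player_data_states;

    -- ongoing: trigger TOAST autovacuum at roughly 1% dead tuples instead of 5%
    ALTER TABLE awards.player_data_states
        SET (toast.autovacuum_vacuum_scale_factor = 0.01);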
Re: Batch update million records in prd DB
Hi Yi,

I found that in Postgres, memory is slowly eaten away when doing updates within a PL/pgSQL loop; it only gets released once the whole block completes. While that is OK for small tables, you will eventually run out of memory on really big ones.

The working approach was to do the loop in e.g. a Python script that called the DML statements and also called commit. Several million rows is fortunately a relatively small number to update, but once you get to billions this approach would likely not work.

Note that after each batch you should also call VACUUM before starting a new one, to avoid significant table bloat.

BR,
Kristjan

On Wed, Feb 24, 2021 at 3:01 PM Yi Sun wrote:
>
> Hello,
>
> We now need to update several million records in a table in a prd DB. If we use batches of 1000 records and commit after each batch, will it affect the prd application? A sample script is below:
>
> Sample script:
>
> DO $MAIN$
> DECLARE
>   affect_count integer;
>   chunk_size CONSTANT integer := 1000;
>   sleep_sec CONSTANT numeric := 0.1;
> BEGIN
>
>   loop
>
>     exit when affect_count = 0;
>
>     UPDATE tbl a
>        SET name = ''
>      WHERE a.id IN (SELECT id
>                       FROM tbl b
>                      WHERE name IS NULL
>                      LIMIT chunk_size);
>
>     GET DIAGNOSTICS affect_count = ROW_COUNT;
>
>     commit;
>
>     PERFORM pg_sleep(sleep_sec);
>
>   end loop;
> END;
> $MAIN$;
>
> Thanks and best regards

--
Kristjan Mustkivi
Email: kristjan.mustk...@gmail.com
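A minimal sketch of what one iteration of such an external driver script could send, assuming the tbl/name/id columns from the quoted example and a 1000-row chunk. The driver repeats this until the UPDATE reports 0 rows affected, and the VACUUM must be issued outside any explicit transaction:

    -- one batch, committed immediately so the dead row versions become reclaimable
    BEGIN;
    UPDATE tbl a
       SET name = ''
     WHERE a.id IN (SELECT b.id
                      FROM tbl b
                     WHERE b.name IS NULL
                     LIMIT 1000);
    COMMIT;

    -- between batches: reclaim dead tuples to keep bloat in check
    VACUUM tbl;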
CREATE SUBSCRIPTION not picking up .pgpass while psql does
Hello,

I do not understand why CREATE SUBSCRIPTION does not pick up .pgpass when psql does:

root@pg.newtest:/# psql 'host=pg.oldtest port=5432 user=pg_replication dbname=oldtest'
oldtest=# \q

root@pg.newtest:/# psql newtest postgres
newtest=# CREATE SUBSCRIPTION sub_pgupgrade CONNECTION 'host=pg.oldtest port=5432 user=pg_replication dbname=oldtest' PUBLICATION pub_pgupgrade;
ERROR:  could not connect to the publisher: connection to server at "pg.oldtest" (x.x.x.x), port 5432 failed: fe_sendauth: no password supplied
newtest=#

oldtest is 11.15 (Debian 11.15-1.pgdg90+1)
newtest is 14.5 (Debian 14.5-1.pgdg110+1)

.pgpass is under /root (home inside the docker container) with 0600 permissions and owned by user "postgres". Even providing passfile=/root/.pgpass in the connstring does not work.

.pgpass contains:
*:*:*:pg_replication:password

Any thoughts on how to debug this are much appreciated.

Best regards,
--
Kristjan Mustkivi
Email: kristjan.mustk...@gmail.com
Re: CREATE SUBSCRIPTION not picking up .pgpass while psql does
On Wed, Aug 31, 2022 at 4:07 PM hubert depesz lubaczewski wrote:
> How/where do you provide it?
>
> Why would you assume that postgres (running as user postgres,
> presumably) would look for pgpass in /root/.pgpass?
>
> postgres should have it in ~postgres/.pgpass
> with proper ownership.

Hi, depesz,

I use the dockerized version of Postgres (in both the old and the new case). The docker-compose.yml file maps the .pgpass file as a volume (- ./v_pgpass:/root/.pgpass). Inside the container the home is /root:

sudo docker exec -ti db bash
root@pg.newtest:/# cd ~
root@pg.newtest:~# pwd
/root
root@pg.newtest:~# ls -la .pgpass
-rw------- 1 postgres root 74 Aug 30 11:38 .pgpass

And as said, the psql utility has no problems finding the .pgpass where it is. If I lie to it about the pgpass location, i.e. by giving passfile=/root/.pgpassx, it will ask for a password. Why CREATE SUBSCRIPTION does not pick up the .pgpass like psql does - that is what I cannot figure out.

Cheers!
--
Kristjan Mustkivi
Email: kristjan.mustk...@gmail.com
Re: CREATE SUBSCRIPTION not picking up .pgpass while psql does
On Wed, Aug 31, 2022 at 4:27 PM hubert depesz lubaczewski wrote:
>
> On Wed, Aug 31, 2022 at 04:26:22PM +0300, Kristjan Mustkivi wrote:
> > And as said, the psql utility has no problems finding the .pgpass where
> > it is. If I lie to it about the pgpass location, i.e. by giving
> > passfile=/root/.pgpassx, it will ask for a password.
>
> Of course it doesn't have a problem, because you run it as root.
> But the subscription is run by a pg backend, which runs as the postgres
> user, not root!
>
> Also - no pg tool cares about a "passfile" env variable, so it is just
> ignored.

Thank you, depesz!

After I mapped the volume as (- ./v_pgpass:/var/lib/postgresql/.pgpass), CREATE SUBSCRIPTION was able to find it. Somehow my initial POC fooled me, and then I kept barking up the wrong tree.

Best regards,
--
Kristjan Mustkivi
Email: kristjan.mustk...@gmail.com
Index corruption revealed after upgrade to 11.17, could date back to at least 11.12
Dear community,

Right after upgrading our postgres servers from 11.15 to 11.17, we started to encounter problems with data. Namely, when a query hit the index, it returned a single row; when the query hit the relation directly, it returned more than one row. An attempt to REINDEX revealed that the underlying data had duplicates (reindexing the unique index failed).

Version facts:
we started out with 11.12
jan 2022 upgraded to 11.14
mar 2022 to 11.15
oct 2022 to 11.17

We are not sure when this corruption actually happened. Could it be related to the indexing bugs reported in https://www.postgresql.org/docs/release/11.14/? The condition only became known to us after the 11.17 rollout, which can perhaps be explained by the following: while 11.17 does not have any outstanding index-related fixes, https://www.postgresql.org/docs/release/11.15/ mentions a fix for index-only scans and so does https://www.postgresql.org/docs/release/11.16/.

The bottom line is that we would like to understand whether the index corruption and its manifestation are explained by the above release fixes, or whether there is something else that should be investigated further here with the help of the community.

With best regards,
--
Kristjan Mustkivi
Email: kristjan.mustk...@gmail.com
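One verification step not mentioned in the thread, offered only as a hedged sketch: the contrib extension amcheck can compare a B-tree index against its table (bt_index_check with the heapallindexed argument is available from PostgreSQL 11 on). The index name my_unique_idx below is a placeholder:

    CREATE EXTENSION IF NOT EXISTS amcheck;

    -- verifies B-tree invariants; with the second argument set to true it also
    -- checks that every heap tuple has a corresponding index entry
    SELECT bt_index_check('my_unique_idx'::regclass, true);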
Re: Index corruption revealed after upgrade to 11.17, could date back to at least 11.12
Hi Allan,

We use dockerized postgres, so the upgrade is just a matter of replacing the old minor-version pg docker image with the new minor-version image, and that's it. Of course, I check the release notes to see if there is anything to pay particular attention to.

I do apologize, but I do not understand the value of doing that select juggling. I have been searching for and fixing the problematic rows with ctid (and xmin, xmax to help establish which ones to remove), and this has been effective in showing the discrepancy between the actual data in the table and the corruption in, e.g., the primary key index. Also, playing around with enable_indexscan, enable_bitmapscan and enable_seqscan demonstrates the same problem with an index. After deleting the offending rows by ctid, REINDEX-ing is possible. (As these are production systems, some of the relations had to be fixed right away.)

This case most likely has to do with some bug, and the release notes for 11.14, .15 and .16 seem to explain the current situation. What would be good to know is whether this particular bug has been fixed, or whether it is one yet to be uncovered.

Best regards,
Kristjan

On Thu, Oct 27, 2022 at 10:41 AM Allan Kamau wrote:
>
> On Thu, Oct 27, 2022 at 10:26 AM Allan Kamau wrote:
>>
>> On Thu, Oct 27, 2022 at 10:20 AM Kristjan Mustkivi wrote:
>>>
>>> Dear community,
>>>
>>> Right after upgrading our postgres servers from 11.15 to 11.17, we
>>> started to encounter problems with data. Namely, when a query hit
>>> the index, it returned a single row; when the query hit the relation
>>> directly, it returned more than one row. An attempt to REINDEX revealed
>>> that the underlying data had duplicates (reindexing the unique index
>>> failed).
>>>
>>> Version facts:
>>> we started out with 11.12
>>> jan 2022 upgraded to 11.14
>>> mar 2022 to 11.15
>>> oct 2022 to 11.17
>>>
>>> We are not sure when this corruption actually happened. Could it be
>>> related to the indexing bugs reported in
>>> https://www.postgresql.org/docs/release/11.14/? The condition only
>>> became known to us after the 11.17 rollout, which can perhaps be
>>> explained by the following: while 11.17 does not have any outstanding
>>> index-related fixes, https://www.postgresql.org/docs/release/11.15/
>>> mentions a fix for index-only scans and so does
>>> https://www.postgresql.org/docs/release/11.16/.
>>>
>>> The bottom line is that we would like to understand whether the index
>>> corruption and its manifestation are explained by the above release
>>> fixes, or whether there is something else that should be investigated
>>> further here with the help of the community.
>>>
>>> With best regards,
>>> --
>>> Kristjan Mustkivi
>>> Email: kristjan.mustk...@gmail.com
>>
>> Hi Kristjan,
>> What if you construct a select statement containing the row id and the
>> column which has the problematic index into a new table? Then perform
>> queries on this table to test for uniqueness of the column on which the
>> problematic index was reported.
>>
>> Allan.
>
> How was the data "transfer" between upgrades done? Was it by dump and restore?
> If you have the 11.15 instance running with the data, you may select the
> row id and the specific column the index is based on into a new table and
> perform queries on it to determine uniqueness of the values therein.
> Likewise do the same for the 11.17 version.
>
> Is it possible to build and install PG 15 from source in a different
> directory (using --prefix), then perform pg_dump using the binaries of this
> installation into a directory? Then configure the PG 15 installation to
> listen on a different TCP/IP port from the one you are currently using with
> the 11.17 instance. Once started, test to see if the index anomaly is
> present in the PG 15 instance. Alternatively, you may use the PG 15 docker
> image and docker to start a PG 15 container for your tests instead of
> having to build and install PG 15 for this test.
>
> -Allan

--
Kristjan Mustkivi
Email: kristjan.mustk...@gmail.com
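For anyone following the same ctid-based cleanup, a minimal sketch of the approach described above, assuming a hypothetical table t with a unique column id (the real table and column names are not shown in the thread); the planner settings are disabled so the broken index is not consulted:

    BEGIN;
    SET LOCAL enable_indexscan = off;
    SET LOCAL enable_bitmapscan = off;

    -- list the physical duplicates hiding behind the unique constraint,
    -- with xmin/xmax to help decide which row version to keep
    SELECT ctid, xmin, xmax, id
      FROM t
     WHERE id IN (SELECT id FROM t GROUP BY id HAVING count(*) > 1)
     ORDER BY id;

    -- after choosing the victim row version ('(123,4)' is a placeholder ctid):
    -- DELETE FROM t WHERE ctid = '(123,4)';
    COMMIT;

    -- once the duplicates are gone, the unique index can be rebuilt
    -- (t_pkey is a placeholder index name):
    -- REINDEX INDEX t_pkey;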
Re: Index corruption revealed after upgrade to 11.17, could date back to at least 11.12
On Thu, Oct 27, 2022 at 12:18 PM Peter J. Holzer wrote:
>
> On 2022-10-27 10:55:31 +0300, Kristjan Mustkivi wrote:
> > We use dockerized postgres.
>
> So that means you aren't just replacing PostgreSQL, but your complete OS
> (except the kernel). What is the source of your docker images? Do they
> all use the same base OS distribution? Are the locale definitions the
> same?
>
> (Just trying to rule out other possible error sources.)

Hello!

Up until 11.17, the source of the docker images was the tag "postgres:11" (from https://hub.docker.com/_/postgres); for 11.17 the tag became "postgres:11-bullseye", but as far as I could tell that was just a difference in tagging policy. Everything else is kept the same when building our custom docker image (with pg_cron, wal2json and oracle_fdw). But I can see, for example, that the PG 11.12 docker image used Debian 9.13, while PG 11.17 uses Debian 11.5 (according to /etc/debian_version). So the official docker images also upgrade the OS (which is expected, I suppose).

Best regards,
--
Kristjan Mustkivi
Email: kristjan.mustk...@gmail.com
Re: Index corruption revealed after upgrade to 11.17, could date back to at least 11.12
On Fri, Oct 28, 2022 at 2:41 AM Peter J. Holzer wrote:
>
> > Up until 11.17, the source of the docker images was the tag "postgres:11"
> > (from https://hub.docker.com/_/postgres); for 11.17 the tag became
> > "postgres:11-bullseye", but as far as I could tell that was just a
> > difference in tagging policy. Everything else is kept the same
> > when building our custom docker image (with pg_cron, wal2json and
> > oracle_fdw). But I can see, for example, that the PG 11.12 docker
> > image used Debian 9.13, while PG 11.17 uses Debian 11.5 (according to
> > /etc/debian_version).
>
> OK, so that's an upgrade from Debian 9 to Debian 11. That's definitely
> not just a "difference in tagging policy"; that's two major versions of
> the OS!
>
> I don't remember exactly when the big incompatible libc upgrade was, but
> it was very likely somewhere between Debian 9 and Debian 11, so you have
> to rebuild all your indexes. Since you didn't do that immediately after
> the upgrade, you now have data corruption which you have to fix manually.

Well, I am going to remember this lesson for the rest of my life. After using this same scheme for upgrading the pg-s for over 3 years, this really caught me by surprise - apparently I got comfortable and did not pay close attention to this particular change in the docker hub images.

We have reverted back to our previous version, as a query-based comparison showed the data to be more intact with it. By the way, an index rebuild, while completing successfully, did not fix the indexes - the data in the tables was still missing even after the successful rebuild command. I guess the only option left is to drop and re-create these one by one.

Thank you all for the feedback and help!

With best regards,
--
Kristjan Mustkivi
Email: kristjan.mustk...@gmail.com
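As a hedged, after-the-fact sketch (not something discussed in the thread): a query along these lines can list the indexes that involve a collatable column, i.e. the ones a glibc/locale change can silently break. Indexes using the "C" collation also show up here even though they are not affected:

    -- indexes with at least one collation-dependent column, and their tables
    SELECT DISTINCT i.indexrelid::regclass AS index_name,
                    i.indrelid::regclass   AS table_name
      FROM pg_index i
      JOIN pg_attribute a ON a.attrelid = i.indexrelid
     WHERE a.attcollation <> 0
     ORDER BY table_name, index_name;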
Logical Streaming Replication stuck at specific LSN
Hello,

I am doing a major version upgrade from Postgres standard version 11.15 (Debian 11.15-1.pgdg90+1) to Cybertec PGEE version 14.8_EE_1.1.5 (Ubuntu 14.8ee1.1.5-1.cybertec22.04+1). While I am waiting for the support case to be processed, I was hoping to get advice from the community about how to check the situation on my own in the meantime.

The initial copy of the tables completed nicely, and from Aug 18, 6 pm until Aug 19, 6 am the streaming flowed without issues. Then suddenly the WAL sent/write/flush/replay positions got stuck:

 application_name │   state   │    sent_lsn    │   write_lsn    │   flush_lsn    │   replay_lsn
──────────────────┼───────────┼────────────────┼────────────────┼────────────────┼────────────────
 sub_whupgrade    │ streaming │ 3F5B3/7FBDDCD0 │ 3F5B3/7FBD49E0 │ 3F5B3/7FBD49E0 │ 3F5B3/7FBD49E0

I restarted the subscriber, but the only thing that happened was that the LSN got stuck at exactly the same place, now with status "catchup". I upgraded PGEE to version 14.9 EE 1.1.6 (Ubuntu 14.9ee1.1.6-1.cybertec22.04+1), but that also did not change the situation.

With best regards,
--
Kristjan Mustkivi
Email: kristjan.mustk...@gmail.com
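In the meantime, a few standard catalog views can narrow down where things are stuck; a minimal sketch, using the view and column names of stock PostgreSQL 11 and 14:

    -- on the publisher: what the walsender has sent vs. what the subscriber has confirmed
    SELECT application_name, state, sent_lsn, write_lsn, flush_lsn, replay_lsn,
           pg_current_wal_lsn() AS publisher_current_lsn
      FROM pg_stat_replication;

    -- on the publisher: is the slot still active, and where is it holding WAL
    SELECT slot_name, active, restart_lsn, confirmed_flush_lsn
      FROM pg_replication_slots;

    -- on the subscriber: the apply worker's last received/applied positions
    SELECT subname, received_lsn, latest_end_lsn, last_msg_receipt_time
      FROM pg_stat_subscription;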
How to rename in-use logical replication publication?
Hello!

Postgres v11.12. I am getting "ERROR: publication "new_rep_pub" does not exist" after renaming an existing publication, and the only way to get things working again seems to be to start from scratch. What am I missing?

On the PUB side:

CREATE TABLE rep_test (int_col int PRIMARY KEY, ts_col timestamp);
INSERT INTO rep_test (int_col, ts_col) SELECT generate_series(1,10), current_timestamp;
CREATE PUBLICATION rep_pub FOR TABLE rep_test;

On the SUB side:

CREATE TABLE rep_test (int_col int PRIMARY KEY, ts_col timestamp);
CREATE SUBSCRIPTION rep_sub CONNECTION 'host=localhost port=54311 dbname=postgres' PUBLICATION rep_pub;
ALTER SUBSCRIPTION rep_sub DISABLE;
ALTER SUBSCRIPTION rep_sub SET PUBLICATION new_rep_pub WITH ( refresh = false );

On the PUB side:

ALTER PUBLICATION rep_pub RENAME TO new_rep_pub;

On the SUB side:

ALTER SUBSCRIPTION rep_sub ENABLE;

The PUB complains:

2021-10-19 19:14:53.182 GMT [35] LOG:  statement: SELECT pg_catalog.set_config('search_path', '', false);
2021-10-19 19:14:53.185 GMT [35] LOG:  starting logical decoding for slot "rep_sub"
2021-10-19 19:14:53.185 GMT [35] DETAIL:  Streaming transactions committing after 0/1678A10, reading WAL from 0/1678A10.
2021-10-19 19:14:53.186 GMT [35] LOG:  logical decoding found consistent point at 0/1678A10
2021-10-19 19:14:53.186 GMT [35] DETAIL:  There are no running transactions.
2021-10-19 19:14:53.186 GMT [35] ERROR:  publication "new_rep_pub" does not exist
2021-10-19 19:14:53.186 GMT [35] CONTEXT:  slot "rep_sub", output plugin "pgoutput", in the change callback, associated LSN 0/1678A48

The SUB complains:

2021-10-19 19:16:10.340 GMT [33] LOG:  logical replication apply worker for subscription "rep_sub" has started
2021-10-19 19:16:10.355 GMT [33] ERROR:  could not receive data from WAL stream: ERROR:  publication "new_rep_pub" does not exist
        CONTEXT:  slot "rep_sub", output plugin "pgoutput", in the change callback, associated LSN 0/1678A48
2021-10-19 19:16:10.361 GMT [1] LOG:  background worker "logical replication worker" (PID 33) exited with exit code 1
2021-10-19 19:16:15.373 GMT [34] LOG:  logical replication apply worker for subscription "rep_sub" has started
2021-10-19 19:16:15.378 GMT [34] ERROR:  could not receive data from WAL stream: ERROR:  publication "new_rep_pub" does not exist
        CONTEXT:  slot "rep_sub", output plugin "pgoutput", in the change callback, associated LSN 0/1678A48
2021-10-19 19:16:15.380 GMT [1] LOG:  background worker "logical replication worker" (PID 34) exited with exit code 1

Yet it all looks like it should be OK:

postgres@postgres=# \dRp
                              List of publications
    Name     │  Owner   │ All tables │ Inserts │ Updates │ Deletes │ Truncates
─────────────┼──────────┼────────────┼─────────┼─────────┼─────────┼───────────
 new_rep_pub │ postgres │ f          │ t       │ t       │ t       │ t
(1 row)

postgres@postgres=# \dRs+
                                       List of subscriptions
  Name   │  Owner   │ Enabled │  Publication  │ Synchronous commit │                 Conninfo
─────────┼──────────┼─────────┼───────────────┼────────────────────┼───────────────────────────────────────────
 rep_sub │ postgres │ f       │ {new_rep_pub} │ off                │ host=localhost port=54311 dbname=postgres
(1 row)

Best regards,
--
Kristjan Mustkivi
Email: kristjan.mustk...@gmail.com
Re: How to rename in-use logical replication publication?
On Tue, Oct 19, 2021 at 10:31 PM Kristjan Mustkivi wrote:
> Postgres v11.12. I am getting "ERROR: publication "new_rep_pub" does not
> exist" after renaming an existing publication, and the only way to get
> things working again seems to be to start from scratch. What am I missing?

OK, the recipe works perfectly in v13.4, which leads me to believe it is a bug in v11.12.

Best regards,
Kristjan Mustkivi