PostgreSQL statistics
Hi,

I have a table in a PostgreSQL 11 server that isn't being used a lot. If I
run a manual ANALYZE on it, the last_analyze statistic isn't being updated
(I've waited long enough for it to be updated):

# SELECT * FROM pg_catalog.pg_stat_user_tables WHERE schemaname='s1' AND relname='t1';
-[ RECORD 1 ]-------+------------------------------
relid               | 237177743
schemaname          | s1
relname             | t1
seq_scan            | 0
seq_tup_read        | 0
idx_scan            | 0
idx_tup_fetch       | 0
n_tup_ins           | 0
n_tup_upd           | 0
n_tup_del           | 0
n_tup_hot_upd       | 0
n_live_tup          | 4820
n_dead_tup          | 0
n_mod_since_analyze | 0
last_vacuum         |
last_autovacuum     | 2023-11-03 13:34:23.725802+00
last_analyze        |
last_autoanalyze    |
vacuum_count        | 0
autovacuum_count    | 1
analyze_count       | 0
autoanalyze_count   | 0

But if I run a "SELECT * FROM s1.t1" I see seq_scan increase to 1 (as
expected), and after that I can issue a manual ANALYZE and last_analyze
gets updated.

Is this behaviour normal? Are there any parameters that are checked that
might prevent the ANALYZE from running?

Regards
--
Paulo Silva
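A quick way to rule out the ANALYZE silently doing nothing is to run it with
VERBOSE (a sketch against the s1.t1 table above; the INFO lines shown are
illustrative, and the exact wording varies by version):

    ANALYZE VERBOSE s1.t1;
    -- psql prints a progress report for each table processed, e.g.:
    -- INFO:  analyzing "s1.t1"
    -- INFO:  "t1": scanned ... of ... pages, containing ... live rows ...

If those INFO lines appear, the ANALYZE itself ran, and the question becomes
why the statistics view doesn't reflect it.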
Re: PostgreSQL statistics
On 11/17/23 02:25, Paulo Silva wrote:
> Hi,
>
> I have a table in a PostgreSQL 11 server that isn't being used a lot. If I
> run a manual ANALYZE on it, the last_analyze statistic isn't being updated
> (I've waited long enough for it to be updated):
>
> # SELECT * FROM pg_catalog.pg_stat_user_tables WHERE schemaname='s1' AND relname='t1';
> [pg_stat_user_tables output from above: last_analyze and last_autoanalyze empty]
>
> But if I run a "SELECT * FROM s1.t1" I see seq_scan increase to 1 (as
> expected), and after that I can issue a manual ANALYZE and last_analyze
> gets updated.
>
> Is this behaviour normal? Are there any parameters that are checked that
> might prevent the ANALYZE from running?

The oldest Postgres instance I have access to is version 12, and I see the
update to pg_stat_user_tables when I run ANALYZE in psql in autocommit.

Are you maybe seeing the effects of this?

https://www.postgresql.org/docs/11/monitoring-stats.html

"Another important point is that when a server process is asked to display
any of these statistics, it first fetches the most recent report emitted by
the collector process and then continues to use this snapshot for all
statistical views and functions until the end of its current transaction.
So the statistics will show static information as long as you continue the
current transaction. Similarly, information about the current queries of
all sessions is collected when any such information is first requested
within a transaction, and the same information will be displayed throughout
the transaction."

As in:

begin;

select relname, last_analyze from pg_stat_user_tables where relname = 'cell_per';
-[ RECORD 1 ]+------------------------------
relname      | cell_per
last_analyze | 2023-11-17 08:11:18.614987-08

analyze cell_per;
ANALYZE

select relname, last_analyze from pg_stat_user_tables where relname = 'cell_per';
-[ RECORD 1 ]+------------------------------
relname      | cell_per
last_analyze | 2023-11-17 08:11:18.614987-08

commit;

select relname, last_analyze from pg_stat_user_tables where relname = 'cell_per';
-[ RECORD 1 ]+------------------------------
relname      | cell_per
last_analyze | 2023-11-17 08:13:02.969537-08

> Regards
> --
> Paulo Silva

--
Adrian Klaver
adrian.kla...@aklaver.com
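A side note on that snapshot behavior: if fresh numbers are needed without
closing the transaction, the cached snapshot can be discarded explicitly
with pg_stat_clear_snapshot(), which is documented in the same monitoring
chapter. A sketch continuing Adrian's example (cell_per is his table):

    begin;
    select relname, last_analyze from pg_stat_user_tables where relname = 'cell_per';
    analyze cell_per;
    -- Same transaction, so the view still shows the snapshot taken above.
    select relname, last_analyze from pg_stat_user_tables where relname = 'cell_per';
    -- Discard the snapshot; the next stats access fetches fresh data.
    select pg_stat_clear_snapshot();
    select relname, last_analyze from pg_stat_user_tables where relname = 'cell_per';
    commit;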
Unused CTE affects result set
Dear Postgres community,

I'm seeing unintuitive behavior with a query structured as follows:

WITH used_cte AS (SELECT a, b, c FROM t1 ...)
   , unused_cte AS (SELECT d, e, f FROM t2 ...)
SELECT * FROM used_cte ...  -- returns no results

WITH used_cte AS (SELECT a, b, c FROM t1 ...)
-- , unused_cte AS (SELECT d, e, f FROM t2 ...)
SELECT * FROM used_cte ...  -- returns results

I don't understand why the presence of a CTE that ends up not being
included in the final query affects the result set. Could this be a bug,
or is there something I don't understand?

PostgreSQL 14.9 (Ubuntu 14.9-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu,
compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit

Thanks,
Gabriel
Re: Unused CTE affects result set
On 11/17/23 10:30 AM, Gabriel Sánchez wrote:
> Dear Postgres community,
>
> I'm seeing unintuitive behavior with a query structured as follows:
>
> WITH used_cte AS (SELECT a, b, c FROM t1 ...)
>    , unused_cte AS (SELECT d, e, f FROM t2 ...)
> SELECT * FROM used_cte ...  -- returns no results
>
> WITH used_cte AS (SELECT a, b, c FROM t1 ...)
> -- , unused_cte AS (SELECT d, e, f FROM t2 ...)
> SELECT * FROM used_cte ...  -- returns results
>
> I don't understand why the presence of a CTE that ends up not being
> included in the final query affects the result set. Could this be a bug,
> or is there something I don't understand?

I would say it has to do with what the '...' is doing. In other words, I
don't see this behavior here for a simple case.

> PostgreSQL 14.9 (Ubuntu 14.9-0ubuntu0.22.04.1) on x86_64-pc-linux-gnu,
> compiled by gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, 64-bit
>
> Thanks,
> Gabriel
Re: Unused CTE affects result set
Adrian Klaver writes:
> On 11/17/23 10:30 AM, Gabriel Sánchez wrote:
>> I don't understand why the presence of a CTE that ends up not being
>> included in the final query affects the result set. Could this be a
>> bug, or is there something I don't understand?

> I would say it has to do with what the '...' is doing?

We'd need to see a complete example to do more than speculate. The planner
definitely does skip unreferenced SELECT CTEs, though.

regards, tom lane
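For what it's worth, a self-contained way to see the planner skipping an
unreferenced SELECT CTE (a sketch; the CTE name and the literal 42 are
arbitrary, and the behavior applies only to side-effect-free SELECT CTEs):

    -- The unreferenced CTE is never planned or executed, so even a
    -- division by zero inside it raises no error:
    WITH unused AS (SELECT 1/0 AS boom)
    SELECT 42;

    -- EXPLAIN shows no CTE scan or subplan for "unused":
    EXPLAIN WITH unused AS (SELECT 1/0 AS boom)
    SELECT 42;

So an unused CTE by itself should not change the result set; whatever
differs between Gabriel's two queries presumably lies in the elided '...'
parts.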
Re: Trigger to Count Number of Logical Replication Table Changes.
Hi

One thing is certain: if the record does not exist, the trigger will not
update anything in "tst.time_audit_tbl" for the condition "table_name =
CONCAT(TG_TABLE_SCHEMA, '.', TG_TABLE_NAME)".

Maybe check before the UPDATE and, if the record does not exist, add it:

    IF NOT EXISTS (SELECT 1 FROM tst.time_audit_tbl
                   WHERE table_name = CONCAT(TG_TABLE_SCHEMA, '.', TG_TABLE_NAME)) THEN
        INSERT INTO tst.time_audit_tbl (table_name)
            VALUES (CONCAT(TG_TABLE_SCHEMA, '.', TG_TABLE_NAME));
    END IF;

Regards,
JRBM

On Thu, 16 Nov 2023 at 14:19, Avi Weinberg () wrote:

> I'm using Postgres (13 and 15) logical replication to sync data from two
> servers. I would like to have an update counter whenever data is changed.
> The counter can be incremented by 1 even if multiple rows are updated, but
> it is also OK to increment the counter by the number of rows updated
> (though that seems less efficient to me).
>
> I need the counter to increase after the initial sync as well as after
> regular logical replication syncs.
>
> Triggers do not work without ENABLE ALWAYS.
>
> In addition, if I use a trigger that is "FOR EACH STATEMENT", it works
> only for the initial sync and not for regular logical replication syncs.
>
> Having the per-row set_time_trig trigger takes about 1 minute when
> updating 50k rows in one transaction (all I need is to increase
> update_count by 1; why spend 1 minute on it). How can I improve this?
>
> CREATE TABLE IF NOT EXISTS tst.t2
> (
>     id bigint NOT NULL,
>     c1 int,
>     CONSTRAINT pk_t2 PRIMARY KEY (id)
> );
>
> CREATE TABLE IF NOT EXISTS tst.time_audit_tbl
> (
>     table_name character varying(63) COLLATE pg_catalog."default" NOT NULL,
>     update_count integer DEFAULT 0,
>     CONSTRAINT updated_time_audit_unique UNIQUE (table_name)
> );
>
> CREATE FUNCTION tst.set_time() RETURNS trigger
>     LANGUAGE plpgsql SECURITY DEFINER
> AS $$
> DECLARE
>     updated_count int;
> BEGIN
>     UPDATE tst.time_audit_tbl SET update_count = update_count + 1
>         WHERE table_name = CONCAT(TG_TABLE_SCHEMA, '.', TG_TABLE_NAME);
>     GET DIAGNOSTICS updated_count = ROW_COUNT;
>     IF updated_count = 0 THEN
>         RAISE EXCEPTION 'set_updated_time(). Table not found %.%',
>             TG_TABLE_SCHEMA, TG_TABLE_NAME;
>     END IF;
>
>     RETURN coalesce(NEW, OLD);
> END;
> $$;
>
> CREATE TRIGGER set_time_trig
>     AFTER INSERT OR DELETE OR UPDATE
>     ON tst.t2
>     FOR EACH ROW
>     EXECUTE FUNCTION tst.set_time();
>
> ALTER TABLE tst.t2 ENABLE ALWAYS TRIGGER set_time_trig;
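A variant of the same idea that avoids the separate existence check: since
time_audit_tbl has a UNIQUE constraint on table_name, the trigger body
could use a single upsert instead (a sketch based on the definitions quoted
above; not tested against logical replication):

    INSERT INTO tst.time_audit_tbl AS t (table_name, update_count)
        VALUES (CONCAT(TG_TABLE_SCHEMA, '.', TG_TABLE_NAME), 1)
    ON CONFLICT (table_name)
        DO UPDATE SET update_count = t.update_count + 1;

This inserts the row on first use and increments it afterwards in one
statement, and it cannot race with a concurrent insert the way a
check-then-insert can.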
query plan
Hi,

This is part of a query plan:

Nested Loop Left Join  (cost=26.32..47078866.36 rows=1344945195 width=626)
  ->  Nested Loop Left Join  (cost=25.74..5312.48 rows=1344945195 width=608)
        ->  Nested Loop Left Join  (cost=6.79..2876.77 rows=102 width=373)
              ->  Nested Loop Left Join  (cost=1.90..1965.51 rows=102 width=361)
                    ->  Bitmap Heap Scan on ...  (cost=4.89..8.91 rows=2 width=28)
        ->  Hash Left Join  (cost=18.95..42.61 rows=3 width=243)
              ->  Hash Left Join  (cost=18.94..42.59 rows=3 width=203)
              ->  Hash  (cost=0.00..0.00 rows=1 width=48)
  ->  Memoize  (cost=0.58..4.59 rows=1 width=172)

What I don't understand is this: the left node of the join is expected to
return 102 rows, the right node 3. How can this result in >1e9 rows?

Further down, the query involved a partitioned table with 2 partitions, one
pretty big, in the 1e9-row range, the other practically empty. The big
partition had been analyzed before, but the partitioned table itself and
the empty partition never had been. After analyzing them, all was well.

I am just curious to understand how that number is calculated. This is PG14.

Thanks.
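For reference, a quick way to spot relations in a partition tree that have
never been analyzed (a sketch; the LIKE pattern is a placeholder for the
partitioned table's name):

    SELECT schemaname, relname, last_analyze, last_autoanalyze
    FROM pg_stat_user_tables
    WHERE relname LIKE 'my_partitioned_table%'   -- placeholder pattern
    ORDER BY relname;

Note that autovacuum does not process partitioned tables themselves, so a
periodic manual ANALYZE of the parent is needed to keep the inheritance
statistics that the planner uses for the whole partition tree.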
Re: query plan
Torsten Förtsch writes:
> This is part of a query plan:
>
> Nested Loop Left Join  (cost=26.32..47078866.36 rows=1344945195 width=626)
>   ->  Nested Loop Left Join  (cost=25.74..5312.48 rows=1344945195 width=608)
>         ->  Nested Loop Left Join  (cost=6.79..2876.77 rows=102 width=373)
>               ->  Nested Loop Left Join  (cost=1.90..1965.51 rows=102 width=361)
>                     ->  Bitmap Heap Scan on ...  (cost=4.89..8.91 rows=2 width=28)
>         ->  Hash Left Join  (cost=18.95..42.61 rows=3 width=243)
>               ->  Hash Left Join  (cost=18.94..42.59 rows=3 width=203)
>               ->  Hash  (cost=0.00..0.00 rows=1 width=48)
>   ->  Memoize  (cost=0.58..4.59 rows=1 width=172)
>
> What I don't understand is this: the left node of the join is expected to
> return 102 rows, the right node 3. How can this result in >1e9 rows?

The rowcount estimate for the join is not derived by multiplying the
rowcount estimates of the two inputs. Having said that, this looks pretty
inconsistent. Can you show a test case that acts like that?

regards, tom lane
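To illustrate the point that the join estimate is not simply the product of
the input estimates, here is a small sketch (made-up tables; the exact
numbers depend on the statistics gathered):

    CREATE TABLE a AS SELECT g AS id FROM generate_series(1, 100) g;
    CREATE TABLE b AS SELECT g % 10 AS id FROM generate_series(1, 1000) g;
    ANALYZE a;
    ANALYZE b;
    EXPLAIN SELECT * FROM a JOIN b USING (id);
    -- The inputs are estimated at 100 and 1000 rows, but the join comes
    -- out at roughly 1000 rows, not 100 * 1000 = 100000: the planner
    -- multiplies the input sizes by an estimated join selectivity derived
    -- from the column statistics (here roughly 1/100, driven by the
    -- number of distinct values of a.id).

Stale or missing statistics on one side can make that selectivity estimate
wildly wrong, which fits the partitioned table never having been analyzed.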