Re: PG17 optimizations to vacuum

2024-09-02 Thread Heikki Linnakangas

On 03/09/2024 00:11, Heikki Linnakangas wrote:

On 03/09/2024 00:01, Peter Geoghegan wrote:
On Mon, Sep 2, 2024 at 4:58 PM Heikki Linnakangas  
wrote:

Do you have any non-default settings? "select name,
current_setting(name), source  from pg_settings where setting <>
boot_val;" would show that.


What about page checksums?

One simple explanation is that we're writing extra FPIs to set hint
bits. But that explanation only works if you assume that page-level
checksums are in use (or that wal_log_hints is turned on).


Hmm, yeah, that might be it. With chceksums enabled, I see ~120k WAL 
records, vs ~90k without checksums. But there's no difference between 
v16 and master.


Looking at the pg_waldump output from this test:

... > rmgr: XLOGlen (rec/tot): 49/  8209, tx:  0, lsn: 
0/FE052AA8, prev 0/FE0528A8, desc: FPI_FOR_HINT , blkref #0: rel 
1663/5/16396 blk 73 FPW

rmgr: Heap2   len (rec/tot):507/   507, tx:  0, lsn: 
0/FE054AD8, prev 0/FE052AA8, desc: PRUNE snapshotConflictHorizon: 754, 
nredirected: 0, ndead: 226, nunused: 0, redirected: [], dead: [1, 2, 3, 4, 5, 
6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 
27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 
47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 
67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 
87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 
105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 
121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 
137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 
153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 
169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 
185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 
201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 
217, 218, 219, 220, 221, 222, 223, 224, 225, 226], unused: [], blkref #0: rel 
1663/5/16396 blk 73
rmgr: XLOGlen (rec/tot): 49/  8209, tx:  0, lsn: 
0/FE054CD8, prev 0/FE054AD8, desc: FPI_FOR_HINT , blkref #0: rel 1663/5/16396 
blk 74 FPW
rmgr: Heap2   len (rec/tot):507/   507, tx:  0, lsn: 
0/FE056D08, prev 0/FE054CD8, desc: PRUNE snapshotConflictHorizon: 754, 
nredirected: 0, ndead: 226, nunused: 0, redirected: [], dead: [1, 2, 3, 4, 5, 
6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 
27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 
47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 
67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 
87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 
105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 
121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 
137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 
153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 
169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 
185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 
201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 
217, 218, 219, 220, 221, 222, 223, 224, 225, 226], unused: [], blkref #0: rel 
1663/5/16396 blk 74
...


The pattern of WAL records with checksums enabled is silly: For each 
page, we first write an FPI record, an immediately after that a PRUNE 
record that removes all the tuples on it, leaving the page empty.


This is the same with v16 and v17, but we certainly left money on the 
table by not folding that FPI into the VACUUM/PRUNE record.


--
Heikki Linnakangas
Neon (https://neon.tech)



Re: PG17 optimizations to vacuum

2024-09-02 Thread Heikki Linnakangas

On 02/09/2024 23:35, Pavel Luzanov wrote:

On 02.09.2024 22:23, Melanie Plageman wrote:

For some reason I stopped being able to reproduce Pavel's case.


I also cannot reproduce this.


I repeated the test on another computer, but compared master with v15.
The results are the same. The test can be simplified as follows:

CREATE TABLE t(id integer) WITH (autovacuum_enabled = off);
INSERT INTO t SELECT gen.id FROM generate_series(1,3_500_000) gen(id);
CREATE INDEX t_id ON t(id);
VACUUM FREEZE t;
UPDATE t SET id = id + 1;
VACUUM FREEZE VERBOSE t;

My results (only line with WAL info from the last VACUUM command).

master:
WAL usage: 119583 records, 37231 full page images, 272631468 bytes

v15:
WAL usage: 96565 records, 47647 full page images, 217144602 bytes


Can you dump the stats with pg_waldump please. Something like:

CREATE TABLE t(id integer) WITH (autovacuum_enabled = off);
INSERT INTO t SELECT gen.id FROM generate_series(1,3_500_000) gen(id);
CREATE INDEX t_id ON t(id);
SET maintenance_work_mem = '1MB';
UPDATE t SET id = id + 1;
select pg_current_wal_insert_lsn();   -- <<< PRINT WAL POS BEFORE VACUUM
VACUUM FREEZE VERBOSE t;

And then:

pg_waldump -p data/pg_wal/ -s 1/F4474498 --stats=record

where "1/F4474498" is the position printed by the "SELECT 
pg_current_wal_insert_lsn()" above.



Do you have any non-default settings? "select name, 
current_setting(name), source  from pg_settings where setting <> 
boot_val;" would show that.


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: PG17 optimizations to vacuum

2024-09-02 Thread Heikki Linnakangas

On 03/09/2024 00:01, Peter Geoghegan wrote:

On Mon, Sep 2, 2024 at 4:58 PM Heikki Linnakangas  wrote:

Do you have any non-default settings? "select name,
current_setting(name), source  from pg_settings where setting <>
boot_val;" would show that.


What about page checksums?

One simple explanation is that we're writing extra FPIs to set hint
bits. But that explanation only works if you assume that page-level
checksums are in use (or that wal_log_hints is turned on).


Hmm, yeah, that might be it. With chceksums enabled, I see ~120k WAL 
records, vs ~90k without checksums. But there's no difference between 
v16 and master.


Pavel, did you test v17 with checksums enabled and v16 with checksums 
disabled, by any chance?


--
Heikki Linnakangas
Neon (https://neon.tech)





Re: PG17 optimizations to vacuum

2024-09-03 Thread Heikki Linnakangas

On 03/09/2024 10:34, Pavel Luzanov wrote:

On 03.09.2024 00:11, Heikki Linnakangas wrote:
Pavel, did you test v17 with checksums enabled and v16 with checksums 
disabled, by any chance?


Exactly, You are right!

My v16 cluster comes from the default Ubuntu distribution.
I forgot that checksums disabled by default.
But when I initialize the master cluster, I automatically set -k option.

More accurate results for the test:

CREATE TABLE t(id integer) WITH (autovacuum_enabled = off);
INSERT INTO t SELECT gen.id FROM generate_series(1,3_500_000) gen(id);
VACUUM FREEZE t;
UPDATE t SET id = id + 1;
VACUUM FREEZE VERBOSE t;

checksums disabled
v16.4   WAL usage: 77437 records, 20832 full page images, 110654467 bytes
master  WAL usage: 61949 records, 20581 full page images, 92549229 bytes

checksums enabled
v16.4   WAL usage: 92848 records, 20985 full page images, 194863720 bytes
master  WAL usage: 76520 records, 20358 full page images, 181867154 bytes


That's more like it :-)


This a great optimization!

Peter, Melanie, Heikki, 
Thank you very much for your help and time spent!

Sorry for the noise before the release of PG17.


Thanks for the testing!

--
Heikki Linnakangas
Neon (https://neon.tech)