checkpoint occurs very often when vacuum full running

2018-11-15 Thread Mariel Cherkassky
Hi,
Can someone explain the logic behind it ? I know that vacuum full isnt
something recommended but I found out that whenever I run vacuum full on my
database checkpoint occurs during that time every second ! well I know that
VACUUM FULL duplicates the data into new data files and then it deletes the
old data files. The writing the vacuum does, is it with the checkpoint
process ?

Is there any connection ?


Thanks.


Re: checkpoint occurs very often when vacuum full running

2018-11-15 Thread Sergei Kornilov
Hi

Checkpoint can be occurs due timeout (checkpoint_timeout) or due amount of WAL 
(max_wal_size).
Vacuum full does write all data through WAL and therefore may trigger 
checkpoint more frequently.

regards, Sergei



Re: checkpoint occurs very often when vacuum full running

2018-11-15 Thread Mariel Cherkassky
First of all thank you for the quick answer. In my case checkpoint happened
every one second during the vacuum full so the checkpoint timeout isn't
relevant. My guess was that it writes the changes to the wals but I didn't
find anything about it in the documentation. Can you share a link that
proves it ? I mean basicly the wals should contain the changes, and vacuum
full changes the location of the data and not actually the data.

On Thu, Nov 15, 2018, 9:10 PM Sergei Kornilov  Hi
>
> Checkpoint can be occurs due timeout (checkpoint_timeout) or due amount of
> WAL (max_wal_size).
> Vacuum full does write all data through WAL and therefore may trigger
> checkpoint more frequently.
>
> regards, Sergei
>


Re: checkpoint occurs very often when vacuum full running

2018-11-15 Thread Sergei Kornilov
Hi

> I mean basicly the wals should contain the changes, and vacuum full changes 
> the location of the data and not actually the data.
Row location is data. For example, index lookup relies on TID (tuple id, hidden 
ctid column) - physical row address in datafile.
Postgresql WAL - it is about physical changes in datafiles (block level), not 
logical. Just moving one row to another place without logical changes means: 
mark row deleted in old place, write to new place and update every index which 
contains this row.
And vacuum full does not change location, it create copy in different datafile. 
Then it rebuild every index because TID was obviously changed. Then vacuum full 
drop old datafiles. Full size of new datafile and indexes should be written to 
WAL, because all of this is changes and must be reliable written (and then can 
be replayed on replicas).

> but I didn't find anything about it in the documentation
hmm, i can not found something exact in documentation about it.. It's my 
knowledge about postgresql internals.
You can read this article: 
https://www.depesz.com/2011/07/14/write-ahead-log-understanding-postgresql-conf-checkpoint_segments-checkpoint_timeout-checkpoint_warning/
 Its about WAL logic. All IO operations use pages, and difference between pages 
written to WAL.
For example, full_page_writes setting ( 
https://www.postgresql.org/docs/current/runtime-config-wal.html#GUC-FULL-PAGE-WRITES
 ) say about pages too.
> writes the entire content of each disk page to WAL during the first 
> modification of that page after a checkpoint.
If you want change few bytes in page - the whole page (8kb typical) will be 
written to WAL during first change of this page after checkpoint.

regards, Sergei



Re: checkpoint occurs very often when vacuum full running

2018-11-15 Thread Justin Pryzby
Hi,

Please don't cross post to multiple lists.

On Thu, Nov 15, 2018 at 08:53:14PM +0200, Mariel Cherkassky wrote:
> Can someone explain the logic behind it ? I know that vacuum full isnt
> something recommended but I found out that whenever I run vacuum full on my
> database checkpoint occurs during that time every second ! well I know that
> VACUUM FULL duplicates the data into new data files and then it deletes the
> old data files. The writing the vacuum does, is it with the checkpoint
> process ?

It's a good question.  What version postgres are you using, and what is the
setting of wal_level ?

On Thu, Nov 15, 2018 at 11:28:40PM +0300, Sergei Kornilov wrote:
> Row location is data. For example, index lookup relies on TID (tuple id, 
> hidden ctid column) - physical row address in datafile.

But, since VAC FULL has an exclusive lock, and since it's atomic (it's either
going to succeed and use the new table or interrupted or otherwise fail and
continue using the old table data), I it doesn't need to write to WAL, except
if needed for physical replication.  Same as CREATE TABLE AS and similar.  In
my test, setting wal_level=minimal seemed to avoid WAL writes from vac full.

https://www.postgresql.org/docs/current/populate.html#POPULATE-PITR

Justin