How to recover correctly master and replica using backup made by pg_basebackup?
Hello. I try to use pg_basebackup to make backup and recover master and replica from it. I recover master with action "promote". Next, I recover replica from the same backup with action "shutdown". After it, I start replica with configured connection and replication slot to the master. I get error, because replica reads and applies all WAL of irrelevant timeline, which contains logs after restore point. I wrote Perl script for pg_basebackup's TAP tests. It is attached as file "050_check_replication.pl". It works on PostgreSQL 17, commit 7d85d87f4d5c35fd5b2d38adaef63dfbfa542ccc after applying small patch, which adds possibility to start node with shutdown option. Patch is attached as file "add_node_start_with_shutdown.diff". Script generates situation when timeline changes several times. Small schema may explain it: Timeline 1: Create the first backup Create restore point 'rp_repl_bug' Timeline 1 -> 2: Recovery with promote using the first backup and 'rp_repl_bug' Create the second backup Create restore point 'rp_repl_bug2' Timeline 2 -> 3: Recovery with promote using the second backup and 'rp_repl_bug2' Timeline 1 -> 4: Recovery with promote using the first backup and 'rp_repl_bug' After it I get "0004.history" file, which looks strange: 1 0/390 at restore point "rp_repl_bug" 2 0/590 at restore point "rp_repl_bug2" 3 0/390 at restore point "rp_repl_bug" Next, I try to restore replica from the first backup. Replica did not find WAL for timeline 4 and downloaded and applied WAL for timeline 2. cp: cannot stat '${ARCHIVE_PATH}/00040003': No such file or directory LOG: redo starts at 0/228 cp: cannot stat '${ARCHIVE_PATH}/00040003': No such file or directory '${ARCHIVE_PATH}/00020003' -> '${REPLICA_PGDATA}/pg_wal/RECOVERYXLOG' LOG: restored log file "00020003" from archive LOG: completed backup recovery with redo LSN 0/228 and end LSN 0/2000120 LOG: consistent recovery state reached at 0/2000120 LOG: recovery stopping at restore point "rp_repl_bug", time 2024-11-05 22:00:16.037126+03 Than, at the next replica's start (with configured connection to the master and replication slot) replica can not start receiving WAL from master, because it applied all WAL from timeline 2 including all lines after LSN, where it should switch timeline: LOG: entering standby mode LOG: redo starts at 0/228 LOG: consistent recovery state reached at 0/390 LOG: database system is ready to accept read-only connections LOG: started streaming WAL from primary at 0/400 on timeline 4 FATAL: could not receive data from WAL stream: ERROR: requested starting point 0/400 is ahead of the WAL flush position of this server 0/3002000 If I restart replica, it will not start: LOG: database system was shut down in recovery at 2024-11-05 22:00:18 MSK LOG: entering standby mode FATAL: requested timeline 4 is not a child of this server's history DETAIL: Latest checkpoint is at 0/390 on timeline 2, but in the history of the requested timeline, the server forked off from that timeline at 0/590. LOG: startup process (PID 2384880) exited with exit code 1 Questions: 1) Why does mirror not switch to the next timeline when it get LSN in WAL, which there is in the history file? And why does mirror apply all WAL? 2) Why does 0004.history contain info about switching from 2 and 3 timelines even if the 4 timeline was switched from the first timeline? Thank you in advance. diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm index 007571e948..e18eb3aace 100644 --- a/src/test/perl/PostgreSQL/Test/Cluster.pm +++ b/src/test/perl/PostgreSQL/Test/Cluster.pm @@ -1112,7 +1112,7 @@ instead return a true or false value to indicate success or failure. =cut -sub start +sub _start { my ($self, %params) = @_; my $port = $self->port; @@ -1150,11 +1150,23 @@ sub start BAIL_OUT("pg_ctl start failed") unless $params{fail_ok}; return 0; } +} +sub start +{ + my ($self) = @_; + $self->_start; $self->_update_pid(1); return 1; } +sub start_with_enabled_shutdown +{ + my ($self) = @_; + $self->_start; + return 1; +} + =pod =item $node->kill9() 050_check_replication.pl Description: Perl program
Re: How to recover correctly master and replica using backup made by pg_basebackup?
On 07/11/2024 22:19, Evgeniy Ratkov wrote:Hello.I try to use pg_basebackup to make backup and recover master and replicafrom it.I recover master with action "promote". Next, I recover replica from thesamebackup with action "shutdown". After it, I start replica with configuredconnectionand replication slot to the master. I get error, because replica readsand appliesall WAL of irrelevant timeline, which contains logs after restore point.I wrote Perl script for pg_basebackup's TAP tests. It is attached asfile "050_check_replication.pl".It works on PostgreSQL 17, commit7d85d87f4d5c35fd5b2d38adaef63dfbfa542ccc afterapplying small patch, which adds possibility to start node with shutdownoption.Patch is attached as file "add_node_start_with_shutdown.diff".Script generates situation when timeline changes several times. Smallschema mayexplain it:Timeline 1: Create the first backup Create restore point 'rp_repl_bug'Timeline 1 -> 2: Recovery with promote using the first backup and 'rp_repl_bug' Create the second backup Create restore point 'rp_repl_bug2'Timeline 2 -> 3: Recovery with promote using the second backup and 'rp_repl_bug2'Timeline 1 -> 4: Recovery with promote using the first backup and 'rp_repl_bug'After it I get "0004.history" file, which looks strange:1 0/390 at restore point "rp_repl_bug"2 0/590 at restore point "rp_repl_bug2"3 0/390 at restore point "rp_repl_bug"Next, I try to restore replica from the first backup. Replica did notfind WAL fortimeline 4 and downloaded and applied WAL for timeline 2.cp: cannot stat '${ARCHIVE_PATH}/00040003': No such fileor directoryLOG: redo starts at 0/228cp: cannot stat '${ARCHIVE_PATH}/00040003': No such fileor directory'${ARCHIVE_PATH}/00020003' ->'${REPLICA_PGDATA}/pg_wal/RECOVERYXLOG'LOG: restored log file "00020003" from archiveLOG: completed backup recovery with redo LSN 0/228 and end LSN0/2000120LOG: consistent recovery state reached at 0/2000120LOG: recovery stopping at restore point "rp_repl_bug", time 2024-11-0522:00:16.037126+03Than, at the next replica's start (with configured connection to themaster andreplication slot) replica can not start receiving WAL from master,because itapplied all WAL from timeline 2 including all lines after LSN, where itshouldswitch timeline:LOG: entering standby modeLOG: redo starts at 0/228LOG: consistent recovery state reached at 0/390LOG: database system is ready to accept read-only connectionsLOG: started streaming WAL from primary at 0/400 on timeline 4FATAL: could not receive data from WAL stream: ERROR: requestedstarting point 0/400 is ahead of the WAL flush position of thisserver 0/3002000If I restart replica, it will not start:LOG: database system was shut down in recovery at 2024-11-05 22:00:18 MSKLOG: entering standby modeFATAL: requested timeline 4 is not a child of this server's historyDETAIL: Latest checkpoint is at 0/390 on timeline 2, but in thehistory of the requested timeline, the server forked off from thattimeline at 0/590.LOG: startup process (PID 2384880) exited with exit code 1Questions:1) Why does mirror not switch to the next timeline when it get LSN inWAL, whichthere is in the history file? And why does mirror apply all WAL?2) Why does 0004.history contain info about switching from 2 and 3timelineseven if the 4 timeline was switched from the first timeline?Thank you in advance. There is Heikki Linnakangas's patch:0003-Don-t-read-past-current-TLI-during-archive-recovery.patch fromhttps://www.postgresql.org/message-id/fe7af87d-b9bf-481a-902c-089c49aa911d%40iki.fi.It fixes my problem with recovery backup on standby. I refactored the test, which may show the problem. The test passedwith Heikki Linnakangas's patch. Without patch it fails with error:"FATAL: could not receive data from WAL stream: ERROR: requestedstarting point 0/400 is ahead of the WAL flush position of thisserver 0/3002000". The test is attached as "050_check_recovery_backup.pl". I addedthe test to "src/test/recovery/t" directory. Thank you. --Evgeniy Ratkov 050_check_recovery_backup.pl Description: Binary data
Re: BUG #18575: Sometimes pg_rewind mistakenly assumes that nothing needs to be done.
On 08/09/2024 15:26, Heikki Linnakangas wrote: > 2. Independently of pg_rewind: When you start PostgreSQL, it will first > try to recover all the WAL it has locally in pg_wal. That goes wrong if > you have set a recovery target TLI. For example, imagine this situation: > > - Recovery target TLI is 2, set explicitly in postgresql.conf > - The switchpoint from TLI 1 to 2 happened at WAL position 0/1510198 > (the switchpoint is found in 0002.history) > - There is a WAL file 00010001 under pg_wal, which > contains valid WAL up to 0/159 > > When you start the server, it will first recover all the WAL from > 00010001, up to 0/159. Then it will connect to the > primary to fetch mor WAL, but it will fail to make any progress because > it already blew past the switch point. > > It's obviously wrong to replay the WAL from timeline 1 beyond the 1->2 > switchpoint, when the recovery target is TLI 2. The attached > 0003-Don-t-read-past-current-TLI-during-archive-recovery.patch fixes > that. However, the logic to find the right WAL segment file and read the > WAL is extremely complicated, and I don't feel comfortable that I got > all the cases right. Review would be highly appreciated. > > The patch includes a test case to demonstrate the case, with no > pg_rewind. It does include one "manual" step to copy a timeline history > file into pg_wal, marked with XXX, however. So I'm not sure how possible > this scenario is in production setups . Hello, Heikki Linnakangas. Your patch 0003-Don-t-read-past-current-TLI-during-archive-recovery.patch fixes the problem with recovery backup on standby, which I described at https://www.postgresql.org/message-id/acf3141b-c78d-4f28-8e15-92ed8144331e%40arenadata.io This thread also contains the test, which may show the problem. Thank you.