On 3/18/26 08:43, Michael Paquier wrote:
On Tue, Mar 17, 2026 at 11:50:29AM -0700, Haibo Yan wrote:
Thank you for the clarification. I have now read the code, and
overall it looks good to me. I only had one very small comment.

(Bottom-posting note from above, please be careful.)

I was just wondering whether the following might be slightly clearer:
```
memset(controlFile, 0, PG_CONTROL_FILE_SIZE);
memcpy(controlFile, ControlFile, sizeof(ControlFileData));
```

Yeah, perhaps I am being too clever here. The reason why I like this pattern:

memset(controlFile + sizeof(ControlFileData), 0,
       PG_CONTROL_FILE_SIZE - sizeof(ControlFileData));

...is that valgrind will complain if the ControlFileData part is not completely initialized later on.

But your version is what is generally used in the code so I'm fine with doing it that way.

          {
              ControlFile->backupStartPoint = checkPoint.redo;
              ControlFile->backupEndRequired = backupEndRequired;
+            ControlFile->backupLabelRequired = false;

It sounds like it is going to be important to document the reason why
the flag is reset here (aka we don't need the backup_label file
anymore).

Good point -- I'll add that in the next revision.

+backup_control_file(uint8_t *controlFile)
+{
+    ControlFileData *controlData = ((ControlFileData *)controlFile);
+
+    memset(controlFile + sizeof(ControlFileData), 0,
+           PG_CONTROL_FILE_SIZE - sizeof(ControlFileData));
+
+    LWLockAcquire(ControlFileLock, LW_SHARED);
+    memcpy(controlFile, ControlFile, sizeof(ControlFileData));
+    LWLockRelease(ControlFileLock);
+
+    controlData->backupLabelRequired = true;
+
+    INIT_CRC32C(controlData->crc);
+    COMP_CRC32C(controlData->crc, controlFile, offsetof(ControlFileData, crc));
+    FIN_CRC32C(controlData->crc);

I was wondering if we should have an assertion at least to cross-check
that the contents we store in shared memory never go out-of-sync with
the on-disk contents, in the shape of a USE_ASSERT_CHECKING block that
calls get_controlfile() and memcmp()'s the contents between shmem and
the on-disk file, while the LWLock is taken.  We ship the control file
last on purpose, one reason being backups taken from standbys, so that
may be sensible to do.

As far as I can see this should always be true -- I audited all the

LWLockAcquire(ControlFileLock, LW_EXCLUSIVE)

sections and the file is always saved once if is updated. Let me see if I can add this check without too much pain, e.g. an additional parameter.

Another property of the new control file flag that is implied in the
implementation but not documented is that we should never check for
backupLabelRequired when a backup_label is gone.

I'm not sure what you mean here? That's exactly when we do want to check as below:

/*
    * No backup_label file has been found if we are here. Error if the
    * control file requires backup_label.
    */
if (ControlFile->backupLabelRequired)
    ereport(FATAL,
            (errmsg("could not find backup_label required for recovery"),
errhint("backup_label must be present for recovery to proceed")));


> Actually, the flag> is reset in InitWalRecovery() in the initial steps of recovery, and
the backup_label file is removed much later in StartupXLOG() just
*after* UpdateControlFile() to minimize the window where the contents
of the control file and the backup_label file is removed are
out-of-sync.  This window means that if we crash between the
completion of UpdateControlFile() and the durable_rename() we could
have a flag reset with a backup_label still around.  On restart,
recovery would fail, requiring a manual modification of the control
file, at least.  It sounds to me that this implementation detail
should be documented clearly.

I'll test it but I don't think this is the case. If backup_label is present then we'll just update pg_control again as we do now.

When backup_label is present the value of backupLabelRequired does not matter so it just follows the prior logic.

Finally, here is a general opinion.  I like this patch, and it is
basically risk-free for base backups taken with the replication
protocol as we update the control file with the new flag set
on-the-fly.

Glad to hear it!

Now, I am worried about backups that use pg_stop_backup().  Changing
backup APIs has always been a very sensitive area, and this is going
to require operators to update backup tools so as the control file
received as a result of pg_stop_backup() is copied, at the cost of
getting a failure if they don't do so.

Using the pg_control copy from pg_backup_stop() is entirely optional and nothing is broken if vendors decide not to use it. This can be demonstrated by applying the 01 patch without 02. In that case the tests in 042_low_level_backup continue to run. You can also apply 01 and 02 and revert the test changes in 042_low_level_backup and that works, too.

Vendors can use the new feature if they want the protection of backupLabelRequired and a guaranteed non-torn copy of pg_control but if they do nothing then nothing breaks.

I will *not* proceed with this
change without a clear approval from some more committers or senior
hackers that they like this change (approach previously suggested by
Andres, actually, for what I can see).

You are correct and this was an omission on my part. If this gets to commit we'll definitely want to mention that this flag was Andres' suggestion.

I am adding in CC a few
committers who have commented on this set of proposals and who have
touched the recovery code in the last few years, for awareness.
The timing is what it is, and we are at the end of a release cycle.
Let's see if we can reach a consensus of some kind.

Perfect. I'm looking forward to their input.

I'll hold off on a new patch version until we get some feedback since I don't think any of the requested changes are critical to the functionality.

Regards,
-David


Reply via email to