On Wed, Mar 18, 2026 at 03:52:02PM -0400, Tom Lane wrote:
> which makes me wonder whether the problematic session is the second or
> third bgworker.  I am not seeing entries indicating that those
> stopped, as there is for the first bgworker.
Looking at the logs produced at [1], the worker launched as number 1
cannot be the one interacting: it connects to the database postgres
under PID 1616001, and is reported as exited by the postmaster.  The
only sessions that could interact are:
1) The bgworker launched as number 2, connected to the database testdb.
2) The session checking pg_stat_activity, launched by launch_bgworker().
This second session was connected to the database we want to rename, so
it could count as an extra session; running its query while connected
to the database postgres would reduce the friction and rule it out.

The log timestamps show that it takes 5 seconds for this host to get
out of the ALTER DATABASE .. RENAME TO, which implies that we are
looping inside CountOtherDBBackends() for the full 5 seconds.  So it
really looks like the second bgworker is the one we are waiting for
here.

Now, we are sure of the following things at the point where the RENAME
TO is launched:
- The worker is visible in pg_stat_activity.
- The worker is already in worker_spi_main(), per its "LOG: initialized
  with" entry.
- The worker is connected to the database.
- The worker can receive signals.

How could this worker fail to receive the termination requests?  The
only explanation I can think of is that the postmaster does not get the
time to process the PMSIGNAL_BACKGROUND_WORKER_CHANGE requests.

The next step is to gather more data, I guess; the attached patch would
help with that.  If it turns out that we are able to send the requests
but the postmaster does not get the time to process them, I don't
really see what we can do except:
- Drop the portions of the tests for DROP DATABASE, SET TABLESPACE and
  RENAME DATABASE: all these scenarios involve commands that work on
  the same database the worker is connected to, and if the postmaster
  does not get the time to process the termination requests, I don't
  really see what we could do.
  This could also point to a timing issue in the feature itself, of
  course.
- Revert the feature, stop playing with the buildfarm this close to the
  end of the release cycle, and rework the feature for v20.

For now I am planning to apply the attached to get more information
from widowbird, which should take a few days at worst.  That should
make it clear whether we have a timing issue with the requests sent to
the postmaster.  Running the queries of worker_spi_launch() and
pg_stat_activity while connected to the database postgres may also
improve things, but I don't really buy it, even if I may be wrong.

[1]: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=widowbird&dt=2026-03-17%2015%3A35%3A03
--
Michael
From adb474866f12a66ac48c702d9ab25ce82277abca Mon Sep 17 00:00:00 2001
From: Michael Paquier <[email protected]>
Date: Thu, 19 Mar 2026 09:51:16 +0900
Subject: [PATCH] Add more debugging information for termination tests of
 worker_spi

---
 src/backend/postmaster/bgworker.c                     |  6 ++++++
 src/test/modules/worker_spi/t/002_worker_terminate.pl | 10 ++++++++--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 0104a86b9ecd..fd678ef2596d 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -1413,6 +1413,9 @@ TerminateBackgroundWorkersForDatabase(Oid databaseId)
 {
 	bool		signal_postmaster = false;
 
+	elog(DEBUG1, "attempting worker termination for database %u",
+		 databaseId);
+
 	LWLockAcquire(BackgroundWorkerLock, LW_EXCLUSIVE);
 
 	/*
@@ -1432,6 +1435,9 @@ TerminateBackgroundWorkersForDatabase(Oid databaseId)
 		{
 			slot->terminate = true;
 			signal_postmaster = true;
+
+			elog(DEBUG1, "termination requested for worker (PID %d) on database %u",
+				 slot->pid, databaseId);
 		}
 	}
 }
diff --git a/src/test/modules/worker_spi/t/002_worker_terminate.pl b/src/test/modules/worker_spi/t/002_worker_terminate.pl
index 6db80ffec88c..b0e6a5376d4c 100644
--- a/src/test/modules/worker_spi/t/002_worker_terminate.pl
+++ b/src/test/modules/worker_spi/t/002_worker_terminate.pl
@@ -24,7 +24,7 @@ sub launch_bgworker
 
 	# Launch a background worker on the given database.
 	my $pid = $node->safe_psql(
-		$database, qq(
+		'postgres', qq(
 		SELECT worker_spi_launch($testcase, '$database'::regdatabase, 0, '{}', $interruptible);
 	));
@@ -32,7 +32,7 @@ sub launch_bgworker
 	$node->wait_for_log(
 		qr/LOG: .*worker_spi dynamic worker $testcase initialized with .*\..*/,
 		$offset);
-	my $result = $node->safe_psql($database,
+	my $result = $node->safe_psql('postgres',
 		"SELECT count(*) > 0 FROM pg_stat_activity WHERE pid = $pid;");
 	is($result, 't', "dynamic bgworker $testcase launched");
@@ -52,6 +52,11 @@ sub run_bgworker_interruptible_test
 		qr/terminating background worker \"worker_spi dynamic\" due to administrator command/,
 		$offset);
 
+	# Postmaster entry reporting the worker as exiting.
+	$node->wait_for_log(
+		qr/LOG: .*background worker \"worker_spi dynamic\" \(PID $pid\) exited with exit code/,
+		$offset);
+
 	my $result = $node->safe_psql('postgres',
 		"SELECT count(*) = 0 FROM pg_stat_activity WHERE pid = $pid;");
 	is($result, 't', "dynamic bgworker stopped for $testname");
@@ -63,6 +68,7 @@ $node->append_conf(
 	"postgresql.conf", qq(
 autovacuum = off
 debug_parallel_query = off
+log_min_messages = debug1
 ));
 $node->start;
-- 
2.53.0
