Hi,

On Fri, Mar 20, 2026 at 11:29 PM SATYANARAYANA NARLAPURAM
<[email protected]> wrote:
>
> Do you think we need different GUCs for catalog_xmin and xmin? If table bloat 
> is a concern (not catalog bloat), then logical slots are not required to 
> invalidate unless the cluster is close to wraparound.

IMO the main purpose of max_slot_xid_age is to prevent XID wraparound.
For bloat, I still think max_slot_wal_keep_size is the better choice.

Where max_slot_xid_age is really useful is when vacuum can't
freeze because a replication slot (physical or logical) is holding
back the XID horizon and the system is getting close to wraparound.
Invalidating such a slot clears the way for vacuum. Setting
max_slot_xid_age above vacuum_failsafe_age allows vacuum to waste
cycles scanning tables it cannot freeze. Keeping max_slot_xid_age <=
vacuum_failsafe_age (default 1.6B) prevents this by invalidating the
slot before vacuum effort is wasted.

As far as XID wraparound is concerned, both xmin and catalog_xmin need
to be treated similarly. Either one can hold back freezing and push
the system toward wraparound. So I don't think we need separate GUCs
for xmin and catalog_xmin unless I'm missing something. One GUC
covering both keeps things simple.

>> I made the following design choice: try invalidating only once per
>> vacuum cycle, not per table. While this keeps the cost of checking
>> (incl. the XidGenLock contention) for invalidation to a minimum when
>> there are a large number of tables and replication slots, it can be
>> less effective when individual tables/indexes are large. Invalidating
>> during checkpoints can help to some extent with the large table/index
>> cases. But I'm open to thoughts on this.
>
> It may not solve the intent when the vacuum cycle is longer, which one can 
> expect on a large database particularly when there is heavy bloat.

This design choice boils down to whether a database instance has
1/ a large number of small tables or 2/ large tables. From my
experience, I have seen both cases but mostly case 2 (others can
correct me). In this context, having an XID age based slot
invalidation check once per relation makes sense. However, I'm open to
more thoughts here.

>> Please find the attached patch for further review. I fixed the XID age
>> calculation in ReplicationSlotIsXIDAged and adjusted the code
>> comments.
>
> I applied the patch and all the tests passed. A few comments:

Thank you for reviewing the patch.

> @@ -495,7 +525,7 @@ vacuum(List *relations, const VacuumParams params, 
> BufferAccessStrategy bstrateg
>     MemoryContext vac_context, bool isTopLevel)
>  {
>   static bool in_vacuum = false;
> -
> + static bool first_time = true;
>
> first_time variable is not self explanatory, maybe something like 
> try_replication_slot_invalidation and add comments that it will be set to 
> false after the first check?

+1. Changed the variable name and simplified the comments around it.

> + if (TransactionIdIsValid(xmin))
> + appendStringInfo(&err_detail, _("The slot's xmin %u exceeds the maximum xid 
> age %d specified by \"max_slot_xid_age\"."),
> + xmin,
> + max_slot_xid_age);
>
> Slot invalidates even when the age is max_slot_xid_age, isn't it?

Nice catch! I changed it to use TransactionIdPrecedes, so a slot is
invalidated only when its XID age exceeds max_slot_xid_age. This
matches the error message above and the two existing XID age GUCs
(autovacuum_freeze_max_age, vacuum_failsafe_age).
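
To make the boundary case concrete, here is a minimal standalone sketch
of the cutoff arithmetic. It is a model, not the patch code: the names
xid_precedes, xid_is_normal and slot_is_xid_aged are made up for
illustration, and next_xid is passed in as a parameter instead of being
read from shared state. It assumes the usual modulo-2^32 comparison for
normal XIDs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Standalone model of 32-bit XIDs. The type name mirrors the server's,
 * but everything is redefined here so the sketch compiles on its own.
 */
typedef uint32_t TransactionId;

#define FIRST_NORMAL_XID ((TransactionId) 3)

/* Modulo-2^32 comparison for normal XIDs: true if a is logically older than b. */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
	return (int32_t) (a - b) < 0;
}

/* XIDs below 3 are reserved (invalid, bootstrap, frozen). */
static bool
xid_is_normal(TransactionId xid)
{
	return xid >= FIRST_NORMAL_XID;
}

/*
 * A slot counts as aged only if its xmin or catalog_xmin is strictly
 * older than (next_xid - max_age). max_age == 0 disables the check.
 */
static bool
slot_is_xid_aged(TransactionId next_xid, int max_age,
				 TransactionId xmin, TransactionId catalog_xmin)
{
	TransactionId cutoff;

	assert(max_age >= 0 && max_age <= 2000000000);

	if (max_age == 0)
		return false;

	cutoff = next_xid - (TransactionId) max_age;

	/* keep the cutoff a normal XID; this may move it back by up to 3 */
	if (cutoff < FIRST_NORMAL_XID)
		cutoff -= FIRST_NORMAL_XID;

	return (xid_is_normal(xmin) && xid_precedes(xmin, cutoff)) ||
		(xid_is_normal(catalog_xmin) && xid_precedes(catalog_xmin, cutoff));
}
```

With next_xid = 1000 and max_age = 500, an xmin of 500 (age exactly
500) is not considered aged, while an xmin of 499 is. The same holds
for catalog_xmin, and the comparison keeps working when next_xid has
wrapped past the slot's XID.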

Please find the attached v2 patch for further review. Thank you!

--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com
From b458e4c46dc87793da49e9c529e412bc0934cdff Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <[email protected]>
Date: Mon, 23 Mar 2026 13:14:47 +0000
Subject: [PATCH v2] Add XID age based replication slot invalidation

Introduce max_slot_xid_age, a GUC that invalidates replication
slots whose xmin or catalog_xmin exceeds the specified age.
Disabled by default.

Idle or forgotten replication slots can hold back vacuum, leading
to bloat and eventually XID wraparound. In the worst case this
requires dropping the slot and single-user mode vacuuming. This
setting avoids that by proactively invalidating slots that have
fallen too far behind.

Invalidation checks are performed during checkpoint and vacuum
(including autovacuum).
---
 doc/src/sgml/config.sgml                      |  26 ++
 doc/src/sgml/system-views.sgml                |   8 +
 src/backend/access/transam/xlog.c             |   4 +-
 src/backend/commands/vacuum.c                 |  50 +++-
 src/backend/replication/slot.c                |  93 ++++++-
 src/backend/utils/misc/guc_parameters.dat     |   8 +
 src/backend/utils/misc/postgresql.conf.sample |   2 +
 src/include/replication/slot.h                |   7 +-
 src/test/recovery/meson.build                 |   1 +
 .../t/060_invalidate_xid_aged_slots.pl        | 240 ++++++++++++++++++
 10 files changed, 432 insertions(+), 7 deletions(-)
 create mode 100644 src/test/recovery/t/060_invalidate_xid_aged_slots.pl

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 8cdd826fbd3..0cbcb254300 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4764,6 +4764,32 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"'  # Windows
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-max-slot-xid-age" xreflabel="max_slot_xid_age">
+      <term><varname>max_slot_xid_age</varname> (<type>integer</type>)
+      <indexterm>
+       <primary><varname>max_slot_xid_age</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Invalidate replication slots whose <literal>xmin</literal> (the oldest
+        transaction that this slot needs the database to retain) or
+        <literal>catalog_xmin</literal> (the oldest transaction affecting the
+        system catalogs that this slot needs the database to retain) has exceeded
+        the age specified by this setting. A value of zero (which is the default)
+        disables this feature. Users can set this value anywhere from zero to
+        two billion. This parameter can only be set in the
+        <filename>postgresql.conf</filename> file or on the server command
+        line.
+       </para>
+
+       <para>
+        This invalidation check happens during vacuum (including
+        autovacuum) and during checkpoint.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout">
       <term><varname>wal_sender_timeout</varname> (<type>integer</type>)
       <indexterm>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 9ee1a2bfc6a..1a507b430f9 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -3102,6 +3102,14 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
           <xref linkend="guc-idle-replication-slot-timeout"/> duration.
          </para>
         </listitem>
+        <listitem>
+         <para>
+          <literal>xid_aged</literal> means that the slot's
+          <literal>xmin</literal> or <literal>catalog_xmin</literal>
+          has reached the age specified by
+          <xref linkend="guc-max-slot-xid-age"/> parameter.
+         </para>
+        </listitem>
        </itemizedlist>
       </para></entry>
      </row>
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f5c9a34374d..a87aa9c2bea 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7441,7 +7441,7 @@ CreateCheckPoint(int flags)
 	 */
 	XLByteToSeg(RedoRecPtr, _logSegNo, wal_segment_size);
 	KeepLogSeg(recptr, &_logSegNo);
-	if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
+	if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT | RS_INVAL_XID_AGE,
 										   _logSegNo, InvalidOid,
 										   InvalidTransactionId))
 	{
@@ -7898,7 +7898,7 @@ CreateRestartPoint(int flags)
 
 	INJECTION_POINT("restartpoint-before-slot-invalidation", NULL);
 
-	if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT,
+	if (InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_REMOVED | RS_INVAL_IDLE_TIMEOUT | RS_INVAL_XID_AGE,
 										   _logSegNo, InvalidOid,
 										   InvalidTransactionId))
 	{
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index bce3a2daa24..b9f3b657d87 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -48,6 +48,7 @@
 #include "postmaster/autovacuum.h"
 #include "postmaster/bgworker_internals.h"
 #include "postmaster/interrupt.h"
+#include "replication/slot.h"
 #include "storage/bufmgr.h"
 #include "storage/lmgr.h"
 #include "storage/pmsignal.h"
@@ -131,6 +132,7 @@ static bool vacuum_rel(Oid relid, RangeVar *relation, VacuumParams params,
 static double compute_parallel_delay(void);
 static VacOptValue get_vacoptval_from_boolean(DefElem *def);
 static bool vac_tid_reaped(ItemPointer itemptr, void *state);
+static void try_replication_slot_invalidation(void);
 
 /*
  * GUC check function to ensure GUC value specified is within the allowable
@@ -468,6 +470,34 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
 	MemoryContextDelete(vac_context);
 }
 
+/*
+ * Try invalidating replication slots based on current replication slot xmin
+ * limits once every vacuum cycle.
+ */
+static void
+try_replication_slot_invalidation(void)
+{
+	TransactionId min_slot_xmin = InvalidTransactionId;
+	TransactionId min_slot_catalog_xmin = InvalidTransactionId;
+
+	if (max_slot_xid_age == 0)
+		return;
+
+	ProcArrayGetReplicationSlotXmin(&min_slot_xmin, &min_slot_catalog_xmin);
+
+	if (ReplicationSlotIsXIDAged(min_slot_xmin, min_slot_catalog_xmin))
+	{
+		/*
+		 * Note that InvalidateObsoleteReplicationSlots is also called as part
+		 * of CHECKPOINT, and emitting ERRORs from within is avoided already.
+		 * Therefore, there is no concern here that any ERROR from
+		 * invalidating replication slots blocks VACUUM.
+		 */
+		InvalidateObsoleteReplicationSlots(RS_INVAL_XID_AGE, 0,
+										   InvalidOid, InvalidTransactionId);
+	}
+}
+
 /*
  * Internal entry point for autovacuum and the VACUUM / ANALYZE commands.
  *
@@ -495,7 +525,7 @@ vacuum(List *relations, const VacuumParams params, BufferAccessStrategy bstrateg
 	   MemoryContext vac_context, bool isTopLevel)
 {
 	static bool in_vacuum = false;
-
+	static bool try_slot_invalidation = true;
 	const char *stmttype;
 	volatile bool in_outer_xact,
 				use_own_xacts;
@@ -608,6 +638,24 @@ vacuum(List *relations, const VacuumParams params, BufferAccessStrategy bstrateg
 		CommitTransactionCommand();
 	}
 
+	if (params.options & VACOPT_VACUUM)
+	{
+		/*
+		 * Try to invalidate replication slots whose xmin or catalog_xmin
+		 * exceeds max_slot_xid_age. We do this once per vacuum cycle, not
+		 * per-relation. Checkpoints also perform this check.
+		 *
+		 * Each autovacuum worker checks only on its first vacuum() call.
+		 *
+		 * VACUUM command always checks.
+		 */
+		if (try_slot_invalidation)
+			try_replication_slot_invalidation();
+
+		if (AmAutoVacuumWorkerProcess())
+			try_slot_invalidation = false;
+	}
+
 	/* Turn vacuum cost accounting on or off, and set/clear in_vacuum */
 	PG_TRY();
 	{
diff --git a/src/backend/replication/slot.c b/src/backend/replication/slot.c
index a9092fc2382..2bf1d09da85 100644
--- a/src/backend/replication/slot.c
+++ b/src/backend/replication/slot.c
@@ -117,6 +117,7 @@ static const SlotInvalidationCauseMap SlotInvalidationCauses[] = {
 	{RS_INVAL_HORIZON, "rows_removed"},
 	{RS_INVAL_WAL_LEVEL, "wal_level_insufficient"},
 	{RS_INVAL_IDLE_TIMEOUT, "idle_timeout"},
+	{RS_INVAL_XID_AGE, "xid_aged"},
 };
 
 /*
@@ -158,6 +159,12 @@ int			max_replication_slots = 10; /* the maximum number of replication
  */
 int			idle_replication_slot_timeout_secs = 0;
 
+/*
+ * Invalidate replication slots whose xmin or catalog_xmin is older
+ * than the specified age; '0' disables it.
+ */
+int			max_slot_xid_age = 0;
+
 /*
  * This GUC lists streaming replication standby server slot names that
  * logical WAL sender processes will wait for.
@@ -1780,7 +1787,9 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
 					   XLogRecPtr restart_lsn,
 					   XLogRecPtr oldestLSN,
 					   TransactionId snapshotConflictHorizon,
-					   long slot_idle_seconds)
+					   long slot_idle_seconds,
+					   TransactionId xmin,
+					   TransactionId catalog_xmin)
 {
 	StringInfoData err_detail;
 	StringInfoData err_hint;
@@ -1825,6 +1834,34 @@ ReportSlotInvalidation(ReplicationSlotInvalidationCause cause,
 								 "idle_replication_slot_timeout");
 				break;
 			}
+
+		case RS_INVAL_XID_AGE:
+			{
+				Assert(TransactionIdIsValid(xmin) || TransactionIdIsValid(catalog_xmin));
+
+				if (TransactionIdIsValid(xmin))
+				{
+					/* translator: %s is a GUC variable name */
+					appendStringInfo(&err_detail, _("The slot's xmin age %u exceeds the age %d specified by \"%s\"."),
+									 xmin,
+									 max_slot_xid_age,
+									 "max_slot_xid_age");
+				}
+				else
+				{
+					/* translator: %s is a GUC variable name */
+					appendStringInfo(&err_detail, _("The slot's catalog_xmin age %u exceeds the age %d specified by \"%s\"."),
+									 catalog_xmin,
+									 max_slot_xid_age,
+									 "max_slot_xid_age");
+				}
+
+				/* translator: %s is a GUC variable name */
+				appendStringInfo(&err_hint, _("You might need to increase \"%s\"."),
+								 "max_slot_xid_age");
+				break;
+			}
+
 		case RS_INVAL_NONE:
 			pg_unreachable();
 	}
@@ -1945,6 +1982,16 @@ DetermineSlotInvalidationCause(uint32 possible_causes, ReplicationSlot *s,
 		}
 	}
 
+	if (possible_causes & RS_INVAL_XID_AGE)
+	{
+		/*
+		 * Safe since we hold the replication slot's spinlock needed to avoid
+		 * race conditions
+		 */
+		if (ReplicationSlotIsXIDAged(s->data.xmin, s->data.catalog_xmin))
+			return RS_INVAL_XID_AGE;
+	}
+
 	return RS_INVAL_NONE;
 }
 
@@ -2112,7 +2159,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes,
 				ReportSlotInvalidation(invalidation_cause, true, active_pid,
 									   slotname, restart_lsn,
 									   oldestLSN, snapshotConflictHorizon,
-									   slot_idle_secs);
+									   slot_idle_secs, s->data.xmin, s->data.catalog_xmin);
 
 				if (MyBackendType == B_STARTUP)
 					(void) SignalRecoveryConflict(GetPGProcByNumber(active_proc),
@@ -2165,7 +2212,7 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes,
 			ReportSlotInvalidation(invalidation_cause, false, active_pid,
 								   slotname, restart_lsn,
 								   oldestLSN, snapshotConflictHorizon,
-								   slot_idle_secs);
+								   slot_idle_secs, s->data.xmin, s->data.catalog_xmin);
 
 			/* done with this slot for now */
 			break;
@@ -2192,6 +2239,8 @@ InvalidatePossiblyObsoleteSlot(uint32 possible_causes,
  *   logical.
  * - RS_INVAL_IDLE_TIMEOUT: has been idle longer than the configured
  *   "idle_replication_slot_timeout" duration.
+ * - RS_INVAL_XID_AGE: slot's xmin or catalog_xmin is older than the
+ *   configured "max_slot_xid_age".
  *
  * Note: This function attempts to invalidate the slot for multiple possible
  * causes in a single pass, minimizing redundant iterations. The "cause"
@@ -3275,3 +3324,41 @@ WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn)
 
 	ConditionVariableCancelSleep();
 }
+
+/*
+ * Return true if the given xmin or catalog_xmin is older than the
+ * age specified by max_slot_xid_age.
+ */
+bool
+ReplicationSlotIsXIDAged(TransactionId xmin, TransactionId catalog_xmin)
+{
+	TransactionId cutoff;
+	TransactionId curr;
+	bool		is_aged = false;
+
+	if (max_slot_xid_age == 0)
+		return false;
+
+	curr = ReadNextTransactionId();
+
+	/*
+	 * Calculate oldest XID a slot's xmin or catalog_xmin can have before they
+	 * are invalidated.
+	 */
+	cutoff = curr - max_slot_xid_age;
+
+	/* ensure it's a "normal" XID, else TransactionIdPrecedes misbehaves */
+	/* this can cause the limit to go backwards by 3, but that's OK */
+	if (cutoff < FirstNormalTransactionId)
+		cutoff -= FirstNormalTransactionId;
+
+	if (TransactionIdIsNormal(xmin) &&
+		TransactionIdPrecedes(xmin, cutoff))
+		is_aged = true;
+
+	if (TransactionIdIsNormal(catalog_xmin) &&
+		TransactionIdPrecedes(catalog_xmin, cutoff))
+		is_aged = true;
+
+	return is_aged;
+}
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 0c9854ad8fc..7fab02f1eeb 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -2049,6 +2049,14 @@
   max => 'MAX_KILOBYTES',
 },
 
+{ name => 'max_slot_xid_age', type => 'int', context => 'PGC_SIGHUP', group => 'REPLICATION_SENDING',
+  short_desc => 'Age of the transaction ID at which a replication slot gets invalidated.',
+  variable => 'max_slot_xid_age',
+  boot_val => '0',
+  min => '0',
+  max => '2000000000',
+},
+
 # We use the hopefully-safely-small value of 100kB as the compiled-in
 # default for max_stack_depth.  InitializeGUCOptions will increase it
 # if possible, depending on the actual platform-specific stack limit.
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index e4abe6c0077..0f728d87b6c 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -351,6 +351,8 @@
 #wal_keep_size = 0              # in megabytes; 0 disables
 #max_slot_wal_keep_size = -1    # in megabytes; -1 disables
 #idle_replication_slot_timeout = 0      # in seconds; 0 disables
+#max_slot_xid_age = 0           # maximum XID age before a replication slot
+                                # gets invalidated; 0 disables
 #wal_sender_timeout = 60s       # in milliseconds; 0 disables
 #track_commit_timestamp = off   # collect timestamp of transaction commit
                                 # (change requires restart)
diff --git a/src/include/replication/slot.h b/src/include/replication/slot.h
index 4b4709f6e2c..dce3f0f51b0 100644
--- a/src/include/replication/slot.h
+++ b/src/include/replication/slot.h
@@ -66,10 +66,12 @@ typedef enum ReplicationSlotInvalidationCause
 	RS_INVAL_WAL_LEVEL = (1 << 2),
 	/* idle slot timeout has occurred */
 	RS_INVAL_IDLE_TIMEOUT = (1 << 3),
+	/* slot's xmin or catalog_xmin has reached max xid age */
+	RS_INVAL_XID_AGE = (1 << 4),
 } ReplicationSlotInvalidationCause;
 
 /* Maximum number of invalidation causes */
-#define	RS_INVAL_MAX_CAUSES 4
+#define	RS_INVAL_MAX_CAUSES 5
 
 /*
  * When the slot synchronization worker is running, or when
@@ -326,6 +328,7 @@ extern PGDLLIMPORT ReplicationSlot *MyReplicationSlot;
 extern PGDLLIMPORT int max_replication_slots;
 extern PGDLLIMPORT char *synchronized_standby_slots;
 extern PGDLLIMPORT int idle_replication_slot_timeout_secs;
+extern PGDLLIMPORT int max_slot_xid_age;
 
 /* shmem initialization functions */
 extern Size ReplicationSlotsShmemSize(void);
@@ -387,4 +390,6 @@ extern bool SlotExistsInSyncStandbySlots(const char *slot_name);
 extern bool StandbySlotsHaveCaughtup(XLogRecPtr wait_for_lsn, int elevel);
 extern void WaitForStandbyConfirmation(XLogRecPtr wait_for_lsn);
 
+extern bool ReplicationSlotIsXIDAged(TransactionId xmin, TransactionId catalog_xmin);
+
 #endif							/* SLOT_H */
diff --git a/src/test/recovery/meson.build b/src/test/recovery/meson.build
index 36d789720a3..31bafcc6c07 100644
--- a/src/test/recovery/meson.build
+++ b/src/test/recovery/meson.build
@@ -61,6 +61,7 @@ tests += {
       't/050_redo_segment_missing.pl',
       't/051_effective_wal_level.pl',
       't/052_checkpoint_segment_missing.pl',
+      't/060_invalidate_xid_aged_slots.pl',
     ],
   },
 }
diff --git a/src/test/recovery/t/060_invalidate_xid_aged_slots.pl b/src/test/recovery/t/060_invalidate_xid_aged_slots.pl
new file mode 100644
index 00000000000..f1e8f003f95
--- /dev/null
+++ b/src/test/recovery/t/060_invalidate_xid_aged_slots.pl
@@ -0,0 +1,240 @@
+# Copyright (c) 2025, PostgreSQL Global Development Group
+
+# Test for replication slots invalidation due to XID age
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::BackgroundPsql;
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+# Wait for slot to first become inactive and then get invalidated
+sub wait_for_slot_invalidation
+{
+	my ($node, $slot_name, $reason) = @_;
+	my $name = $node->name;
+
+	# Wait for the inactive replication slot to be invalidated
+	$node->poll_query_until(
+		'postgres', qq[
+		SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+			WHERE slot_name = '$slot_name' AND
+			invalidation_reason = '$reason';
+	])
+	  or die
+	  "Timed out while waiting for inactive slot $slot_name to be invalidated on node $name";
+}
+
+# Do some work for advancing xids on a given node
+sub advance_xids
+{
+	my ($node, $table_name) = @_;
+
+	$node->safe_psql(
+		'postgres', qq[
+		do \$\$
+		begin
+		for i in 10000..11000 loop
+			-- use an exception block so that each iteration eats an XID
+			begin
+			insert into $table_name values (i);
+			exception
+			when division_by_zero then null;
+			end;
+		end loop;
+		end\$\$;
+	]);
+}
+
+# =============================================================================
+# Testcase start: Invalidate streaming standby's slot due to max_slot_xid_age
+# GUC.
+
+# Initialize primary node
+my $primary = PostgreSQL::Test::Cluster->new('primary');
+$primary->init(allows_streaming => 'logical');
+
+# Configure primary with XID age settings
+$primary->append_conf(
+	'postgresql.conf', qq{
+max_slot_xid_age = 500
+});
+
+$primary->start;
+
+# Take a backup for creating standby
+my $backup_name = 'backup';
+$primary->backup($backup_name);
+
+# Create a standby linking to the primary using the replication slot
+my $standby = PostgreSQL::Test::Cluster->new('standby');
+$standby->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+# Enable hs_feedback. The slot should gain an xmin. We set the status interval
+# so we'll see the results promptly.
+$standby->append_conf(
+	'postgresql.conf', q{
+primary_slot_name = 'sb_slot'
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+max_standby_streaming_delay = 3600000
+});
+
+$primary->safe_psql(
+	'postgres', qq[
+    SELECT pg_create_physical_replication_slot(slot_name := 'sb_slot', immediately_reserve := true);
+]);
+
+$standby->start;
+
+# Create some content on primary to move xmin
+$primary->safe_psql('postgres',
+	"CREATE TABLE tab_int AS SELECT generate_series(1,10) AS a");
+
+# Wait until standby has replayed enough data
+$primary->wait_for_catchup($standby);
+
+$primary->poll_query_until(
+	'postgres', qq[
+	SELECT (xmin IS NOT NULL) OR (catalog_xmin IS NOT NULL)
+		FROM pg_catalog.pg_replication_slots
+		WHERE slot_name = 'sb_slot';
+]) or die "Timed out waiting for slot sb_slot xmin to advance";
+
+# Stop standby to make the replication slot's xmin on primary to age
+
+# Read on standby that causes xmin to be held on slot
+my $standby_session = $standby->interactive_psql('postgres');
+$standby_session->query("BEGIN; SET default_transaction_isolation = 'repeatable read'; SELECT * FROM tab_int;");
+
+#$standby->stop;
+
+# Do some work to advance xids on primary
+advance_xids($primary, 'tab_int');
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# XID age.
+$primary->safe_psql('postgres', "CHECKPOINT");
+wait_for_slot_invalidation($primary, 'sb_slot', 'xid_aged');
+
+$standby_session->quit;
+$standby->stop;
+
+# Testcase end: Invalidate streaming standby's slot due to max_slot_xid_age
+# GUC.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Invalidate logical subscriber's slot due to max_slot_xid_age
+# GUC.
+
+# Create a subscriber node
+my $subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$subscriber->init(allows_streaming => 'logical');
+$subscriber->start;
+
+# Create tables on both primary and subscriber
+$primary->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+$subscriber->safe_psql('postgres', "CREATE TABLE test_tbl (id int)");
+
+# Insert some initial data
+$primary->safe_psql('postgres',
+	"INSERT INTO test_tbl VALUES (generate_series(1, 5));");
+
+# Setup logical replication
+my $publisher_connstr = $primary->connstr . ' dbname=postgres';
+$primary->safe_psql('postgres',
+	"CREATE PUBLICATION pub FOR TABLE test_tbl");
+
+$subscriber->safe_psql('postgres',
+	"CREATE SUBSCRIPTION sub CONNECTION '$publisher_connstr' PUBLICATION pub WITH (slot_name = 'lsub_slot')"
+);
+
+# Wait for initial sync to complete
+$subscriber->wait_for_subscription_sync($primary, 'sub');
+
+my $result = $subscriber->safe_psql('postgres', "SELECT count(*) FROM test_tbl");
+is($result, qq(5), "check initial copy was done for logical replication");
+
+# Wait for the logical slot to get catalog_xmin (logical slots use catalog_xmin, not xmin)
+$primary->poll_query_until(
+	'postgres', qq[
+	SELECT xmin IS NULL AND catalog_xmin IS NOT NULL
+	FROM pg_catalog.pg_replication_slots
+	WHERE slot_name = 'lsub_slot';
+]) or die "Timed out waiting for slot lsub_slot catalog_xmin to advance";
+
+# Stop subscriber to make the replication slot on primary inactive
+$subscriber->stop;
+
+# Do some work to advance xids on primary
+advance_xids($primary, 'test_tbl');
+
+# Wait for the replication slot to become inactive and then invalidated due to
+# XID age.
+$primary->safe_psql('postgres', "CHECKPOINT");
+wait_for_slot_invalidation($primary, 'lsub_slot', 'xid_aged');
+
+# Testcase end: Invalidate logical subscriber's slot due to max_slot_xid_age
+# GUC.
+# =============================================================================
+
+# =============================================================================
+# Testcase start: Test VACUUM command triggering slot invalidation
+#
+
+# Create another physical replication slot for VACUUM test
+$primary->safe_psql(
+	'postgres', qq[
+    SELECT pg_create_physical_replication_slot(slot_name := 'vacuum_test_slot', immediately_reserve := true);
+]);
+
+# Create a new standby for this test
+my $standby_vacuum = PostgreSQL::Test::Cluster->new('standby_vacuum');
+$standby_vacuum->init_from_backup($primary, $backup_name, has_streaming => 1);
+
+$standby_vacuum->append_conf(
+	'postgresql.conf', q{
+primary_slot_name = 'vacuum_test_slot'
+hot_standby_feedback = on
+wal_receiver_status_interval = 1
+});
+
+$standby_vacuum->start;
+
+# Wait until standby has replayed enough data and slot gets xmin
+$primary->wait_for_catchup($standby_vacuum);
+
+$primary->poll_query_until(
+	'postgres', qq[
+	SELECT (xmin IS NOT NULL) OR (catalog_xmin IS NOT NULL)
+		FROM pg_catalog.pg_replication_slots
+		WHERE slot_name = 'vacuum_test_slot';
+]) or die "Timed out waiting for slot vacuum_test_slot xmin to advance";
+
+# Stop standby to make the replication slot's xmin on primary to age
+$standby_vacuum->stop;
+
+# Do some work to advance xids on primary
+advance_xids($primary, 'tab_int');
+
+# Use VACUUM to trigger slot invalidation (instead of CHECKPOINT)
+# This tests that VACUUM command can trigger XID age invalidation
+$primary->safe_psql('postgres', "VACUUM");
+
+# Wait for the replication slot to become invalidated due to XID age triggered by VACUUM
+$primary->poll_query_until(
+	'postgres', qq[
+	SELECT COUNT(slot_name) = 1 FROM pg_replication_slots
+		WHERE slot_name = 'vacuum_test_slot' AND
+		invalidation_reason = 'xid_aged';
+])
+  or die "Timed out while waiting for slot vacuum_test_slot to be invalidated by VACUUM";
+
+# Testcase end: Test VACUUM command triggering slot invalidation
+# =============================================================================
+
+ok(1, "all XID age invalidation tests completed successfully");
+
+done_testing();
-- 
2.47.3
