Hi All,

I ran into this issue when pg_upgrade-ing a DB with PostGIS. This is my
first code patch, so any feedback on the approach will be appreciated!

The problem:
============
pg_upgrade uses pg_dump --schema-only --binary-upgrade to copy the schema
from $oldcluster to $newcluster. Because this excludes all table data, it
leaves out data in extension config tables registered with
pg_extension_config_dump().

In $newcluster, binary_upgrade_create_empty_extension() creates the
extensions without populating any table data. The extensions' CREATE
EXTENSION scripts never get executed so any INSERTs are skipped. As a
consequence, if any CREATE TABLE statement in $newcluster requires
validation against these empty config tables, the upgrade fails. As an
example,
PostGIS registers config table spatial_ref_sys to hold ~8500 spatial
reference system definitions (SRIDs). When a table has, e.g. a geometry
column that specifies an SRID, this gets validated during the CREATE TABLE:

CREATE TABLE points (id int, location geometry(Point, 27700));
ERROR:  Cannot find SRID (27700) in spatial_ref_sys

This will happen for any SRID-constrained column, which will prevent many
real-world PostGIS deployments from being able to pg_upgrade. To summarise
the problem, our ordering is wrong here because extension configuration
data must be present before user tables that depend on it get created, but
--schema-only strips this data.

The patch:
==========
We are adding a new dump object type DO_EXTENSION_DATA that dumps extension
config table data in SECTION_PRE_DATA during --binary-upgrade ONLY. This
restores the needed data between extension creation and user object
creation, allowing the DDL to succeed.

Four files are modified in bin/pg_dump:

pg_dump.h:
Add DO_EXTENSION_DATA to the DumpableObjectType enum, between DO_EXTENSION
and DO_TYPE

pg_dump_sort.c:
Add PRIO_EXTENSION_DATA between PRIO_EXTENSION and PRIO_TYPE

pg_dump.c:
1. Add makeExtensionDataInfo() to create a TableDataInfo with objType =
DO_EXTENSION_DATA. Called for plain tables (RELKIND_RELATION) during
--binary-upgrade ONLY. As it depends on the table def, the COPY will be
emitted after the CREATE TABLE.
2. Add dumpExtensionData() to emit the entry in SECTION_PRE_DATA with
description "EXTENSION DATA" using dumpTableData_copy(). This allows the
config table data to go into the schema-only dump.
3. In processExtensionTables(), when dopt->binary_upgrade is true, call
makeExtensionDataInfo() instead of makeTableDataInfo(). Additionally, skip
extcondition filter because we need to dump all rows here.
4. Include DO_EXTENSION_DATA in pre-data boundary in
addBoundaryDependencies()

pg_backup_archiver.c:
Add "EXTENSION DATA" to the whitelist in _tocEntryRequired() similar to
BLOB, BLOB METADATA, etc. to include extension config table data in
--schema-only dumps during --binary-upgrade ONLY.

What ends up happening:
=======================
The inserted rows are basically scaffolding to allow the upgrade, and do
not persist. The pg_upgrade sequence goes like:
1. pg_dump includes $oldcluster extension config data in schema-only dump
2. pg_restore replays the dump into $newcluster and "EXTENSION DATA"
entries populate tables like spatial_ref_sys with COPY. Subsequent CREATE
TABLEs with e.g. SRID-constrained columns pass validation.
3. pg_upgrade transfers all data files from $oldcluster to $newcluster,
making spatial_ref_sys byte-for-byte identical to its previous state.

This patch:
1. Does NOT affect normal pg_dumps (without --binary-upgrade).
DO_EXTENSION_DATA objects are not created in this case.
2. Leaves binary_upgrade_create_empty_extension() unchanged.
3. Is not PostGIS-specific, and should solve this class of problem for any
extension that registers config tables that will be needed for DDL
validation.
4. Has been tested against HEAD at 29bf4ee7496 with $oldcluster PostGIS
3.3.9 on PG14 and $newcluster PostGIS 3.7.0dev/master on PG19-devel.

Thanks in advance for your review! Please find attached the patch for HEAD.
I believe this should be easily backpatchable to (at least) PG15, and will
be happy to work on backports.

Best regards,
Jimmy
From 29019b8638b28a487b66c7ad01893a2de40ff79d Mon Sep 17 00:00:00 2001
From: Jimmy Angelakos <[email protected]>
Date: Fri, 20 Mar 2026 16:04:45 +0000
Subject: [PATCH] pg_dump: Restore extension config table data before user
 objects during binary upgrade

pg_upgrade uses pg_dump --schema-only --binary-upgrade, which excludes
all table data including extension configuration tables registered via
pg_extension_config_dump(). Since binary_upgrade_create_empty_extension()
does not populate these tables, any user table whose CREATE TABLE
triggers validation against config data will fail.

For example, PostGIS tables with SRID-constrained geometry/geography
columns fail because spatial_ref_sys is empty during schema restore.

Fix by introducing a new dump object type DO_EXTENSION_DATA that dumps
extension config table data into SECTION_PRE_DATA during binary upgrade.
This puts the data restore between extension creation and user object
creation, allowing DDL-time validation to succeed. The data is
scaffolding: it is overwritten when pg_upgrade transfers the old
cluster's data files to the new cluster.

This is not PostGIS-specific and applies to any extension that registers
config tables via pg_extension_config_dump() where that data is needed
for DDL-time validation.
---
 src/bin/pg_dump/pg_backup_archiver.c |   2 +
 src/bin/pg_dump/pg_dump.c            | 109 ++++++++++++++++++++++++++-
 src/bin/pg_dump/pg_dump.h            |   1 +
 src/bin/pg_dump/pg_dump_sort.c       |   7 ++
 4 files changed, 117 insertions(+), 2 deletions(-)

diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 271a2c3e481..c9f9e574f16 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -3309,6 +3309,7 @@ _tocEntryRequired(TocEntry *te, teSection curSection, ArchiveHandle *AH)
 		 */
 		if (strcmp(te->desc, "SEQUENCE SET") == 0 ||
 			strcmp(te->desc, "BLOB") == 0 ||
+			strcmp(te->desc, "EXTENSION DATA") == 0 ||
 			strcmp(te->desc, "BLOB METADATA") == 0 ||
 			(strcmp(te->desc, "ACL") == 0 &&
 			 strncmp(te->tag, "LARGE OBJECT", 12) == 0) ||
@@ -3350,6 +3351,7 @@ _tocEntryRequired(TocEntry *te, teSection curSection, ArchiveHandle *AH)
 		if (!(ropt->sequence_data && strcmp(te->desc, "SEQUENCE SET") == 0) &&
 			!(ropt->binary_upgrade &&
 			  (strcmp(te->desc, "BLOB") == 0 ||
+			   strcmp(te->desc, "EXTENSION DATA") == 0 ||
 			   strcmp(te->desc, "BLOB METADATA") == 0 ||
 			   (strcmp(te->desc, "ACL") == 0 &&
 				strncmp(te->tag, "LARGE OBJECT", 12) == 0) ||
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index ad09677c336..2ffb8eb863f 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -264,6 +264,7 @@ static void prohibit_crossdb_refs(PGconn *conn, const char *dbname,
 static NamespaceInfo *findNamespace(Oid nsoid);
 static void dumpTableData(Archive *fout, const TableDataInfo *tdinfo);
 static void refreshMatViewData(Archive *fout, const TableDataInfo *tdinfo);
+static void dumpExtensionData(Archive *fout, const TableDataInfo *tdinfo);
 static const char *getRoleName(const char *roleoid_str);
 static void collectRoleNames(Archive *fout);
 static void getAdditionalACLs(Archive *fout);
@@ -352,6 +353,7 @@ static void addConstrChildIdxDeps(DumpableObject *dobj, const IndxInfo *refidx);
 static void getDomainConstraints(Archive *fout, TypeInfo *tyinfo);
 static void getTableData(DumpOptions *dopt, TableInfo *tblinfo, int numTables, char relkind);
 static void makeTableDataInfo(DumpOptions *dopt, TableInfo *tbinfo);
+static void makeExtensionDataInfo(DumpOptions *dopt, TableInfo *tbinfo);
 static void buildMatViewRefreshDependencies(Archive *fout);
 static void getTableDataFKConstraints(void);
 static void determineNotNullFlags(Archive *fout, PGresult *res, int r,
@@ -2961,6 +2963,48 @@ dumpTableData(Archive *fout, const TableDataInfo *tdinfo)
 	destroyPQExpBuffer(clistBuf);
 }
 
+/*
+ * dumpExtensionData -
+ *	  dump extension configuration table data for binary upgrade
+ *
+ * Entry goes into SECTION_PRE_DATA so the data is available before
+ * user tables that may need it for validation.
+ */
+static void
+dumpExtensionData(Archive *fout, const TableDataInfo *tdinfo)
+{
+	TableInfo  *tbinfo = tdinfo->tdtable;
+	PQExpBuffer copyBuf = createPQExpBuffer();
+	PQExpBuffer clistBuf = createPQExpBuffer();
+
+	/* Check that we have per-column details about this table */
+	Assert(tbinfo->interesting);
+
+	/* Build COPY statement */
+	printfPQExpBuffer(copyBuf, "COPY %s ",
+					  fmtQualifiedDumpable(tbinfo));
+	appendPQExpBuffer(copyBuf, "%s FROM stdin;\n",
+					  fmtCopyColumnList(tbinfo, clistBuf));
+
+	if (tdinfo->dobj.dump & DUMP_COMPONENT_DATA)
+	{
+		ArchiveEntry(fout, tdinfo->dobj.catId, tdinfo->dobj.dumpId,
+					 ARCHIVE_OPTS(.tag = tbinfo->dobj.name,
+								  .namespace = tbinfo->dobj.namespace->dobj.name,
+								  .owner = tbinfo->rolname,
+								  .description = "EXTENSION DATA",
+								  .section = SECTION_PRE_DATA,
+								  .copyStmt = copyBuf->data,
+								  .deps = &(tbinfo->dobj.dumpId),
+								  .nDeps = 1,
+								  .dumpFn = dumpTableData_copy,
+								  .dumpArg = tdinfo));
+	}
+
+	destroyPQExpBuffer(copyBuf);
+	destroyPQExpBuffer(clistBuf);
+}
+
 /*
  * refreshMatViewData -
  *	  load or refresh the contents of a single materialized view
@@ -3105,6 +3149,48 @@ makeTableDataInfo(DumpOptions *dopt, TableInfo *tbinfo)
 	tbinfo->interesting = true;
 }
 
+/*
+ * makeExtensionDataInfo --- create TableDataInfo for extension config table
+ *
+ * This is used during binary upgrades to ensure extension configuration
+ * table data is dumped early (before user tables that may depend on it).
+ * For example, PostGIS's spatial_ref_sys must be populated before any
+ * table with geography(Point, 4283) can be created due to SRID validation.
+ */
+static void
+makeExtensionDataInfo(DumpOptions *dopt, TableInfo *tbinfo)
+{
+	TableDataInfo *tdinfo;
+
+	/* Already have a data object? */
+	if (tbinfo->dataObj != NULL)
+		return;
+
+	/*
+	 * Caller ensures that this is only called for RELKIND_RELATION.
+	 */
+
+	/* OK, create the data object */
+	tdinfo = (TableDataInfo *) pg_malloc(sizeof(TableDataInfo));
+
+	tdinfo->dobj.objType = DO_EXTENSION_DATA;
+
+	tdinfo->dobj.catId.tableoid = 0;
+	tdinfo->dobj.catId.oid = tbinfo->dobj.catId.oid;
+	AssignDumpId(&tdinfo->dobj);
+	tdinfo->dobj.name = tbinfo->dobj.name;
+	tdinfo->dobj.namespace = tbinfo->dobj.namespace;
+	tdinfo->tdtable = tbinfo;
+	tdinfo->filtercond = NULL;
+	addObjectDependency(&tdinfo->dobj, tbinfo->dobj.dumpId);
+
+	/* Mark that this object contains data */
+	tdinfo->dobj.components |= DUMP_COMPONENT_DATA;
+
+	tbinfo->dataObj = tdinfo;
+	tbinfo->interesting = true;
+}
+
 /*
  * The refresh for a materialized view must be dependent on the refresh for
  * any materialized view that this one is dependent on.
@@ -11828,6 +11914,9 @@ dumpDumpableObject(Archive *fout, DumpableObject *dobj)
 		case DO_EXTENSION:
 			dumpExtension(fout, (const ExtensionInfo *) dobj);
 			break;
+		case DO_EXTENSION_DATA:
+			dumpExtensionData(fout, (const TableDataInfo *) dobj);
+			break;
 		case DO_TYPE:
 			dumpType(fout, (const TypeInfo *) dobj);
 			break;
@@ -20380,10 +20469,25 @@ processExtensionTables(Archive *fout, ExtensionInfo extinfo[],
 
 				if (dumpobj)
 				{
-					makeTableDataInfo(dopt, configtbl);
+					/*
+					 * For binary upgrades, dump extension config table data
+					 * before user tables are created so it's available for
+					 * validation (e.g. PostGIS SRIDs).
+					 */
+					if (dopt->binary_upgrade &&
+						configtbl->relkind == RELKIND_RELATION)
+						makeExtensionDataInfo(dopt, configtbl);
+					else
+						makeTableDataInfo(dopt, configtbl);
 					if (configtbl->dataObj != NULL)
 					{
-						if (strlen(extconditionarray[j]) > 0)
+						/*
+						 * For binary upgrade (DO_EXTENSION_DATA), don't apply
+						 * the filter condition - we need ALL data since the
+						 * extension won't populate built-in data in binary
+						 * upgrade mode.
+						 */
+						if (strlen(extconditionarray[j]) > 0 && !dopt->binary_upgrade)
 							configtbl->dataObj->filtercond = pg_strdup(extconditionarray[j]);
 					}
 				}
@@ -20661,6 +20765,7 @@ addBoundaryDependencies(DumpableObject **dobjs, int numObjs,
 		{
 			case DO_NAMESPACE:
 			case DO_EXTENSION:
+			case DO_EXTENSION_DATA:
 			case DO_TYPE:
 			case DO_SHELL_TYPE:
 			case DO_FUNC:
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 2b9c01b2c0a..185396a01a4 100644
--- a/src/bin/pg_dump/pg_dump.h
+++ b/src/bin/pg_dump/pg_dump.h
@@ -40,6 +40,7 @@ typedef enum
 	/* When modifying this enum, update priority tables in pg_dump_sort.c! */
 	DO_NAMESPACE,
 	DO_EXTENSION,
+	DO_EXTENSION_DATA,			/* extension config table data for binary upgrade */
 	DO_TYPE,
 	DO_SHELL_TYPE,
 	DO_FUNC,
diff --git a/src/bin/pg_dump/pg_dump_sort.c b/src/bin/pg_dump/pg_dump_sort.c
index 03e5c1c1116..3cced9c27be 100644
--- a/src/bin/pg_dump/pg_dump_sort.c
+++ b/src/bin/pg_dump/pg_dump_sort.c
@@ -58,6 +58,7 @@ enum dbObjectTypePriorities
 	PRIO_COLLATION,
 	PRIO_TRANSFORM,
 	PRIO_EXTENSION,
+	PRIO_EXTENSION_DATA,		/* ext config data: used for binary upgrade */
 	PRIO_TYPE,					/* used for DO_TYPE and DO_SHELL_TYPE */
 	PRIO_CAST,
 	PRIO_FUNC,
@@ -106,6 +107,7 @@ static const int dbObjectTypePriority[] =
 {
 	[DO_NAMESPACE] = PRIO_NAMESPACE,
 	[DO_EXTENSION] = PRIO_EXTENSION,
+	[DO_EXTENSION_DATA] = PRIO_EXTENSION_DATA,
 	[DO_TYPE] = PRIO_TYPE,
 	[DO_SHELL_TYPE] = PRIO_TYPE,
 	[DO_FUNC] = PRIO_FUNC,
@@ -1525,6 +1527,11 @@ describeDumpableObject(DumpableObject *obj, char *buf, int bufsize)
 					 "EXTENSION %s  (ID %d OID %u)",
 					 obj->name, obj->dumpId, obj->catId.oid);
 			return;
+		case DO_EXTENSION_DATA:
+			snprintf(buf, bufsize,
+					 "EXTENSION DATA %s  (ID %d OID %u)",
+					 obj->name, obj->dumpId, obj->catId.oid);
+			return;
 		case DO_TYPE:
 			snprintf(buf, bufsize,
 					 "TYPE %s  (ID %d OID %u)",
-- 
2.51.0

Reply via email to